Error with pip install #43

Open
bfeif opened this issue Oct 24, 2023 · 11 comments

Comments

@bfeif
bfeif commented Oct 24, 2023

See full traceback:

$ pip install hebpipe
Collecting hebpipe
  Using cached hebpipe-3.0.0.6.tar.gz (8.6 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting requests (from hebpipe)
  Using cached requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)
Requirement already satisfied: numpy in ./.venv/lib/python3.11/site-packages (from hebpipe) (1.26.1)
Collecting gensim==3.8.3 (from hebpipe)
  Using cached gensim-3.8.3.tar.gz (23.4 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [60 lines of output]
      running dist_info
      creating /private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-modern-metadata-87bepodf/gensim.egg-info
      writing /private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-modern-metadata-87bepodf/gensim.egg-info/PKG-INFO
      writing dependency_links to /private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-modern-metadata-87bepodf/gensim.egg-info/dependency_links.txt
      writing requirements to /private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-modern-metadata-87bepodf/gensim.egg-info/requires.txt
      writing top-level names to /private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-modern-metadata-87bepodf/gensim.egg-info/top_level.txt
      writing manifest file '/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-modern-metadata-87bepodf/gensim.egg-info/SOURCES.txt'
      Traceback (most recent call last):
        File "/Users/benfeifke/code/spark-testing/.venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/Users/benfeifke/code/spark-testing/.venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/benfeifke/code/spark-testing/.venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 149, in prepare_metadata_for_build_wheel
          return hook(metadata_directory, config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 396, in prepare_metadata_for_build_wheel
          self.run_setup()
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 507, in run_setup
          super(_BuildMetaLegacyBackend, self).run_setup(setup_script=setup_script)
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 341, in run_setup
          exec(code, locals())
        File "<string>", line 367, in <module>
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/__init__.py", line 103, in setup
          return distutils.core.setup(**attrs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 185, in setup
          return run_commands(dist)
                 ^^^^^^^^^^^^^^^^^^
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
          dist.run_commands()
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
          self.run_command(cmd)
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/dist.py", line 989, in run_command
          super().run_command(command)
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/command/dist_info.py", line 107, in run
          self.egg_info.run()
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/command/egg_info.py", line 318, in run
          self.find_sources()
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/command/egg_info.py", line 326, in find_sources
          mm.run()
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/command/egg_info.py", line 548, in run
          self.add_defaults()
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/command/egg_info.py", line 586, in add_defaults
          sdist.add_defaults(self)
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/command/sdist.py", line 113, in add_defaults
          super().add_defaults()
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/_distutils/command/sdist.py", line 251, in add_defaults
          self._add_defaults_ext()
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/_distutils/command/sdist.py", line 335, in _add_defaults_ext
          build_ext = self.get_finalized_command('build_ext')
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 305, in get_finalized_command
          cmd_obj.ensure_finalized()
        File "/private/var/folders/mp/m_zk6npd02n2mk4p8vt135480000gn/T/pip-build-env-gp_ibbw8/overlay/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 111, in ensure_finalized
          self.finalize_options()
        File "<string>", line 111, in finalize_options
      AttributeError: 'dict' object has no attribute '__NUMPY_SETUP__'
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

For reference, I'm using pip 23.3.1 and Python 3.11.5.

@amir-zeldes
Owner

Thanks for reporting. This looks like an issue with gensim/numpy compatibility. I'm finding it documented here:

piskvorky/gensim#3225

You could try downgrading numpy, or upgrading gensim beyond the recommended 3.8.3. Does either of those work?
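
One plausible reading of the traceback above: setuptools runs gensim's setup.py through exec(), and inside exec'd code `__builtins__` can be a plain dict rather than the builtins module, so an attribute assignment on it fails. The sketch below reproduces that failure mode in isolation and shows a setter that tolerates both forms. `reproduce_failure` and `set_numpy_setup_flag` are hypothetical names for illustration, not gensim's actual code.

```python
import builtins


def reproduce_failure():
    """Run the attribute assignment the way a build backend might: via exec()."""
    namespace = {}  # fresh globals: exec() inserts __builtins__ as a dict here
    try:
        exec("__builtins__.__NUMPY_SETUP__ = False", namespace)
    except AttributeError as err:
        return err
    return None


def set_numpy_setup_flag(builtins_obj, value=False):
    """Hypothetical helper: set the flag whether we were given a dict or a module."""
    if isinstance(builtins_obj, dict):
        builtins_obj["__NUMPY_SETUP__"] = value
    else:
        setattr(builtins_obj, "__NUMPY_SETUP__", value)


if __name__ == "__main__":
    print(repr(reproduce_failure()))
    set_numpy_setup_flag({})        # dict form, as seen under exec()
    set_numpy_setup_flag(builtins)  # module form, as seen in a normal script
    del builtins.__NUMPY_SETUP__    # clean up the demo flag
```

This only illustrates why the error message mentions a 'dict' object; it does not patch the gensim 3.8.3 source distribution, so changing the gensim or numpy version is still the practical route.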

@bfeif
Author

bfeif commented Nov 7, 2023

These packages/versions worked for me:

conllu==4.5.3
depedit==3.3.0.0
diaparser==1.1.0
flair==0.13.0
gensim==4.3.2
joblib==1.3.2
numpy==1.26.1
pandas==2.1.2
protobuf==4.25.0
requests==2.31.0
rftokenizer==2.0.1
scipy==1.11.3
stanza==1.6.1
torch==2.1.0
transformers==4.35.0
xgboost==0.81
xmltodict==0.13.0

I tried to open a PR but didn't have access.

@amir-zeldes
Owner

Interesting - what Python version does that combination work on?

@amir-zeldes
Owner

I can't get those version numbers to run on Python 3.11 for example.

@bfeif
Author

bfeif commented Nov 14, 2023

I'm using python 3.11.5... 🤔

@amir-zeldes
Owner

Hm, I tested with Python 3.11.2. Those library versions shouldn't work with the pretrained models online, since the models were trained before torch 2.X/transformers 4, so I am getting (and you should be getting):

ModuleNotFoundError: No module named 'transformers.modeling_bert'

Maybe you are only using it for segmentation? Were you able to get parsing to run with those library versions? Or did you retrain models?
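
The ModuleNotFoundError above is what you would expect when saved models reference the pre-4.0 transformers layout: in transformers 4.x the BERT code lives under `transformers.models.bert.modeling_bert` instead of `transformers.modeling_bert`. A small sketch for checking which layout an environment exposes; `first_importable` is a helper name made up for this example, and it does not assume transformers is installed.

```python
import importlib.util


def first_importable(candidates):
    """Return the first dotted module name that can be located, else None."""
    for name in candidates:
        try:
            if importlib.util.find_spec(name) is not None:
                return name
        except ModuleNotFoundError:
            # Raised when a parent package of `name` is not installed.
            continue
    return None


if __name__ == "__main__":
    layout = first_importable(
        (
            "transformers.models.bert.modeling_bert",  # transformers 4.x layout
            "transformers.modeling_bert",              # transformers 3.x layout
        )
    )
    print(layout or "transformers BERT module not found")
```

If this prints the 4.x path, models saved against the 3.x module path will likely hit the error above when loaded.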

@ztkuperman

I've been trying to get hebpipe working as well. I'm facing the same issue.
I also modified a line in xgboost, changing collections.Mapping to collections.abc.Mapping, since the old alias was removed in Python 3.10.
These are the errors I currently get from the different hebpipe options:
-w, -t: ModuleNotFoundError: No module named 'transformers.modeling_bert'
-p, -d: RuntimeError: Error(s) in loading state_dict for MTLModel: Unexpected key(s) in state_dict: "model.embeddings.position_ids".
-l, -e, -c: DepEdit WARN: head not set for token 1.0 in file (same warning for all tokens)
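
For reference, the xgboost change mentioned above amounts to importing the abstract base class from `collections.abc` (note the capital M in `Mapping`). A minimal sketch of the pattern; `is_mapping` is a hypothetical helper, not xgboost's code.

```python
import collections
from collections.abc import Mapping  # available since Python 3.3


def is_mapping(obj):
    """True for dict-like objects, using the ABC from collections.abc."""
    return isinstance(obj, Mapping)


if __name__ == "__main__":
    # On Python 3.10+ the old alias collections.Mapping no longer exists.
    print("collections.Mapping present:", hasattr(collections, "Mapping"))
    print(is_mapping({"a": 1}), is_mapping([1, 2]))
```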

@bfeif
Author

bfeif commented Nov 18, 2023

@amir-zeldes I haven't tried anything besides pip install -r requirements.txt, that's probably why. Do you have a list of commands I should run in order to check that everything installs correctly?

@amir-zeldes
Owner

@ztkuperman yes, those are all indications of the version incompatibilities. Let me repeat what I just answered on a similar thread on the repo for just the segmenter:

Yeah, I see the issue. This puts me in a bit of a dilemma, since we are not a software development company and don't really have the resources to keep up with each change in library versions... That said, I would be sad to see this tool become obsolete, so maybe we can occasionally retrain models.

OK, so two answers for now:

  1. If you want to test this right now, and are willing to run some older libraries in a venv or something, here is a known working configuration for the models you have from the download:

Python 3.8: (@bfeif this should also work for you if you can install a clean environment)

scikit-learn==0.23.2
joblib==1.3.2
numpy==1.21.0
pandas==1.5.3
xgboost==0.81
hyperopt==0.2.4
flair==0.6.1
transformers==3.5.1
torch==1.6.0
gensim==3.8.3
diaparser==1.1.2
  2. I will try to put up a model for a newer Python, say 3.11, with torch > 2.0 and the latest transformers + xgboost (I can't promise to do this for every major version, but it's been a while, so at least for now I'll make new models).
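
To check whether an environment actually matches the Python 3.8 configuration listed above, something like the following sketch could be used. The pins are copied from that list; `check_pins` is a hypothetical helper, and the `get_version` parameter exists only so the function can be exercised without the packages installed.

```python
from importlib import metadata

# Pins copied from the Python 3.8 configuration listed above.
PINS = {
    "scikit-learn": "0.23.2",
    "joblib": "1.3.2",
    "numpy": "1.21.0",
    "pandas": "1.5.3",
    "xgboost": "0.81",
    "hyperopt": "0.2.4",
    "flair": "0.6.1",
    "transformers": "3.5.1",
    "torch": "1.6.0",
    "gensim": "3.8.3",
    "diaparser": "1.1.2",
}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (wanted, found)} for every package that does not match."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in check_pins(PINS).items():
        print(f"{name}: wanted {wanted}, found {found}")
```

An empty result only means the versions match; it does not confirm that the downloaded models load correctly.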

The segmenter model is training right now and looks like it will work fine. For the full hebpipe pipeline I think I have most things running on torch 2, but I don't know about training the MTL lemmatizer.

@nitinvwaran do you have some documentation on training the MTL transformer model including lemmatization?

@amir-zeldes
Owner

For just the segmenter check out the preliminary fix here for torch 2.1, xgboost 2.0.2 and flair 0.13.0. I still need to refactor some things but could then update the rest of the pipeline. This seems to work fine on Python 3.11 at least.

@maayanorner

maayanorner commented Feb 1, 2024

The only way I could install the recommended dependencies was with pip's legacy resolver, because there are conflicts within the dependencies:
pip install --use-deprecated=legacy-resolver -r py_38_requirements.txt

However, I still jump from error to error; seems like it would take a long time to resolve.
Can someone who has a working env run pip freeze and provide the exact Python/pip versions? It would be super useful, as the listings above are partial, and neither 3.11 (I used the exact same pip and Python versions) nor 3.8 works for me.

Thanks!
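
For anyone with a working environment who wants to share it, a sketch like the one below gathers the Python version, the pip version, and a freeze-style package list in one go. It uses only the standard library; `environment_report` is a name chosen for this example, and running `pip freeze` directly gives similar information.

```python
import platform
from importlib import metadata


def environment_report():
    """Return header lines (Python, pip) followed by name==version pins."""
    lines = [f"# Python {platform.python_version()}"]
    try:
        lines.append(f"# pip {metadata.version('pip')}")
    except metadata.PackageNotFoundError:
        lines.append("# pip not found")
    pins = {
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip entries with broken metadata
    }
    return lines + sorted(pins, key=str.lower)


if __name__ == "__main__":
    print("\n".join(environment_report()))
```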

Edit:

@amir-zeldes I haven't tried anything besides pip install -r requirements.txt, that's probably why. Do you have a list of commands I should run in order to check that everything installs correctly?

To confirm, your configuration does not work for me either. Yes, it installs, but the code does not run due to compatibility issues (e.g., trying to use keys that don't exist).


4 participants