Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move dependencies to pyproject toml and create ci/cd install and import test #37

Merged
merged 47 commits into from
Aug 20, 2024

Conversation

leifdenby
Copy link
Member

The aim of this PR is to move dependencies from requirements.txt to pyproject.toml (as has been proposed for v0.2.0 on the roadmap) and discussed on #27 (comment).

Requirements:

  • dependencies should still be installable with pip and pyproject.toml can be updated by hand
  • dependencies can be handle with pdm
  • ci/cd test that runs the pip install and runs a test that imports neural-lam to ensure that the required packages have correctly been installed.

@leifdenby
Copy link
Member Author

I will copy here my motivation for allow using of pdm from #27 (comment):

For someone not using pdm the process of setting up a new development environment would be:

  1. Clone repo git clone http://github.com/<username>/neural-lam and change directory cd neural-lam
  2. Optionally create your virtualenv with your favourite tool, maybe mkvirtualenv ... and then activate ...
  3. Install neural-lam editable with pip so that changes you make are reflected in the site-packages version (actually this is simply a link) pip install -e .
  4. Edit and run neural-lam: python -m neural_lam.train_model --config-file config.yaml
  5. Manually install a dependency with pip and edit pyproject.toml by hand, i.e. python -m pip install ... and edit the pyproject.toml file
  6. Stage, commit and push!

I like using a package manager like pdm (other options are pipenv, poetry, uv) because it makes the development process easier by 1) automatically updating package versions in pyproject.toml to ensure versions work together when I add a new package, 2) allowing me to "dry run" any dependency modification without clobbering my current env (this has hurt me quite a few times with pip) and 3) handling creating and activating virtuals envs

So with pdm the development process would be:

  1. Clone repo git clone http://github.com/<username>/neural-lam and change directory cd neural-lam
  2. Create virtual env pdm venv create and either activating it pdm venv activate ... or making it the default when using pdm run python ... with pdm use ...
  3. Install package pdm install
  4. Add dependency pdm add ... (this is where you can do pdm add --dry-run ... to see what would change before you install a package), or remove one with pdm remove .... You can also add dev dependencies separately (which won't be included in wheels) with pdm add --dev ... or dependency groups with pdm add --group visualisation matplotlib for example (if we didn't want visualisation tools installed by default)
  5. Edit and run neural-lam: python -m neural_lam.train_model --config-file config.yaml (if using activated virtualenv) or pdm run python -m neural_lam.train_model --config-file config.yaml
  6. Stage, commit and push!

@leifdenby leifdenby added this to the v0.2.0 milestone May 28, 2024
@leifdenby leifdenby mentioned this pull request May 30, 2024
8 tasks
@leifdenby
Copy link
Member Author

I see. Yes, I think using an optional extras group is preferable then, since it would work smoothly with both pip and pdm.

Cool. Yes I think you're right. I would imagine @sadamov, @SimonKamuk and @khintz are happy with this too. If I don't hear any complaints I will change that this morning and update the PR (README and pyproject.toml)

I will take that is your review so far @joeloskarsson 😀

@joeloskarsson
Copy link
Collaborator

I have looked over all the changes and I think it looks good. Want to do a full install on my end and see that everything works, but will wait with that until you've made that final change. I have not looked in detail on any of the github workflows, as I don't know that stuff so well. But I trust others to go over that.

Does this go in before or after #32 ?

@leifdenby
Copy link
Member Author

leifdenby commented Aug 14, 2024

I have looked over all the changes and I think it looks good. Want to do a full install on my end and see that everything works, but will wait with that until you've made that final change. I have not looked in detail on any of the github workflows, as I don't know that stuff so well. But I trust others to go over that.

Great, thanks!

Does this go in before or after #32 ?

After I would say. The other PR is about moving files around to make neural-lam into a package. Once we've done that I can make sure that package installs and runs correctly for both cpu and gpu support by adding the changes in this PR

Copy link
Collaborator

@sadamov sadamov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great! Thanks, really also for the work on the github runners which must have been quite a hassle to set up 🚀

I have tried installation with pdm (4m30s) and pip (6m54s) (latest torch version) on my HPC system running CUDA 12.0 and can report that for both envs neuralLAM-training was successful. Tested both on cpu and gpu.

After capping the numpy version this can be merged, from my side. I trust that the --dev install will work after the change discussed above.

pyproject.toml Show resolved Hide resolved
Copy link
Collaborator

@joeloskarsson joeloskarsson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change the numpy version spec and the dev group name, then this is all good to go!

Something I realized, that's really good, is that we don't actually need any of the optional PyG dependencies atm. So this getting rid of them is great, and also leads to less complications with torch CPU/CUDA versions. We can just re-introduce them later if some will be needed.

pyproject.toml Outdated Show resolved Hide resolved
@sadamov sadamov mentioned this pull request Aug 19, 2024
3 tasks
@leifdenby
Copy link
Member Author

leifdenby commented Aug 20, 2024

Ok @sadamov and @joeloskarsson, I've moved the dev dependencies to the optional dependency groups and updated the ci/cd setups and README to reflect the change.

If you have time for a last quick look then I think this is ready to go in!

@joeloskarsson
Copy link
Collaborator

  • Now this includes ( a bit hidden) the upgrade to numpy 2.0. Can we make this a bit more transparent? For example by adding this specifically to the changelog, as it now just contains the entry for capping it to <2.0.
  • If we now are moving to allow numpy 2.0, should we actually make this >=2.0.0, so we do not need to make sure to support both numpy 1 and 2 in the future?

@leifdenby
Copy link
Member Author

  • Now this includes ( a bit hidden) the upgrade to numpy 2.0. Can we make this a bit more transparent? For example by adding this specifically to the changelog, as it now just contains the entry for capping it to <2.0.

  • If we now are moving to allow numpy 2.0, should we actually make this >=2.0.0, so we do not need to make sure to support both numpy 1 and 2 in the future?

Ah! Sorry, that was an oversight. I had somehow convinced myself we now did support numpy>=2.0.0, but of course #67 capped the version. I'm glad you noticed this. I will make another commit to cap the version as it was before in requirements.txt

@leifdenby
Copy link
Member Author

Ok, I fixed that issue now @joeloskarsson

@joeloskarsson
Copy link
Collaborator

joeloskarsson commented Aug 20, 2024

Oh no, sorry for the confusion. We want to remove the cap to <2.0 here, just make it clear that we do that. The code works with numpy 2, see #67

Copy link
Collaborator

@sadamov sadamov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I installed the dev envs with both pip and pdm now, it works!
One model training was also succesful with new envs.

I agree to remove the numpy cap and mention it in the changelog of this PR.

Happy merging 🎉

@leifdenby
Copy link
Member Author

Oh no, sorry for the confusion. We want to remove the cap to <2.0 here, just make it clear that we do that. The code works with numpy 2, see #67

Ah I see! Ok, so I will include a note in the CHANGELOG to say that we removed the cap to numpy<2.0.0 because it works. I'll draft that changelog entry now and revert the changes

Copy link
Collaborator

@joeloskarsson joeloskarsson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Now everything looks good to me.

@leifdenby leifdenby merged commit 4969f92 into mllam:main Aug 20, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants