Refactor codebase into a python package #32

leifdenby · 2024-05-16T19:34:33Z

This PR lays the groundwork for being able to install neural-lam as a package, thereby making it possible to run from anywhere once the package has been installed. This means that it would be possible (in theory) to train neural-lam on a .npy-file based dataset (once #31 has been merged so that the training configuration is moved from neural_lam/constants.py to a yaml-config file) with the neural-lam package installed into a user's site-packages (i.e. in their virtualenv).

I appreciate that currently most of us will be checking out the codebase to make modifications and then train a model with neural-lam, so these changes might seem superfluous. But making them later will be a lot harder than making them now.

The primary changes are:

- Move all *.py files that are currently outside of neural_lam/ into that folder, keeping the file contents unchanged
- Replace all examples of running the neural-lam "scripts" in the README, e.g. python create_mesh.py becomes python -m neural_lam.create_mesh
- Change all absolute imports to package-relative imports, i.e. from . import utils rather than from neural_lam import utils
- Add tests that all the CLI entrypoints to neural_lam can be imported, and add a ci/cd action to run these tests
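
The import test in the last point could look roughly like the sketch below. This is an illustration, not the test code in this PR: the module list is assumed from the script names mentioned in this thread, and import_error and failing_imports are hypothetical helper names.

```python
import importlib

# Assumed module names, based on the scripts named in this thread;
# adjust to the actual entrypoints in the package.
CLI_MODULES = [
    "neural_lam.create_mesh",
    "neural_lam.create_grid_features",
    "neural_lam.create_parameter_weights",
    "neural_lam.plot_graph",
    "neural_lam.train_model",
]


def import_error(module_name):
    """Return None if the module imports cleanly, else a short error string."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # report any import-time failure, not only ImportError
        return f"{type(exc).__name__}: {exc}"
    return None


def failing_imports(module_names):
    """Map each module that fails to import to its error message."""
    errors = {name: import_error(name) for name in module_names}
    return {name: err for name, err in errors.items() if err is not None}
```

In a pytest test this would reduce to asserting that failing_imports(CLI_MODULES) is an empty dict, so a failure lists every broken entrypoint at once.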

I still need to resolve the issue around depending on cpu or gpu versions of pytorch. I know @sadamov found a work-around where the cpu or gpu package was automatically picked based on whether CUDA is detected. I haven't had time to look into this yet, but once I have done that I will mark this PR as complete from my side and ask for your thoughts on it :)

This PR is definitely a work-in-progress and is meant to serve as a discussion point.

(also this PR includes changes that PR #29 adds, so this PR definitely shouldn't be merged or even reviewed probably before #29 is in)


leifdenby · 2024-07-17T08:38:15Z

This looks great! Should the changelog also be updated for your changes?

Thanks! Yes, I thought I would complete the changelog once the reviews were done, but maybe I should just do it now :) I will do that

tests/test_mllam_dataset.py

sadamov

Thanks @leifdenby, totally agree that refactoring the code into a Python package is the way to go. I looked at the file changes and tested a full reinstall+training based on the README -> successful ✔️. This PR can be merged as is from my side, I do have some suggestions for this or a future PR:

The package cannot currently be installed with pip install -e . because of the folder structure. I suggest making the following changes:

Add these lines to pyproject.toml:

[project]
name = "neural-lam"
version = "0.1.0"

Move the neural-lam folder to src/neural-lam
Update the default path to the config yaml file in all relevant files to src/neural_lam/data_config.yaml
- plot_graph.py
- create_grid_features.py
- create_mesh.py
- create_parameter_weights.py
- train_model.py
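
As an alternative to hard-coding the default config path in each script, the path could be resolved relative to the module itself, so it does not depend on the working directory or on whether a src/ prefix is used. This is only a sketch of that idea, not something this PR implements; default_config_path is a hypothetical helper name.

```python
from pathlib import Path


def default_config_path(module_file, filename="data_config.yaml"):
    """Return the path of `filename` located next to the given module file."""
    # resolve() makes the result absolute, so it is independent of the
    # directory the script is launched from.
    return Path(module_file).resolve().parent / filename
```

Inside a script such as train_model.py this would be called as default_config_path(__file__) and used as the default for the config argument.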

Now installing the neural-lam package works:

joeloskarsson · 2024-07-21T13:50:20Z

Great stuff! Happy to give this a full review later this week, but likely I won't have much to add.

I agree that being able to install with pip install -e seems important (necessary for us working with developing the package?). Moving most code to src/neural-lam/... seems like annoyingly many sub-directories. Is there some other way to achieve this?
I have also implicitly viewed this as part of the 0.2.0 release.

joeloskarsson · 2024-07-23T06:20:11Z

I think with this we should also adjust the installation instructions in the Readme to install as a package.

joeloskarsson

I think all changes are good 😄 I would propose to add a couple more things with this (summary of the same points I commented above, but here in a proper review):

1. Update the install instructions in the README to include installing the actual package. These should change even more once "Move dependencies to pyproject toml and create ci/cd install and import test" (#37) is done, but if this PR makes it into a package it makes sense to add that there now.
2. Enable installing with pip install -e . (an editable install). I figured out that a better way to achieve this (than putting everything in a src dir) is to add the following to pyproject.toml:

[tool.setuptools]
py-modules = ["neural_lam"]

3. Add __init__.py files to neural_lam and neural_lam/models that expose submodules and classes directly in the imported neural_lam module. That way you can do something like

import neural_lam as nl

a = nl.utils.load_graph(...)

and it means that you can e.g. import each model directly from neural_lam.models.

If points 2 and 3 above sound good I put those changes already now on the branch package_inits in this repo, so you can just merge them in (diff with this PR: leifdenby/neural-lam@maint/refactor-as-package...mllam:neural-lam:package_inits)
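
To make point 3 concrete, here is a small self-contained sketch of the pattern. It builds a throwaway package in a temporary directory rather than touching neural_lam; the names demo_pkg and DemoModel are placeholders, and load_graph is a stand-in for the real function.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root, name="demo_pkg"):
    """Write a tiny package whose __init__.py files re-export their contents."""
    pkg = Path(root) / name
    (pkg / "models").mkdir(parents=True)
    (pkg / "utils.py").write_text("def load_graph(name):\n    return f'graph:{name}'\n")
    (pkg / "models" / "demo_model.py").write_text("class DemoModel:\n    pass\n")
    # Expose the class directly from the models subpackage.
    (pkg / "models" / "__init__.py").write_text("from .demo_model import DemoModel\n")
    # Expose the submodules directly from the top-level package.
    (pkg / "__init__.py").write_text("from . import models, utils\n")
    return name


with tempfile.TemporaryDirectory() as tmp:
    pkg_name = build_demo_package(tmp)
    sys.path.insert(0, tmp)
    try:
        nl = importlib.import_module(pkg_name)
        # A single top-level import is enough to reach submodules and classes.
        result = nl.utils.load_graph("example")
        model_cls = nl.models.DemoModel
    finally:
        sys.path.remove(tmp)
```

The cost of this convenience is that importing the top-level package also imports everything listed in its __init__.py.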

joeloskarsson · 2024-07-24T13:30:10Z

I couldn't get the linting all green on package_inits branch. Looks like flake8 does not use the options we have specified in pyproject.toml (https://flake8.pycqa.org/en/latest/user/configuration.html#configuration-locations). Maybe we can use https://github.com/csachs/pyproject-flake8 to fix that?

leifdenby · 2024-08-14T07:34:38Z

change all absolute imports to package-relative imports, i.e. from . import utils rather than from neural_lam import utils

I learnt yesterday from @khintz that I have for years been wrong about relative vs absolute imports. For as long as I can remember I have thought that relative imports were recommended over absolute imports, but I realise now that this is incorrect. I do still prefer them, but I was wondering what people think. I would be ok with sticking with absolute imports if that is the consensus. Maybe a thumbs-up for relative imports and a thumbs-down for absolute imports on this comment could serve as a vote? @joeloskarsson, @sadamov, @SimonKamuk, @khintz

leifdenby · 2024-08-14T07:35:42Z

Moving most code to src/neural-lam/... seems like annoyingly many sub-directories. Is there some other way to achieve this?

We don't have to add the src/ path prefix; neural-lam/ would be enough. How about we stick with that?

khintz · 2024-08-14T07:40:00Z

I don't have very strong feelings about relative vs absolute imports, but if I have to choose one, I would go for absolute. That is easier to read (IMO) and there is general consensus towards that.
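
For readers comparing the two styles, the sketch below shows that inside a package both forms resolve to the same module object, so the choice is mainly about readability. It uses a throwaway package in a temporary directory; import_style_demo is a hypothetical name.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PKG = "import_style_demo"  # hypothetical package name

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / PKG
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "utils.py").write_text("VALUE = 42\n")
    # Package-relative import: resolved against the containing package.
    (pkg / "uses_relative.py").write_text("from . import utils\n")
    # Absolute import: spells out the full package name.
    (pkg / "uses_absolute.py").write_text(f"from {PKG} import utils\n")

    sys.path.insert(0, tmp)
    try:
        rel = importlib.import_module(f"{PKG}.uses_relative")
        ab = importlib.import_module(f"{PKG}.uses_absolute")
    finally:
        sys.path.remove(tmp)

# Both styles end up bound to the very same module object.
same_module = rel.utils is ab.utils
```

One practical difference: a file that uses relative imports cannot be run directly as python some_file.py, since it then has no parent package, which is why the README examples in this PR use the python -m form.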

leifdenby · 2024-08-14T07:40:37Z

3. Add __init__.py files to neural_lam and neural_lam/models that expose submodules and classes directly in the imported neural_lam module.

I see the merit of this @joeloskarsson, but it does mean that the time it takes to import neural-lam is increased. I think this is why people generally don't do this by default, but I am ok with doing this. I don't see it as necessary though to make the codebase into a package.


leifdenby · 2024-08-14T07:54:39Z

I couldn't get the linting all green on package_inits branch. Looks like flake8 does not use the options we have specified in pyproject.toml (https://flake8.pycqa.org/en/latest/user/configuration.html#configuration-locations). Maybe we can use https://github.com/csachs/pyproject-flake8 to fix that?

Thanks @joeloskarsson, you're right. I tried adding pyproject-flake8 to the pre-commit config, but that didn't work. On the other hand it looks like https://github.com/john-hen/Flake8-pyproject did 😄

leifdenby · 2024-08-14T08:02:03Z

I think all changes are good 😄 I would propose to add a couple more things with this (summary of the same points I commented above, but here in a proper review):

Thanks @joeloskarsson! It was really helpful to have this list. I have merged in your changes and found a way around the flake8 issues (as you highlighted too). I have simply changed the install instructions to reflect that once torch is installed then the rest can be installed with pip install -e ., that was what you were intending too, right? I think this is ready for a last review from you if you have time.

joeloskarsson

I don't really have an opinion on relative vs absolute imports in the package. Any way is fine by me.

(about importing sub-modules in __init__.py)

I see the merit of this @joeloskarsson, but it does mean that the time it takes to import neural-lam is increased. I think this is why people generally don't do this by default, but I am ok with doing this. I don't see it as necessary though to make the codebase into a package.

Yes, that is a good point. I suppose there is a tradeoff here in terms of speed vs convenience. I think we can go with the convenience route, as the package is still fairly small and there is not a lot to import. I am quite used to this as both torch and pyg import the submodules in __init__.py, so there could also be a point in being consistent with them.
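
If someone wants to put a number on this tradeoff, a rough way is to time the import in a fresh interpreter. The sketch below is an assumption-laden illustration (import_seconds and import_overhead are hypothetical helper names); the built-in python -X importtime flag gives a more detailed per-module breakdown.

```python
import subprocess
import sys
import time


def import_seconds(module_name):
    """Time `import module_name` in a fresh interpreter, startup included."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", f"import {module_name}"], check=True)
    return time.perf_counter() - start


def import_overhead(module_name):
    """Approximate import cost by subtracting bare interpreter startup time."""
    # sys is built in, so importing it measures roughly startup alone.
    baseline = import_seconds("sys")
    return import_seconds(module_name) - baseline
```

Running import_overhead("neural_lam") before and after adding the __init__.py imports would show how much the convenience costs. Timings are noisy, so repeat the measurement a few times.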

I have simply changed the install instructions to reflect that once torch is installed then the rest can be installed with pip install -e ., that was what you were intending too, right?

Yes, that seems good. No need to over-complicate the installation setup and instructions here when we have more changes around that coming in #37

Great that you could get the flake8 fix to work!

All looks good to me.

leifdenby · 2024-08-19T13:27:49Z

All looks good to me.

Great! Thank you. I will update the CHANGELOG and then merge


leifdenby added 30 commits May 13, 2024 18:48

wip on simplifying pre-commit setup

0afdfee

setup pylint version

28118a6

remove external deps install in cicd linting

3da3108

create project

ea64309

replace absolute imports with relative

0c68537

simplify black config

4b77be6

headers for import sections no longer needed

f2bae03

minor fixes

1d12b0d

run on all branch pushes

ad0accc

rename action to "lint"

3e69502

add ci/cd test for imports

681c7b1

py version must be quoted

5ad0230

fix torch install url

35987e5

use pdm in ci/cd

148d7f6

disable cache for now

b912d1a

check in lock file

b656445

add pytest

248196f

cache in cicd

1af1576

add torch-geometric to deps

7ed7c97

fix import and more tests

2869952

Merge branch 'maint/simplify-precommit-setup' into maint/refactor-as-package

9aaaecd

pdm to sync to requirements.txt

358c8d6

update requirements.txt

6c3bdce

more import tests

fbd6a2b

Merge branch 'main' into maint/refactor-as-package

e89facc

Merge branch 'main' of https://github.com/mllam/neural-lam into maint/refactor-as-package

0b5687a

turn meps testdata download into pytest fixture

095fdbc

adapt README for package

49e9bfe

remove pdm cicd test (will be in separate PR)

12cc02b

remove pdm in gitignore

b47f50b

SimonKamuk reviewed Jul 17, 2024

tests/test_mllam_dataset.py (review thread resolved)

sadamov approved these changes Jul 18, 2024

Add init files to expose classes in editable package

2a6796c

joeloskarsson requested changes Jul 24, 2024

Linting

8f4e0e0

joeloskarsson added this to the v0.2.0 milestone Aug 5, 2024

leifdenby mentioned this pull request Aug 13, 2024

Add "datastores" to represent input data from zarr, npy, etc #66

Open

20 tasks

joeloskarsson mentioned this pull request Aug 14, 2024

Move dependencies to pyproject toml and create ci/cd install and import test #37

Merged

leifdenby and others added 3 commits August 14, 2024 09:43

Merge pull request #1 from mllam/package_inits

e7cf2c0

Package inits

add pyproject-flake8 to precommit config

0b72e9d

use Flake8-pyproject instead

190d1de

update README

791af0a

joeloskarsson approved these changes Aug 14, 2024

sadamov mentioned this pull request Aug 19, 2024

Numpy 2 compatability #67

Closed

3 tasks

leifdenby added 3 commits August 19, 2024 15:35

update changelog

47c1f44

Merge branch 'maint/refactor-as-package' of https://github.com/leifdenby/neural-lam into maint/refactor-as-package

9f25375

Merge branch 'main' of https://github.com/mllam/neural-lam into maint/refactor-as-package

b1ecb2c

leifdenby merged commit a54c45f into mllam:main Aug 19, 2024
8 checks passed

leifdenby mentioned this pull request Aug 20, 2024

add package build and upload to pypi for releases into ci/cd setup #71

Open

20 tasks
