Serialize full pipeline configurations #469

Merged: 24 commits, Aug 8, 2024

@mdekstrand mdekstrand commented Aug 7, 2024

This extends the configuration support so that entire pipeline configurations can be serialized and deserialized. It is also a prerequisite for implementing #385. This version of the changes only generates the serialized configuration; it does not save it to or load it from files, which will be a separate change.

It also renames and refactors some pipeline test files for clarity and improves error handling.

Working:

  • Input nodes (and their types)
  • Component nodes
  • Component wirings
  • Fallback nodes
  • Literal nodes
  • Pipeline configuration hashing
  • Pipeline metadata
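
To make the list above concrete, here is a minimal sketch of what serializing and hashing a pipeline configuration could look like. It is not LensKit's actual schema or API: the class names (`InputConfig`, `ComponentConfig`, `PipelineConfig`), the field names, the `module:Class` reference format, and the helpers `to_json` and `config_hash` are all assumptions made for illustration, using only the standard library.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class InputConfig:
    # Hypothetical: an input node and the names of the types it accepts.
    name: str
    types: list[str] | None = None


@dataclass
class ComponentConfig:
    # Hypothetical: "module:Class" reference, component settings, and
    # wirings that map each parameter name to the node that feeds it.
    code: str
    config: dict[str, object] = field(default_factory=dict)
    inputs: dict[str, str] = field(default_factory=dict)


@dataclass
class PipelineConfig:
    meta: dict[str, str] = field(default_factory=dict)
    inputs: list[InputConfig] = field(default_factory=list)
    components: dict[str, ComponentConfig] = field(default_factory=dict)
    fallbacks: dict[str, list[str]] = field(default_factory=dict)
    literals: dict[str, object] = field(default_factory=dict)


def to_json(cfg: PipelineConfig) -> str:
    # Sorted keys and fixed separators give a canonical text form, so the
    # same configuration always serializes to the same string.
    return json.dumps(asdict(cfg), sort_keys=True, separators=(",", ":"))


def config_hash(cfg: PipelineConfig) -> str:
    # Assumption: the hash is a SHA-256 digest of the canonical JSON text.
    return hashlib.sha256(to_json(cfg).encode("utf-8")).hexdigest()


cfg = PipelineConfig(
    meta={"name": "demo"},
    inputs=[InputConfig("user", ["int", "str"])],
    components={
        "scorer": ComponentConfig("mypkg.scorers:Scorer", {"k": 20}, {"user": "user"}),
        "ranker": ComponentConfig("mypkg.rank:TopN", {}, {"scores": "scorer", "n": "n"}),
    },
    literals={"n": 10},
)
print(to_json(cfg))
print(config_hash(cfg))
```

Hashing a canonical serialization means two pipelines wired identically get the same hash regardless of the order in which their components were added, while any change to a literal or setting produces a different one.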

@mdekstrand mdekstrand added the internals Internal infrastructure (parallelism, math, etc.) label Aug 7, 2024
@mdekstrand mdekstrand added this to the 2024.1 milestone Aug 7, 2024
@mdekstrand mdekstrand self-assigned this Aug 7, 2024
@@ -139,7 +143,7 @@ def create_input(self, name: str, *types: type[T] | None) -> Node[T]:
"""
self._check_available_name(name)

- node = InputNode[Any](name, types=set((t if t is not None else type[None]) for t in types))
+ node = InputNode[Any](name, types=set((t if t is not None else type(None)) for t in types))
Member Author

Bugfix for existing code around None: type[None] builds a generic alias, whereas type(None) is the actual NoneType class.
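
The difference between the two expressions in this hunk can be checked in plain Python. The sketch below is independent of LensKit; `is_allowed` and `old_check_fails` are hypothetical helpers written only to show how each expression behaves when used in an `isinstance` check.

```python
import types

old_entry = type[None]   # expression before the fix: a parameterized generic
new_entry = type(None)   # expression after the fix: the NoneType class


def is_allowed(value: object, allowed: set) -> bool:
    """Hypothetical check: does value match any of the allowed types?"""
    return any(isinstance(value, t) for t in allowed)


def old_check_fails() -> bool:
    """Return True if checking None against type[None] raises TypeError."""
    try:
        is_allowed(None, {old_entry})
    except TypeError:
        # isinstance() rejects parameterized generics such as type[None]
        return True
    return False


print(isinstance(old_entry, types.GenericAlias))  # old form is an alias object
print(is_allowed(None, {new_entry}))              # new form matches None
print(old_check_fails())                          # old form cannot be checked
```

Both expressions are hashable, so either can be stored in a set of accepted types; only `type(None)` can then be used to test a value at run time.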

github-actions bot commented Aug 7, 2024

The GitHub 🤖 has run the tests on your PR.

Covered 96.00% of diff (coverage changed 0.10% from 92.05% to 92.15%).

origin/main...HEAD, staged and unstaged changes
  • lenskit/lenskit/pipeline/__init__.py (100%)
  • lenskit/lenskit/pipeline/components.py (86.7%): Missing lines 206,209
  • lenskit/lenskit/pipeline/config.py (100%)
  • lenskit/lenskit/pipeline/types.py (91.7%): Missing lines 137,156

Summary

  • Total: 100 lines
  • Missing: 4 lines
  • Coverage: 96%
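
The diff-coverage percentages in the bot summaries follow from the total and missing line counts. A small sketch of that arithmetic, using the (total, missing) pairs reported in this thread; the function name `diff_coverage` is made up for illustration, and the observation that the summary line truncates rather than rounds is an inference from the reported numbers, not something the bot documents here.

```python
def diff_coverage(total: int, missing: int) -> float:
    """Percentage of changed lines that were executed by the tests."""
    return 100.0 * (total - missing) / total


# (total, missing) pairs from the four summaries posted on this PR
reports = [(100, 4), (126, 8), (155, 8), (169, 10)]

for total, missing in reports:
    pct = diff_coverage(total, missing)
    # The headline shows two decimals; the summary line appears to truncate.
    print(f"{total} lines, {missing} missing: {pct:.2f}% (summary: {int(pct)}%)")
```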
Source Coverage Report
Name Stmts Miss Cover
lenskit-funksvd/lenskit/funksvd.py 187 8 96%
lenskit-hpf/lenskit/hpf.py 24 0 100%
lenskit-implicit/lenskit/implicit.py 94 9 90%
lenskit/lenskit/algorithms/__init__.py 67 8 88%
lenskit/lenskit/algorithms/als/__init__.py 3 0 100%
lenskit/lenskit/algorithms/als/common.py 128 2 98%
lenskit/lenskit/algorithms/als/explicit.py 121 3 98%
lenskit/lenskit/algorithms/als/implicit.py 112 1 99%
lenskit/lenskit/algorithms/basic.py 161 4 98%
lenskit/lenskit/algorithms/bias.py 150 3 98%
lenskit/lenskit/algorithms/knn/__init__.py 3 0 100%
lenskit/lenskit/algorithms/knn/item.py 303 17 94%
lenskit/lenskit/algorithms/knn/user.py 178 11 94%
lenskit/lenskit/algorithms/mf_common.py 61 0 100%
lenskit/lenskit/algorithms/ranking.py 75 11 85%
lenskit/lenskit/algorithms/svd.py 75 4 95%
lenskit/lenskit/batch/__init__.py 2 0 100%
lenskit/lenskit/batch/_predict.py 30 2 93%
lenskit/lenskit/batch/_recommend.py 46 4 91%
lenskit/lenskit/crossfold.py 136 2 99%
lenskit/lenskit/data/__init__.py 11 0 100%
lenskit/lenskit/data/checks.py 37 0 100%
lenskit/lenskit/data/dataset.py 365 19 95%
lenskit/lenskit/data/fetch.py 38 28 26%
lenskit/lenskit/data/items.py 182 13 93%
lenskit/lenskit/data/matrix.py 115 5 96%
lenskit/lenskit/data/movielens.py 96 18 81%
lenskit/lenskit/data/mtarray.py 57 3 95%
lenskit/lenskit/data/tables.py 25 0 100%
lenskit/lenskit/data/user.py 30 9 70%
lenskit/lenskit/data/vocab.py 84 7 92%
lenskit/lenskit/diagnostics.py 4 0 100%
lenskit/lenskit/math/__init__.py 0 0 100%
lenskit/lenskit/math/solve.py 6 0 100%
lenskit/lenskit/metrics/__init__.py 0 0 100%
lenskit/lenskit/metrics/predict.py 32 0 100%
lenskit/lenskit/metrics/topn.py 212 1 99%
lenskit/lenskit/parallel/__init__.py 4 0 100%
lenskit/lenskit/parallel/chunking.py 20 1 95%
lenskit/lenskit/parallel/config.py 65 8 88%
lenskit/lenskit/parallel/invoker.py 31 2 94%
lenskit/lenskit/parallel/pool.py 54 9 83%
lenskit/lenskit/parallel/sequential.py 22 0 100%
lenskit/lenskit/parallel/serialize.py 51 1 98%
lenskit/lenskit/parallel/worker.py 43 3 93%
lenskit/lenskit/pipeline/__init__.py 208 7 97%
lenskit/lenskit/pipeline/common.py 5 1 80%
lenskit/lenskit/pipeline/components.py 40 2 95%
lenskit/lenskit/pipeline/config.py 37 0 100%
lenskit/lenskit/pipeline/nodes.py 49 1 98%
lenskit/lenskit/pipeline/runner.py 84 1 99%
lenskit/lenskit/pipeline/state.py 40 8 80%
lenskit/lenskit/pipeline/types.py 79 4 95%
lenskit/lenskit/splitting/__init__.py 4 0 100%
lenskit/lenskit/splitting/holdout.py 56 4 93%
lenskit/lenskit/splitting/records.py 56 0 100%
lenskit/lenskit/splitting/split.py 27 6 78%
lenskit/lenskit/splitting/users.py 62 0 100%
lenskit/lenskit/topn.py 109 25 77%
lenskit/lenskit/types.py 38 12 68%
lenskit/lenskit/util/__init__.py 72 19 74%
lenskit/lenskit/util/envcheck.py 57 44 23%
lenskit/lenskit/util/logging.py 19 0 100%
lenskit/lenskit/util/random.py 26 3 88%
lenskit/lenskit/util/test.py 103 19 82%
lenskit/lenskit/util/timing.py 28 0 100%
TOTAL 4739 372 92%

github-actions bot commented Aug 8, 2024

The GitHub 🤖 has run the tests on your PR.

Covered 93.65% of diff (coverage changed 0.05% from 92.05% to 92.11%).

origin/main...HEAD, staged and unstaged changes
  • lenskit/lenskit/pipeline/__init__.py (91.8%): Missing lines 417,454,457,465
  • lenskit/lenskit/pipeline/components.py (86.7%): Missing lines 206,209
  • lenskit/lenskit/pipeline/config.py (100%)
  • lenskit/lenskit/pipeline/types.py (91.7%): Missing lines 137,156

Summary

  • Total: 126 lines
  • Missing: 8 lines
  • Coverage: 93%
Source Coverage Report
Name Stmts Miss Cover
lenskit-funksvd/lenskit/funksvd.py 187 8 96%
lenskit-hpf/lenskit/hpf.py 24 0 100%
lenskit-implicit/lenskit/implicit.py 94 9 90%
lenskit/lenskit/algorithms/__init__.py 67 8 88%
lenskit/lenskit/algorithms/als/__init__.py 3 0 100%
lenskit/lenskit/algorithms/als/common.py 128 2 98%
lenskit/lenskit/algorithms/als/explicit.py 121 3 98%
lenskit/lenskit/algorithms/als/implicit.py 112 1 99%
lenskit/lenskit/algorithms/basic.py 161 4 98%
lenskit/lenskit/algorithms/bias.py 150 3 98%
lenskit/lenskit/algorithms/knn/__init__.py 3 0 100%
lenskit/lenskit/algorithms/knn/item.py 303 17 94%
lenskit/lenskit/algorithms/knn/user.py 178 11 94%
lenskit/lenskit/algorithms/mf_common.py 61 0 100%
lenskit/lenskit/algorithms/ranking.py 75 11 85%
lenskit/lenskit/algorithms/svd.py 75 4 95%
lenskit/lenskit/batch/__init__.py 2 0 100%
lenskit/lenskit/batch/_predict.py 30 2 93%
lenskit/lenskit/batch/_recommend.py 46 4 91%
lenskit/lenskit/crossfold.py 136 2 99%
lenskit/lenskit/data/__init__.py 11 0 100%
lenskit/lenskit/data/checks.py 37 0 100%
lenskit/lenskit/data/dataset.py 365 19 95%
lenskit/lenskit/data/fetch.py 38 28 26%
lenskit/lenskit/data/items.py 182 13 93%
lenskit/lenskit/data/matrix.py 115 5 96%
lenskit/lenskit/data/movielens.py 96 18 81%
lenskit/lenskit/data/mtarray.py 57 3 95%
lenskit/lenskit/data/tables.py 25 0 100%
lenskit/lenskit/data/user.py 30 9 70%
lenskit/lenskit/data/vocab.py 84 7 92%
lenskit/lenskit/diagnostics.py 4 0 100%
lenskit/lenskit/math/__init__.py 0 0 100%
lenskit/lenskit/math/solve.py 6 0 100%
lenskit/lenskit/metrics/__init__.py 0 0 100%
lenskit/lenskit/metrics/predict.py 32 0 100%
lenskit/lenskit/metrics/topn.py 212 1 99%
lenskit/lenskit/parallel/__init__.py 4 0 100%
lenskit/lenskit/parallel/chunking.py 20 1 95%
lenskit/lenskit/parallel/config.py 65 8 88%
lenskit/lenskit/parallel/invoker.py 31 2 94%
lenskit/lenskit/parallel/pool.py 54 9 83%
lenskit/lenskit/parallel/sequential.py 22 0 100%
lenskit/lenskit/parallel/serialize.py 51 1 98%
lenskit/lenskit/parallel/worker.py 43 3 93%
lenskit/lenskit/pipeline/__init__.py 233 11 95%
lenskit/lenskit/pipeline/common.py 5 1 80%
lenskit/lenskit/pipeline/components.py 40 2 95%
lenskit/lenskit/pipeline/config.py 38 0 100%
lenskit/lenskit/pipeline/nodes.py 49 1 98%
lenskit/lenskit/pipeline/runner.py 84 1 99%
lenskit/lenskit/pipeline/state.py 40 8 80%
lenskit/lenskit/pipeline/types.py 79 4 95%
lenskit/lenskit/splitting/__init__.py 4 0 100%
lenskit/lenskit/splitting/holdout.py 56 4 93%
lenskit/lenskit/splitting/records.py 56 0 100%
lenskit/lenskit/splitting/split.py 27 6 78%
lenskit/lenskit/splitting/users.py 62 0 100%
lenskit/lenskit/topn.py 109 25 77%
lenskit/lenskit/types.py 38 12 68%
lenskit/lenskit/util/__init__.py 72 19 74%
lenskit/lenskit/util/envcheck.py 57 44 23%
lenskit/lenskit/util/logging.py 19 0 100%
lenskit/lenskit/util/random.py 26 3 88%
lenskit/lenskit/util/test.py 103 19 82%
lenskit/lenskit/util/timing.py 28 0 100%
TOTAL 4765 376 92%

github-actions bot commented Aug 8, 2024

The GitHub 🤖 has run the tests on your PR.

Covered 94.84% of diff (coverage changed 0.10% from 92.05% to 92.16%).

origin/main...HEAD, staged and unstaged changes
  • lenskit/lenskit/pipeline/__init__.py (94.3%): Missing lines 446,502,505,513
  • lenskit/lenskit/pipeline/components.py (86.7%): Missing lines 206,209
  • lenskit/lenskit/pipeline/config.py (100%)
  • lenskit/lenskit/pipeline/types.py (91.7%): Missing lines 137,156

Summary

  • Total: 155 lines
  • Missing: 8 lines
  • Coverage: 94%
Source Coverage Report
Name Stmts Miss Cover
lenskit-funksvd/lenskit/funksvd.py 187 8 96%
lenskit-hpf/lenskit/hpf.py 24 0 100%
lenskit-implicit/lenskit/implicit.py 94 9 90%
lenskit/lenskit/algorithms/__init__.py 67 8 88%
lenskit/lenskit/algorithms/als/__init__.py 3 0 100%
lenskit/lenskit/algorithms/als/common.py 128 2 98%
lenskit/lenskit/algorithms/als/explicit.py 121 3 98%
lenskit/lenskit/algorithms/als/implicit.py 112 1 99%
lenskit/lenskit/algorithms/basic.py 161 4 98%
lenskit/lenskit/algorithms/bias.py 150 3 98%
lenskit/lenskit/algorithms/knn/__init__.py 3 0 100%
lenskit/lenskit/algorithms/knn/item.py 303 17 94%
lenskit/lenskit/algorithms/knn/user.py 178 11 94%
lenskit/lenskit/algorithms/mf_common.py 61 0 100%
lenskit/lenskit/algorithms/ranking.py 75 11 85%
lenskit/lenskit/algorithms/svd.py 75 4 95%
lenskit/lenskit/batch/__init__.py 2 0 100%
lenskit/lenskit/batch/_predict.py 30 2 93%
lenskit/lenskit/batch/_recommend.py 46 4 91%
lenskit/lenskit/crossfold.py 136 2 99%
lenskit/lenskit/data/__init__.py 11 0 100%
lenskit/lenskit/data/checks.py 37 0 100%
lenskit/lenskit/data/dataset.py 365 19 95%
lenskit/lenskit/data/fetch.py 38 28 26%
lenskit/lenskit/data/items.py 182 13 93%
lenskit/lenskit/data/matrix.py 115 5 96%
lenskit/lenskit/data/movielens.py 96 18 81%
lenskit/lenskit/data/mtarray.py 57 3 95%
lenskit/lenskit/data/tables.py 25 0 100%
lenskit/lenskit/data/user.py 30 9 70%
lenskit/lenskit/data/vocab.py 84 7 92%
lenskit/lenskit/diagnostics.py 4 0 100%
lenskit/lenskit/math/__init__.py 0 0 100%
lenskit/lenskit/math/solve.py 6 0 100%
lenskit/lenskit/metrics/__init__.py 0 0 100%
lenskit/lenskit/metrics/predict.py 32 0 100%
lenskit/lenskit/metrics/topn.py 212 1 99%
lenskit/lenskit/parallel/__init__.py 4 0 100%
lenskit/lenskit/parallel/chunking.py 20 1 95%
lenskit/lenskit/parallel/config.py 65 8 88%
lenskit/lenskit/parallel/invoker.py 31 2 94%
lenskit/lenskit/parallel/pool.py 54 9 83%
lenskit/lenskit/parallel/sequential.py 22 0 100%
lenskit/lenskit/parallel/serialize.py 51 1 98%
lenskit/lenskit/parallel/worker.py 43 3 93%
lenskit/lenskit/pipeline/__init__.py 253 11 96%
lenskit/lenskit/pipeline/common.py 5 1 80%
lenskit/lenskit/pipeline/components.py 40 2 95%
lenskit/lenskit/pipeline/config.py 46 0 100%
lenskit/lenskit/pipeline/nodes.py 49 1 98%
lenskit/lenskit/pipeline/runner.py 84 1 99%
lenskit/lenskit/pipeline/state.py 40 8 80%
lenskit/lenskit/pipeline/types.py 79 4 95%
lenskit/lenskit/splitting/__init__.py 4 0 100%
lenskit/lenskit/splitting/holdout.py 56 4 93%
lenskit/lenskit/splitting/records.py 56 0 100%
lenskit/lenskit/splitting/split.py 27 6 78%
lenskit/lenskit/splitting/users.py 62 0 100%
lenskit/lenskit/topn.py 109 25 77%
lenskit/lenskit/types.py 38 12 68%
lenskit/lenskit/util/__init__.py 72 19 74%
lenskit/lenskit/util/envcheck.py 57 44 23%
lenskit/lenskit/util/logging.py 19 0 100%
lenskit/lenskit/util/random.py 26 3 88%
lenskit/lenskit/util/test.py 103 19 82%
lenskit/lenskit/util/timing.py 28 0 100%
TOTAL 4793 376 92%

@mdekstrand mdekstrand marked this pull request as ready for review August 8, 2024 19:51

github-actions bot commented Aug 8, 2024

The GitHub 🤖 has run the tests on your PR.

Covered 94.08% of diff (coverage changed 0.07% from 92.05% to 92.13%).

origin/main...HEAD, staged and unstaged changes
  • lenskit/lenskit/pipeline/__init__.py (94.2%): Missing lines 470,530,533,541
  • lenskit/lenskit/pipeline/components.py (86.7%): Missing lines 206,209
  • lenskit/lenskit/pipeline/config.py (100%)
  • lenskit/lenskit/pipeline/nodes.py (66.7%): Missing lines 101,106
  • lenskit/lenskit/pipeline/runner.py (100%)
  • lenskit/lenskit/pipeline/types.py (91.7%): Missing lines 137,156

Summary

  • Total: 169 lines
  • Missing: 10 lines
  • Coverage: 94%
Source Coverage Report
Name Stmts Miss Cover
lenskit-funksvd/lenskit/funksvd.py 187 8 96%
lenskit-hpf/lenskit/hpf.py 24 0 100%
lenskit-implicit/lenskit/implicit.py 94 9 90%
lenskit/lenskit/algorithms/__init__.py 67 8 88%
lenskit/lenskit/algorithms/als/__init__.py 3 0 100%
lenskit/lenskit/algorithms/als/common.py 128 2 98%
lenskit/lenskit/algorithms/als/explicit.py 121 3 98%
lenskit/lenskit/algorithms/als/implicit.py 112 1 99%
lenskit/lenskit/algorithms/basic.py 161 4 98%
lenskit/lenskit/algorithms/bias.py 150 3 98%
lenskit/lenskit/algorithms/knn/__init__.py 3 0 100%
lenskit/lenskit/algorithms/knn/item.py 303 17 94%
lenskit/lenskit/algorithms/knn/user.py 178 11 94%
lenskit/lenskit/algorithms/mf_common.py 61 0 100%
lenskit/lenskit/algorithms/ranking.py 75 11 85%
lenskit/lenskit/algorithms/svd.py 75 4 95%
lenskit/lenskit/batch/__init__.py 2 0 100%
lenskit/lenskit/batch/_predict.py 30 2 93%
lenskit/lenskit/batch/_recommend.py 46 4 91%
lenskit/lenskit/crossfold.py 136 2 99%
lenskit/lenskit/data/__init__.py 11 0 100%
lenskit/lenskit/data/checks.py 37 0 100%
lenskit/lenskit/data/dataset.py 365 19 95%
lenskit/lenskit/data/fetch.py 38 28 26%
lenskit/lenskit/data/items.py 182 13 93%
lenskit/lenskit/data/matrix.py 115 5 96%
lenskit/lenskit/data/movielens.py 96 18 81%
lenskit/lenskit/data/mtarray.py 57 3 95%
lenskit/lenskit/data/tables.py 25 0 100%
lenskit/lenskit/data/user.py 30 9 70%
lenskit/lenskit/data/vocab.py 84 7 92%
lenskit/lenskit/diagnostics.py 4 0 100%
lenskit/lenskit/math/__init__.py 0 0 100%
lenskit/lenskit/math/solve.py 6 0 100%
lenskit/lenskit/metrics/__init__.py 0 0 100%
lenskit/lenskit/metrics/predict.py 32 0 100%
lenskit/lenskit/metrics/topn.py 212 1 99%
lenskit/lenskit/parallel/__init__.py 4 0 100%
lenskit/lenskit/parallel/chunking.py 20 1 95%
lenskit/lenskit/parallel/config.py 65 8 88%
lenskit/lenskit/parallel/invoker.py 31 2 94%
lenskit/lenskit/parallel/pool.py 54 9 83%
lenskit/lenskit/parallel/sequential.py 22 0 100%
lenskit/lenskit/parallel/serialize.py 51 1 98%
lenskit/lenskit/parallel/worker.py 43 3 93%
lenskit/lenskit/pipeline/__init__.py 251 11 96%
lenskit/lenskit/pipeline/common.py 5 1 80%
lenskit/lenskit/pipeline/components.py 40 2 95%
lenskit/lenskit/pipeline/config.py 52 0 100%
lenskit/lenskit/pipeline/nodes.py 54 3 94%
lenskit/lenskit/pipeline/runner.py 84 1 99%
lenskit/lenskit/pipeline/state.py 40 8 80%
lenskit/lenskit/pipeline/types.py 79 4 95%
lenskit/lenskit/splitting/__init__.py 4 0 100%
lenskit/lenskit/splitting/holdout.py 56 4 93%
lenskit/lenskit/splitting/records.py 56 0 100%
lenskit/lenskit/splitting/split.py 27 6 78%
lenskit/lenskit/splitting/users.py 62 0 100%
lenskit/lenskit/topn.py 109 25 77%
lenskit/lenskit/types.py 38 12 68%
lenskit/lenskit/util/__init__.py 72 19 74%
lenskit/lenskit/util/envcheck.py 57 44 23%
lenskit/lenskit/util/logging.py 19 0 100%
lenskit/lenskit/util/random.py 26 3 88%
lenskit/lenskit/util/test.py 103 19 82%
lenskit/lenskit/util/timing.py 28 0 100%
TOTAL 4802 378 92%

@mdekstrand mdekstrand merged commit 1411196 into lenskit:main Aug 8, 2024
38 checks passed
@mdekstrand mdekstrand deleted the feature/pipeline-serialization branch August 8, 2024 20:52
@mdekstrand mdekstrand added the dependencies (Pull requests that update a dependency file) and pipeline (LensKit pipeline abstraction) labels and removed the internals (Internal infrastructure: parallelism, math, etc.) and dependencies (Pull requests that update a dependency file) labels Aug 9, 2024
Labels
dependencies Pull requests that update a dependency file pipeline LensKit pipeline abstraction
Projects
Status: Done