cellxgene-schema CLI must validate schema_version #607

Closed
brianraymor opened this issue Aug 23, 2023 · 4 comments
Labels
4.0 Next major CELLxGENE schema version curation software dp Data Platform Team work

Comments

@brianraymor
Contributor

brianraymor commented Aug 23, 2023

Context

See single-cell-four.

As part of automated migration, the schema transitioned from curators annotating schema_version to CELLxGENE Discover annotating schema_version. For 3.1, schema_version is temporarily overwritten as part of the transition:

When a dataset is uploaded, CELLxGENE Discover MUST automatically add the schema_version key and its value to uns. If schema_version is already defined, then its value MUST be overwritten.

In 4.0, curators must not annotate this field. If schema_version is provided, validation should fail with an error, replacing the current behavior of overwriting the value.
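
The 4.0 rule above could be sketched roughly as follows. This is only an illustration, not the cellxgene-schema implementation: the function name `check_schema_version_absent` and the error text are hypothetical, and `uns` is assumed to behave like a plain dict. The check tests for the key itself rather than its value, so an empty or null value would still be rejected.

```python
def check_schema_version_absent(uns):
    """Return a list of error messages; the list is empty when the check passes.

    Hypothetical helper for illustration only, not part of cellxgene-schema.
    """
    errors = []
    # Test for the key, not its value, so a falsy value (None, "") still fails.
    if "schema_version" in uns:
        errors.append(
            "'schema_version' must not be annotated in uns; "
            "it is added by CELLxGENE Discover on upload."
        )
    return errors


# Usage: a dataset without the key passes, one with the key does not.
print(check_schema_version_absent({"title": "example"}))
print(check_schema_version_absent({"schema_version": "3.1.0"}))
```

Returning a list of messages instead of raising keeps the sketch easy to combine with other checks; a real validator may report errors differently.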

@nayib-jose-gloria
Contributor

@jahilton Ready for testing! For all testing, you can pull the latest release candidate version of 4.0.0 by running:
pip install git+https://github.com/chanzuckerberg/single-cell-curation/@main#subdirectory=cellxgene_schema_cli

@jahilton
Collaborator

Unexpected behavior: the file will pass validation if uns['schema_version'] = None
Expected to fail because 'schema_version' is present in uns.

The sample adata was created by setting adata.uns['schema_version'] = None and writing to .h5ad.

@nayib-jose-gloria
Contributor

nayib-jose-gloria commented Sep 14, 2023

@jahilton mind double checking that the uns['schema_version'] = None actually was written to the h5ad? I tried to recreate locally, and it seems like anndata doesn't actually save dict keys with None values to the output h5ad when you use the '.write' function (someone else reported this issue here)

@jahilton
Collaborator

You are correct. It drops when saving so the tested .h5ad didn't actually have uns.schema_version.
In that case, all clear!
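
One way to confirm what actually lands in the file is to write a tiny AnnData object and read it back. The sketch below assumes anndata and numpy are installed; the helper name `uns_key_survives_write` is made up for illustration. Whether a None value is kept or dropped may depend on the installed anndata version, so the result is not stated here.

```python
import tempfile
from pathlib import Path


def uns_key_survives_write(value, key="schema_version"):
    """Write a tiny AnnData with uns[key] = value, read the file back,
    and report whether the key is still present in uns.

    Hypothetical helper for illustration only.
    """
    # Third-party imports are kept local so the helper can be defined
    # even where anndata is not installed.
    import anndata as ad
    import numpy as np

    adata = ad.AnnData(X=np.zeros((2, 2), dtype=np.float32))
    adata.uns[key] = value

    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "roundtrip.h5ad"
        adata.write_h5ad(path)
        reloaded = ad.read_h5ad(path)
        return key in reloaded.uns


if __name__ == "__main__":
    # The thread above reports that a None value was dropped on write.
    print(uns_key_survives_write(None))
    print(uns_key_survives_write("3.1.0"))
```

Running this before testing the validator makes it clear whether the test file really contains the key under test.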
