Add CCDH Pilot examples #22

Merged: 54 commits merged into main from add-pilot-examples on Jan 5, 2022

gaurav (Collaborator) commented on Oct 5, 2021

Adds the CCDH Pilots to the Example Data Repository. Some of these don't work, probably because of bugs in LinkML Runtime (#39). Others either can't be set up with the Python Data Classes or cannot be validated when they are (#40).

Changes requested:

  • Is there any documentation about where the YAML data files in ccdh-pilot came from? I think they were a Frankensteination of real values from various records in the GDC and PDC backends, performed by the Data Harmonization team. I think we should include some explanation or links in a README in this folder. (from @turbomam)

Should be merged after PR #41

turbomam (Member) commented on Dec 6, 2021

I ran `poetry update` and `poetry run pytest`, and all tests passed.

I get the following warning, as I do in some other LinkML-based projects:

head-and-mouth/test_load.py::test_transform_gdc_data
  /Users/MAM/Library/Caches/pypoetry/virtualenvs/crdch-example-workflows-ma-Vv354-py3.9/lib/python3.9/site-packages/rdflib_jsonld/__init__.py:9: DeprecationWarning: The rdflib-jsonld package has been integrated into rdflib as of rdflib==6.0.0. Please remove rdflib-jsonld from your project's dependencies.
    warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/warnings.html
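
The warning above says that rdflib-jsonld is redundant once rdflib 6.0.0 or later is installed. As a minimal sketch (not part of this PR, and with hypothetical helper names), the check below uses only the standard library to report whether an environment has both packages installed in that combination:

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def rdflib_jsonld_is_redundant(rdflib_version, jsonld_version) -> bool:
    """True when rdflib-jsonld is installed alongside rdflib 6.0.0 or later.

    Only the major version number is inspected, which is enough for the
    6.0.0 threshold named in the deprecation message.
    """
    if rdflib_version is None or jsonld_version is None:
        return False
    major = int(rdflib_version.split(".")[0])
    return major >= 6


if __name__ == "__main__":
    redundant = rdflib_jsonld_is_redundant(
        installed_version("rdflib"),
        installed_version("rdflib-jsonld"),
    )
    print("rdflib-jsonld can be removed" if redundant else "nothing to remove")
```

If the check reports that the package is redundant, the fix suggested by the message itself is to drop rdflib-jsonld from the project's dependencies; whether another dependency pulls it in transitively would still need to be checked.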

turbomam (Member) commented on Dec 6, 2021

Is there any documentation about where the YAML data files in ccdh-pilot came from? I think they were a Frankensteination of real values from various records in the GDC and PDC backends, performed by the Data Harmonization team. I think we should include some explanation or links in a README in this folder.

turbomam (Member) commented on Dec 6, 2021

@gaurav, when you say that some of the Pilot Demonstrators "don't work", do you mean that they don't validate on ingestion, or that they can't currently be created programmatically because of the decimal type incompatibility?

I'll try to answer this question myself now, and possibly write a test.

gaurav (Collaborator, Author) commented on Dec 6, 2021

> @gaurav, when you say that some of the Pilot Demonstrators "don't work", do you mean that they don't validate on ingestion, or that they can't currently be created programmatically because of the decimal type incompatibility?

Sorry, I should have been clearer! The YAML files included in this PR all "work" in that they pass validation, but I had to comment out some lines that either don't pass validation or that couldn't be loaded using the Python Data Classes. I think this is almost entirely because of cancerDHC/ccdhmodel#154, so I don't know that new tests are necessary. I have a list of all non-validating fields: #39
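
Since the non-validating lines were commented out rather than deleted, one way to keep track of them is to scan the YAML text for commented-out `key: value` entries. The following is a rough sketch under that assumption; the function name and the sample field names are hypothetical and are not taken from the ccdh-pilot files, and the regular expression is a heuristic that will also match prose comments such as `# Note: ...`.

```python
import re

# Heuristic: a comment marker, an optional list dash, then "key:".
COMMENTED_KEY = re.compile(r"^\s*#\s*(-\s+)?([A-Za-z_][\w.-]*)\s*:(\s|$)")


def commented_out_fields(yaml_text: str):
    """Return (line_number, key) pairs for commented-out `key: value` lines."""
    hits = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        match = COMMENTED_KEY.match(line)
        if match:
            hits.append((number, match.group(2)))
    return hits


if __name__ == "__main__":
    # Made-up sample record; the keys are illustrative only.
    sample = "\n".join([
        "id: example-1",
        "# value_decimal: 12.5",
        "name: demo",
        "  # - unit: cm",
        "# This is an ordinary note about the record",
    ])
    for number, key in commented_out_fields(sample):
        print(f"line {number}: {key}")
```

The resulting list could be compared against the fields tracked in #39 once the upstream issue is resolved, so that the commented-out lines can be restored.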

Base automatically changed from feature/replace-pipenv-with-poetry to main on January 5, 2022 22:00
gaurav merged commit 45bc7d3 into main on Jan 5, 2022
gaurav deleted the add-pilot-examples branch on January 5, 2022 23:01