Allow missing/NA-filled columns for sdrf2openms conversion #173

Open
jpfeuffer opened this issue Sep 23, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@jpfeuffer
Collaborator

jpfeuffer commented Sep 23, 2024

In quantms, for simple experiments, I would like to allow using the SDRF only to specify the experimental design (incl. channels, replicates, fractions), not the mass-spec-related settings, and to use Nextflow parameters to specify the rest.
This is mainly because it is much easier to quickly set the enzyme, modifications, tolerances etc. in the Nextflow web UI than to edit a TSV file with ontology names etc.

For this, the first step would be to allow those three columns to be missing, or their entries to be "Not Available", in the "sdrf to openms" conversion tool, because we use the config file that it produces to fill the meta information channel in Nextflow.

As soon as that works, I could change the create_input_channel module to check for missing values in the openms.tsv and use the Nextflow params as a fallback.

As discussed with @ypriverol

@jpfeuffer jpfeuffer added the enhancement New feature or request label Sep 23, 2024
@jpfeuffer
Collaborator Author

I guess what I am asking for is a "validator" for certain groups of columns that treats them as optional.
I.e., if the instrument type column is present, validate it; if it is absent, that is fine too.
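
To make the idea concrete, here is a minimal sketch of such an optional-column check. It is not the sdrf-pipelines API: `validate_optional_columns`, `is_non_empty` and the table-as-list-of-dicts representation are hypothetical names chosen for illustration, and `comment[instrument]` is only used as an example column.

```python
from typing import Callable, Dict, List

# Hypothetical types for illustration; not part of sdrf-pipelines.
Row = Dict[str, str]
Validator = Callable[[str], bool]


def validate_optional_columns(rows: List[Row], validators: Dict[str, Validator]) -> List[str]:
    """Validate each listed column only if it is present in the table.

    Returns a list of error messages; a column that is absent from the
    table is skipped instead of being reported as an error.
    """
    errors: List[str] = []
    if not rows:
        return errors
    present = set(rows[0].keys())
    for column, is_valid in validators.items():
        if column not in present:
            continue  # optional column is missing: nothing to validate
        for index, row in enumerate(rows, start=1):
            value = row[column]
            if not is_valid(value):
                errors.append(f"row {index}: invalid value {value!r} in column {column!r}")
    return errors


def is_non_empty(value: str) -> bool:
    """Toy validator (an assumption): accept any non-blank entry."""
    return value.strip() != ""


# The instrument column is absent here, so the table passes.
design_only = [{"source name": "sample 1"}]
no_errors = validate_optional_columns(design_only, {"comment[instrument]": is_non_empty})

# The instrument column is present but blank, so it is validated and flagged.
with_blank = [{"source name": "sample 1", "comment[instrument]": ""}]
one_error = validate_optional_columns(with_blank, {"comment[instrument]": is_non_empty})
```

A real validator would check the ontology term rather than just non-emptiness; the point is only that presence of the column decides whether validation runs at all.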

I am happy to use my own validators in our downstream code if you can give me some hints on how to do that.

I would also like to change the composition of the column groups: e.g., I believe that things like technical replicate and fraction identifier belong in the "experimental design" group of validators.

@jpfeuffer
Collaborator Author

jpfeuffer commented Nov 4, 2024

Alternatively, I could put NOT AVAILABLE in all rows of those columns, but I think the openms-convert functionality does not correctly handle columns with NOT AVAILABLE, since it will try to fill them with some defaults:
https://github.com/bigbio/sdrf-pipelines/blob/main/sdrf_pipelines/openms/openms.py#L307

As far as I can see, it only works with missing columns.

If we could somehow pass the missingness through to the openms_config.tsv file, we could then check here:
https://github.com/bigbio/quantms/blob/dev/subworkflows/local/create_input_channel.nf#L114
whether those columns exist and, if not, warn and fall back to the Nextflow params.
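
Roughly, the fallback logic could look like the sketch below. It is written in Python only to illustrate the decision; the real check would live in the Groovy code of create_input_channel.nf. The function `resolve_setting`, the column name `Enzyme` and the parameter name `enzyme` are assumptions made for this example, not the actual openms_config.tsv header or quantms parameter names.

```python
import logging
from typing import Dict

logger = logging.getLogger("sdrf_fallback_sketch")

# Marker for a deliberately unfilled SDRF entry, compared case-insensitively.
NOT_AVAILABLE = "not available"


def resolve_setting(row: Dict[str, str], column: str,
                    params: Dict[str, str], param_name: str) -> str:
    """Return the value from the config row, or fall back to the pipeline param.

    The fallback is used when the column is missing, blank, or holds the
    "not available" marker; a warning is logged in that case.
    """
    value = row.get(column)
    if value is None or value.strip() == "" or value.strip().lower() == NOT_AVAILABLE:
        logger.warning("Column %r missing or not available; using param %r instead",
                       column, param_name)
        return params[param_name]
    return value


# Hypothetical pipeline parameters, standing in for the Nextflow params.
params = {"enzyme": "Trypsin"}

from_param = resolve_setting({"Enzyme": "NOT AVAILABLE"}, "Enzyme", params, "enzyme")
from_sdrf = resolve_setting({"Enzyme": "Lys-C"}, "Enzyme", params, "enzyme")
```

This only works if the conversion step leaves the missing or NOT AVAILABLE entries untouched instead of replacing them with defaults.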

@jpfeuffer
Collaborator Author

@ypriverol @daichengxin Any thoughts on this?
