The concept of a data set is slightly overloaded. I recognize data sets:
that are collected in the project, and in which "quality assurance" is especially interested;
that are intermediary steps in the analysis;
that are products of the project, and will be distributed.
Is the model supposed to be able to document each of these? Are they mostly differentiated based on the fact that the "product" data sets contain "distribution"s? It may be good to add something about this to the documentation.
Currently, there is no field that explicitly states whether a dataset is reused or produced. The standard differentiates between existing and non-existing datasets by setting dates appropriately (see FAQ). For the time being, we are able to express what data is being "used" in the project without differentiating what existed before.
If you need to make explicit which data is reused, I see the following options:
Set Dataset type to "Reused data" to encapsulate all data reused.
OR
Use the description field of Dataset or Distribution to describe that the data is reused.
OR
(implicit way) If the reused dataset comes from a data repository, then one could read from the dataset's metadata when it was published and compare that date to the creation date of the DMP or the start date of the project (depending on the setting) to find out whether the dataset existed before the project started or the DMP was written.
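The implicit option could be sketched roughly as follows. This is only an illustration: the helper `existed_before` is hypothetical, the `issued` key on the dataset and the example dates are assumptions, and the reference date can be either the DMP creation date or the project start date, depending on the setting.

```python
from datetime import date
from typing import Optional


def existed_before(dataset: dict, reference_date: str) -> Optional[bool]:
    """Return True if the dataset was issued before reference_date.

    Returns None when the dataset carries no "issued" date (assumed key),
    since nothing can be inferred in that case.
    """
    issued = dataset.get("issued")
    if not issued:
        return None
    # Keep only the YYYY-MM-DD part so date and date-time strings both work.
    return date.fromisoformat(issued[:10]) < date.fromisoformat(reference_date[:10])


# Hypothetical values, for illustration only.
dmp_created = "2019-03-13T13:13:00"
print(existed_before({"title": "survey data", "issued": "2018-05-01"}, dmp_created))
print(existed_before({"title": "new measurements", "issued": "2020-01-15"}, dmp_created))
print(existed_before({"title": "planned data"}, dmp_created))
```

A `None` result is kept separate from `False` on purpose: a missing date means the question cannot be answered from the metadata alone, not that the dataset is new.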
I think we should add a point on this to the FAQ.
What you describe about having multiple distributions for one dataset to indicate different things is correct. For example, we can have a dataset with "survey data" that will have two distributions: "raw data" and "anonymised data". The first one is used in processing during the project and will be deleted afterwards. The second one will be published at the end of the project.
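As a rough sketch, the survey example might look like this when written out as a Python dictionary. The key names (`distribution`, `data_access`) and the values `"closed"` and `"open"` are assumptions made for illustration, and `published_titles` is a hypothetical helper, not part of the model.

```python
# One dataset with two distributions that serve different purposes.
survey = {
    "title": "survey data",
    "description": "Survey responses collected in the project",
    "distribution": [
        {
            "title": "raw data",
            "description": "Used in processing during the project; deleted afterwards",
            "data_access": "closed",  # assumed value: never published
        },
        {
            "title": "anonymised data",
            "description": "Published at the end of the project",
            "data_access": "open",  # assumed value: will be published
        },
    ],
}


def published_titles(dataset: dict) -> list:
    """Return the titles of distributions marked as openly accessible."""
    return [
        dist["title"]
        for dist in dataset.get("distribution", [])
        if dist.get("data_access") == "open"
    ]


print(published_titles(survey))
```

Here the intended use of each distribution lives in its description, while the access setting is what distinguishes the published product from the working copy.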
I think any attempt to create an ontology around the possible uses/purposes of data must be firmly out of scope for this project. I recommend that we do not try to address this at all - except in an FAQ suggesting that "intended use" might be described in the description for a dataset or distribution.