.gz from Data Registry displays error #353

Open
sorenabell opened this issue Sep 9, 2021 · 5 comments
Labels
bug Something isn't working
Milestone

Comments

@sorenabell
Contributor

/data/storage/exporter_dumps/australia_nsw/155/full.jsonl.gz - flatten tool complains about the file format

@sorenabell sorenabell added this to the 2.0 milestone Sep 9, 2021
@jpmckinney
Member

jpmckinney commented Sep 23, 2021

From the Sep 16 call, I think Quinta was going to propose the text for the error message, for OCP to review – assuming this is about inconsistent data types across different releases.

@jpmckinney
Member

jpmckinney commented Sep 24, 2021

My thinking is that we might need additional processing steps in Kingfisher Process to perform corrections for known problems (e.g. normalizing a field so that it is always an array, or always an object, or always a literal value). As with other steps (e.g. check), we can configure the Kingfisher Collect spider to opt in to these corrections.
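
To illustrate the kind of correction meant here, a minimal sketch of normalizing a field so that it is always an array. This is not Kingfisher Process code: the function name `ensure_array` and the sample field are hypothetical, and it only handles a top-level field.

```python
def ensure_array(release, field):
    """Return a shallow copy of `release` in which `field`, if present
    and not None, is always a list (a lone value is wrapped in a list)."""
    fixed = dict(release)
    value = fixed.get(field)
    # Leave missing or null fields alone; only wrap non-list values.
    if value is not None and not isinstance(value, list):
        fixed[field] = [value]
    return fixed
```

For example, `ensure_array({"awards": {"id": "1"}}, "awards")` would give `{"awards": [{"id": "1"}]}`, while a release whose `awards` is already a list is returned unchanged.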

In the meantime, we just need a clear error message, prompting the user to contact us, so that we can address the error. We should also log something, so that we can check the logs for which datasets encountered this error.
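
A rough sketch of what "clear message plus a log entry" could look like, using only the standard `logging` module. The logger name, the `flatten_or_report` wrapper, its arguments and the placeholder message text are all assumptions for illustration, not the flatten tool's actual code.

```python
import logging

# Hypothetical logger name; the real application would use its own.
logger = logging.getLogger("flatten_errors")

# Placeholder wording only; the actual text is still to be agreed.
USER_MESSAGE = (
    "This file could not be flattened. "
    "Please contact us so that we can address the error."
)


def flatten_or_report(flatten, path, dataset_id):
    """Call `flatten(path)`; on failure, log the dataset and return a
    user-facing message instead of raising.

    Returns a (result, message) pair; exactly one of the two is None.
    """
    try:
        return flatten(path), None
    except Exception:
        # logger.exception records the traceback, so the logs show which
        # datasets hit the error and why.
        logger.exception("Flattening failed for dataset %s (%s)", dataset_id, path)
        return None, USER_MESSAGE
```

Logging the dataset identifier alongside the path is what would let us search the logs for affected datasets later.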

FYI @yolile

@sorenabell
Contributor Author

@jpmckinney The proposed message is as follows:

This file is not compliant with the OCDS schema so cannot be flattened. Check your data using the Data Review Tool and resolve the issues before flattening.

If we are able to identify the source of the issue, we propose adding one more sentence to the message to specify it:

This file is not compliant with the OCDS schema so cannot be flattened. Check your data using the Data Review Tool and resolve the issues before flattening. The issue occurred during processing of the item with ocds-03ad3f-391507-1 ocid.

@yolile
Member

yolile commented Oct 5, 2021

Currently, the flatten tool is failing for all the datasets that I tried from the data registry (e.g. https://flatten.open-contracting.org/#/upload-file?lang=es&url=6f6b59d0-b2e5-4641-94f2-d37b2da5c13a, https://flatten.open-contracting.org/#/upload-file?lang=es&url=a8409b70-b4e3-4d55-a919-9d51d57d292f). The logged error is the same as the one reported in open-contracting/spoonbill#195.

@jpmckinney
Member

> @jpmckinney The proposed message is as follows:
>
> This file is not compliant with the OCDS schema so cannot be flattened. Check your data using the Data Review Tool and resolve the issues before flattening.
>
> If we are able to identify the source of the issue, we propose adding one more sentence to the message to specify it:
>
> This file is not compliant with the OCDS schema so cannot be flattened. Check your data using the Data Review Tool and resolve the issues before flattening. The issue occurred during processing of the item with ocds-03ad3f-391507-1 ocid.

Looks good, just change the second version to end with ocid '{ocid}' instead of {ocid} ocid.
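
A small sketch of how the two versions of the message could be produced with the requested ending, assuming a hypothetical helper `build_error_message` (the name and signature are illustrative, not spoonbill's API):

```python
BASE_MESSAGE = (
    "This file is not compliant with the OCDS schema so cannot be flattened. "
    "Check your data using the Data Review Tool and resolve the issues "
    "before flattening."
)


def build_error_message(ocid=None):
    """Return the base message, plus the ocid sentence when the ocid is known."""
    if ocid is None:
        return BASE_MESSAGE
    # Ends with ocid '{ocid}' rather than {ocid} ocid, as requested above.
    return (
        f"{BASE_MESSAGE} The issue occurred during processing of the item "
        f"with ocid '{ocid}'."
    )
```

Calling `build_error_message("ocds-03ad3f-391507-1")` would end with `with ocid 'ocds-03ad3f-391507-1'.`, and calling it with no argument gives the first version unchanged.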

@jpmckinney jpmckinney added the bug Something isn't working label Dec 15, 2021

4 participants