Process latest DS-Connect export as unharmonized/pre-harmonized data #160

Closed
lopierra opened this issue Feb 28, 2024 · 3 comments

Comments

@lopierra
Member

DS-Connect sent last export on 2/26.

  • uploaded to Synapse as-is (file_type = source_clinical and dictionary)
  • Fill out Study/Dataset template for unharmonized clinical dataset
  • check participant IDs and get dewrangle IDs (match to v3 as needed)
  • Clean up for use as unharmonized data (remove GUIDs, remove Acknowledgments line, check for dates)
  • Determine how to deal with any redacted cols (e.g. GUID) in dictionary
@lopierra
Member Author

Part of #159

@lopierra
Member Author

DECISIONS:

Per convo w/ Sujata, 2024-04-11:

  • Leave in birth year (even though redundant)
  • Remove survey timestamp (full dates not allowed per DCC policy)
  • Delete zip code (unsure whether Invitae did any cleaning of areas with <20k people;
    the data are still useful with just state & country)
  • Delete GUID from demographic & IHQ data
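
The decisions above amount to dropping a fixed set of columns and leaving the rest (including birth year) untouched. A minimal sketch of that step, assuming the export is a CSV; the column names `guid`, `survey_timestamp`, and `zip_code` are hypothetical placeholders, not the real DS-Connect headers:

```python
import csv
import io

# Hypothetical column names; the real export headers may differ.
REDACTED_COLUMNS = {"guid", "survey_timestamp", "zip_code"}


def redact_csv(text, redacted=REDACTED_COLUMNS):
    """Return the CSV text with the redacted columns removed.

    Matching is case-insensitive; all other columns (e.g. birth year,
    state, country) are kept in their original order.
    """
    redacted_lower = {name.lower() for name in redacted}
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    kept = [name for name in (reader.fieldnames or [])
            if name.lower() not in redacted_lower]

    out = io.StringIO()
    # extrasaction="ignore" lets the writer skip the dropped columns.
    writer = csv.DictWriter(out, fieldnames=kept, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

The same function could be run over both the demographic and IHQ files, since the GUID is removed from both.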

ID usage - per email thread 2024-04-24:

  • Data Hub UI: De-identified Patient IDs
  • Unharmonized data download: Original and de-identified Patient IDs
  • GUID mapping file: De-identified IDs (to match portal) and GUIDs
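
The ID rules above can be written down as a small lookup from output to the identifiers it may carry. This is only an illustrative sketch; the output keys and field names (`original_id`, `deidentified_id`, `guid`) are assumptions made for the example:

```python
# Hypothetical output keys and field names for the three kinds of identifier.
ID_FIELDS_BY_OUTPUT = {
    "data_hub_ui": ["deidentified_id"],
    "unharmonized_download": ["original_id", "deidentified_id"],
    "guid_mapping": ["deidentified_id", "guid"],
}


def ids_for_output(record, output):
    """Return only the identifier fields allowed for the given output."""
    return {field: record[field] for field in ID_FIELDS_BY_OUTPUT[output]}
```

Keeping the de-identified ID in every output is what lets the GUID mapping file be matched back to the portal.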

Unharmonized dictionary - strike through the rows for the fields above and put [REDACTED] in the field name

@lopierra
Member Author

Unharmonized demographic/IHQ data, dictionary, and dataset manifest are saved here and have been handed off to FHIR/portal teams.
