Explore use of Datalad for storing datasets #18

Open
trevorb1 opened this issue Mar 28, 2023 · 0 comments
Comments

@trevorb1
Member

It might be useful to explore DataLad for pulling large datasets and for versioning them. I am not sure DataLad fits our use case perfectly, but I think it is still worth exploring.

This site gives an overview of how DataLad works with git-annex. In particular, this section of the site describes how to "publish a dataset on GitHub with publicly-accessible annexed files"; the key point is that these files are not downloaded locally automatically. We would still need somewhere to store the files themselves, but this may ease the process for large datasets.
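As a rough illustration of the "not downloaded automatically" point, here is a minimal Python sketch of how a consumer might clone a dataset and then fetch only the files they need. It only wraps the `datalad clone` and `datalad get` commands; the repository URL, the file path, and the helper names `build_fetch_plan` and `run_plan` are hypothetical placeholders, not anything from this project. By default it only prints the commands (a dry run), so it does not need DataLad installed.

```python
import shutil
import subprocess
from pathlib import Path


def build_fetch_plan(source_url, target_dir, wanted_files):
    """Return (argv, cwd) pairs: one clone, then one `datalad get` per file."""
    target = Path(target_dir)
    # Cloning brings down the git history and file pointers, not the annexed content.
    plan = [(["datalad", "clone", source_url, str(target)], None)]
    for rel_path in wanted_files:
        # `datalad get` is run from inside the dataset to fetch one file's content.
        plan.append((["datalad", "get", rel_path], str(target)))
    return plan


def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    if not dry_run and shutil.which("datalad") is None:
        raise RuntimeError("datalad executable not found on PATH")
    for argv, cwd in plan:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)


plan = build_fetch_plan(
    "https://github.com/example-org/example-dataset.git",  # hypothetical URL
    "example-dataset",
    ["data/large_file.csv"],  # hypothetical file path
)
run_plan(plan, dry_run=True)
```

Calling `run_plan(plan, dry_run=False)` would actually run the commands, assuming DataLad and git-annex are installed and the annexed files are hosted somewhere publicly reachable.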

More information on DataLad can be found here:
Website: https://www.datalad.org/
GitHub: https://github.com/datalad/datalad
Documentation: http://handbook.datalad.org/en/latest/index.html
Introduction presentation: https://training.westdri.ca/materials/datalad_for_hpc_1_1.pdf
