This repository hosts the feature vector representations of the image data set used for similarity search. The resulting HDF5 files are orders of magnitude smaller than the raw images. Since the feature vectors are derived from those images, we should move the scripts/notebooks for downloading raw images to this repository.
Make the image feature vector download a separate step from checking out the repository or starting the Docker image.
Allow the user to point to the location of the feature vectors (e.g. a different location on disk, a location in the Docker container).
Q: Why?
A: Because users might want to utilize different parts of the pipeline, like sequencing analysis, that shouldn't require downloading the gigabytes of feature vector data.
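One way the configurable location could look is sketched below. This is only an illustration under stated assumptions: the environment variable name `FEATURE_VECTOR_DIR`, the default path `data/feature_vectors`, and both helper functions are hypothetical and do not exist in any of the repositories. The idea is that an explicit argument wins, then the environment variable (convenient for Docker), then a default, and that parts of the pipeline that never call these helpers never need the data.

```python
import os
from pathlib import Path

# Hypothetical names: neither the variable nor the default path comes from the repository.
ENV_VAR = "FEATURE_VECTOR_DIR"
DEFAULT_DIR = Path("data/feature_vectors")


def resolve_feature_dir(cli_value=None, env=None):
    """Return the directory expected to hold the HDF5 feature vectors.

    Precedence: explicit argument, then environment variable, then default.
    """
    env = os.environ if env is None else env
    chosen = cli_value or env.get(ENV_VAR) or DEFAULT_DIR
    return Path(chosen).expanduser()


def list_feature_files(feature_dir):
    """List HDF5 files, failing clearly if the data has not been downloaded."""
    feature_dir = Path(feature_dir)
    if not feature_dir.is_dir():
        raise FileNotFoundError(
            f"No feature vectors at {feature_dir}; run the download step "
            f"or set {ENV_VAR}."
        )
    # Only pick up files with common HDF5 extensions.
    return sorted(p for p in feature_dir.iterdir() if p.suffix in {".h5", ".hdf5"})
```

Inside a container, the same code would work by setting the (assumed) variable when starting it, for example with `docker run -e FEATURE_VECTOR_DIR=/mnt/features ...`.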
Open question: How do we want the similarity search repositories to access the feature vectors? What should we recommend to users checking out the repository who want to reproduce our results (perhaps even without downloading all the images from scratch)?
1 - Git submodules
2 - Manually specify locations (requires extra steps from the user: checking out the repository, running git lfs, etc.)
3 - As part of this pipeline, publish to external bucket
4 - Other approaches?
Here are the repositories that use this:
High Level
Low Level
Open Images
Hybridization Similarity Search
notebooks/01_datasets/01_download.ipynb
notebooks/01_datasets/02_extract_features.ipynb
Cas9 Similarity Search
notebooks/01_datasets/01_download.ipynb to Open Images
notebooks/01_datasets/02_extract_features.ipynb to Open Images
docker.sh and Dockerfile to Open Images