Support sparse matrix formats #1

Open
daemontus opened this issue Nov 30, 2022 · 0 comments


Hi!

Is there a plan to support some form of sparse matrix format in the future? For example, Cell Ranger typically stores RNA counts as rows of the form gene_id cell_id count, omitting every combination of cell_id and gene_id whose count is 0. The file is sorted by gene_id, so it is easy to find the data for a specific gene.
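
To make the layout concrete, here is a minimal Python sketch of how such a file could be consumed one gene at a time without ever building the full matrix. It is only an illustration, not Cell Ranger's reader or this project's API: the function name `iter_gene_rows` is hypothetical, and it assumes plain whitespace-separated triplets with no header lines (a real .mtx file has header lines that would need to be skipped first).

```python
from itertools import groupby
from typing import Dict, Iterable, Iterator, Tuple


def iter_gene_rows(lines: Iterable[str]) -> Iterator[Tuple[str, Dict[str, int]]]:
    """Yield (gene_id, {cell_id: count}) for each gene in a sorted triplet stream.

    Hypothetical helper: assumes each line is "gene_id cell_id count" and that
    the input is already sorted by gene_id.
    """
    # Split each non-empty line into its three fields lazily.
    triplets = (line.split() for line in lines if line.strip())
    # groupby only merges adjacent rows, which is sufficient because the
    # input is assumed to be sorted by gene_id.
    for gene_id, group in groupby(triplets, key=lambda t: t[0]):
        yield gene_id, {cell_id: int(count) for _, cell_id, count in group}


# Made-up example data; zero counts are simply absent.
example = [
    "gene_a cell_1 3",
    "gene_a cell_4 1",
    "gene_b cell_2 7",
]
rows = dict(iter_gene_rows(example))
```

Because the rows are grouped lazily, only one gene's non-zero counts are held in memory at a time, and any cell missing from a gene's dictionary has an implicit count of 0.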

Of course, I can fairly easily transform this into a full count matrix, but having to build the full matrix for large real-world datasets is impractical. For example, a ~5 GB .mtx file from Cell Ranger with ~80,000 cells can easily become a ~15 GB .tsv file.

I think I'll be able to make a PR to support such sparse matrices, but if you're already working on something like that I don't want to step on your toes :)
