Support sparse matrix formats #1

daemontus · 2022-11-30T12:29:41Z

Hi!

Is there a plan to support some form of sparse matrix format in the future? For example, Cell Ranger typically stores RNA counts as rows of the form "gene_id cell_id count", omitting any combination of cell_id and gene_id with count=0 (and the file is sorted by gene_id, so it is easy to find the data for a specific gene).
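To make the format concrete, here is a minimal sketch of reading such rows without ever building the full matrix. It assumes whitespace-separated integer triplets, lines starting with `%` as comments, and that any size header line of a real .mtx file has already been stripped; the function names `parse_triplets` and `counts_by_gene` are hypothetical and not part of any existing API.

```python
from itertools import groupby
from typing import Dict, Iterable, Iterator, Tuple


def parse_triplets(lines: Iterable[str]) -> Iterator[Tuple[int, int, int]]:
    """Yield (gene_id, cell_id, count) from whitespace-separated rows."""
    for line in lines:
        line = line.strip()
        # Skip blank lines and comment lines (assumed to start with '%').
        if not line or line.startswith("%"):
            continue
        gene_id, cell_id, count = line.split()
        yield int(gene_id), int(cell_id), int(count)


def counts_by_gene(lines: Iterable[str]) -> Iterator[Tuple[int, Dict[int, int]]]:
    """Group rows by gene_id, relying on the input being sorted by gene_id.

    Each gene is yielded once with a {cell_id: count} mapping that holds
    only the non-zero entries, so memory use is bounded by a single gene.
    """
    triplets = parse_triplets(lines)
    for gene_id, group in groupby(triplets, key=lambda t: t[0]):
        yield gene_id, {cell_id: count for _, cell_id, count in group}


# Hypothetical input: two genes, three non-zero counts.
example = ["% hypothetical comment line", "1 1 5", "1 3 2", "2 2 7"]
result = dict(counts_by_gene(example))
```

Because the rows are sorted by gene_id, `groupby` can stream through the file in one pass; an unsorted file would need a different approach (for example, accumulating into a dictionary keyed by gene).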

Of course, I can relatively easily transform this into a full count matrix, but needing to build the full matrix for large real-world datasets is a bit impractical. For example, a ~5GB .mtx file from Cell Ranger with ~80,000 cells can easily become a ~15GB .tsv file.
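One way to avoid the full matrix is to expand a single gene's row only when it is actually needed. The sketch below assumes a gene's non-zero counts are held as a {cell_id: count} mapping and that cell ids are 1-based, as in Matrix Market files; `dense_row` is a hypothetical helper, not an existing function.

```python
from typing import Dict, List


def dense_row(sparse_row: Dict[int, int], n_cells: int) -> List[int]:
    """Expand one gene's non-zero counts into a full row of length n_cells."""
    row = [0] * n_cells
    for cell_id, count in sparse_row.items():
        # Assumption: cell ids are 1-based, so shift to a 0-based index.
        row[cell_id - 1] = count
    return row


# Hypothetical gene with non-zero counts in cells 1 and 3 out of 4 cells.
row = dense_row({1: 5, 3: 2}, n_cells=4)
```

This keeps peak memory at one row of n_cells values rather than the full genes × cells matrix.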

I think I'll be able to make a PR to support such sparse matrices, but if you're already working on something like that I don't want to step on your toes :)
