Hi!

Is there a plan to support some form of sparse matrix format in the future? For example, Cell Ranger typically stores RNA counts as rows of `gene_id cell_id count`, where any combination of `cell_id` and `gene_id` with `count=0` is omitted (and the file is sorted by `gene_id`, so it is easy to find the data for a specific gene).
Of course, I can convert this into a full count matrix fairly easily, but building the full matrix for large real-world datasets is impractical. For example, a ~5 GB `.mtx` file from Cell Ranger with ~80,000 cells can easily become a ~15 GB `.tsv` file.
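To illustrate the idea, here is a minimal sketch of reading such a file one gene at a time without ever building the dense matrix. It assumes plain `gene_id cell_id count` rows sorted by `gene_id`, as described above; the function name `iter_gene_counts` and the toy input are hypothetical, and a real `.mtx` file also has header lines that would need to be skipped first.

```python
import io
from itertools import groupby


def iter_gene_counts(lines):
    """Yield (gene_id, {cell_id: count}) from rows sorted by gene_id.

    Only non-zero entries are present in the input, so a cell that is
    missing from a gene's dict implicitly has count 0.
    """
    rows = (line.split() for line in lines if line.strip())
    # groupby relies on the rows already being sorted by gene_id.
    for gene_id, group in groupby(rows, key=lambda row: row[0]):
        yield gene_id, {cell_id: int(count) for _, cell_id, count in group}


# Hypothetical toy input standing in for a real file handle.
example = io.StringIO(
    "geneA cell1 3\n"
    "geneA cell4 1\n"
    "geneB cell2 7\n"
)

sparse = dict(iter_gene_counts(example))
# Missing (gene, cell) pairs are treated as zero on lookup.
count_a2 = sparse["geneA"].get("cell2", 0)
```

Because each gene's rows are contiguous, only one gene's counts need to be held in memory at a time when iterating over the generator directly instead of collecting it into a `dict`.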
I think I'll be able to make a PR to support such sparse matrices, but if you're already working on something like that I don't want to step on your toes :)