Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues with scvelo #27

Open
panapapa14 opened this issue Oct 10, 2024 · 0 comments
Open

Issues with scvelo #27

panapapa14 opened this issue Oct 10, 2024 · 0 comments

Comments

@panapapa14
Copy link

Hello there. It has been more than a month now that I have been trying to run scvelo without positive outcome. The script that I use (after numerous trials) is the following:

import dask.array as da
import scanpy as sc
import pandas as pd
import matplotlib.pyplot as plt
import scvelo as scv
spliced = sc.read_mtx("/data/ppapadopoulos/scvelo/spliced_reads/spliced.mtx")
spliced.var_names = pd.read_csv("/data/ppapadopoulos/scvelo/spliced_reads/genes.txt", header=None)[0]
spliced.obs_names = pd.read_csv("/data/ppapadopoulos/scvelo/spliced_reads/barcodes.txt", header=None)[0]
unspliced = sc.read_mtx("/data/ppapadopoulos/scvelo/unspliced_reads/unspliced.mtx")
unspliced.var_names = pd.read_csv("/data/ppapadopoulos/scvelo/unspliced_reads/genes.txt", header=None)[0]
unspliced.obs_names = pd.read_csv("/data/ppapadopoulos/scvelo/unspliced_reads/barcodes.txt", header=None)[0]
common_genes = spliced.var_names.intersection(unspliced.var_names)
common_obs = spliced.obs_names.intersection(unspliced.obs_names)
spliced = spliced[common_obs, :][:, common_genes]
unspliced = unspliced[common_obs, :][:, common_genes]
adata = sc.AnnData(X=spliced.X, var=spliced.var, obs=spliced.obs)
adata.layers['unspliced'] = unspliced.X
adata.layers['spliced'] = spliced.X
adata = adata[::100, :]
scv.pp.filter_and_normalize(adata, min_counts=1, min_counts_u=1, min_cells=1, log=True)
sc.pp.neighbors(adata, n_neighbors=10, n_pcs=20)
scv.pp.moments(adata, n_pcs=30, n_neighbors=30)
scv.tl.recover_dynamics(adata, n_jobs=8)

I have access to a pc with 132GB RAM and 32 threads, although I try to find a memory efficient way of executing the process (several memory related errors). My main issues are obviously related to the last 3 commands. When I remove the duplicates from my samples, I finally get velocity vectors that are equal to zero. If I don't remove them, I get a corrupted neighbor graph that messes up the rest of the process. Even when everything seems to work, the scv.tl.recovery_dynamics fails after trying to be executed overnight.. If you have any advice, I would really appreciate it...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant