Is it suitable for whole gene? #37

326reborn · 2023-09-22T09:38:24Z

Is it suitable for full-length sequences?
When I converted the test_data to amino acid genotypes, it worked well. However, when I lengthened the test sequences to 242 aa, it stopped working.

I got this error at the SequenceSpace() step:
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 14.2 GiB for an array with shape (1912602624,) and data type float64
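
For reference, the size in the message follows directly from the reported shape, assuming the usual 8 bytes per float64 element. A quick check of that arithmetic (the variable names are only illustrative):

```python
# Reproduce the size reported in the error message.
n_elements = 1_912_602_624   # shape reported by numpy
BYTES_PER_FLOAT64 = 8        # a float64 occupies 8 bytes
GIB = 2 ** 30                # bytes in one GiB

float64_gib = n_elements * BYTES_PER_FLOAT64 / GIB
print(f"{float64_gib:.3f} GiB")  # prints "14.250 GiB", which numpy reports as 14.2 GiB
```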

I hope to use it to analyse random mutations in a gene that is 239 amino acids long.

lperezmo · 2023-09-22T10:50:31Z

A simple way around that would be to use a system with more RAM, since it looks like the program tried to hold the whole array in memory and ran out of space. Another option would be to store the array on disk instead of in memory using numpy's memmap. Using float32 for the array is another option, since it halves the size of each element. If you could share the code you used, that would help show where something like that could be implemented.
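
As a rough illustration of the memmap and float32 ideas, here is a minimal numpy-only sketch. It is not gpmap code: the file name, the stand-in array size, and the chunk size are made up for the example, and whether SequenceSpace can work with a disk-backed array is something I have not checked.

```python
import os
import tempfile

import numpy as np

# Stand-in size so the example runs quickly; the array in the error
# message has 1,912,602,624 elements.
n_elements = 1_000_000

# float32 uses 4 bytes per element instead of 8, halving the footprint.
print(np.dtype(np.float64).itemsize, np.dtype(np.float32).itemsize)  # prints "8 4"

# Create a disk-backed array (hypothetical file name in a temp directory).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "values.dat")
arr = np.memmap(path, dtype=np.float32, mode="w+", shape=(n_elements,))

# Fill it chunk by chunk so only one chunk needs to be built in RAM at a time.
chunk = 100_000
for start in range(0, n_elements, chunk):
    stop = min(start + chunk, n_elements)
    arr[start:stop] = np.arange(start, stop, dtype=np.float32)
arr.flush()  # make sure the data is written to disk

# The file holds 4 bytes per element and can be reopened read-only later.
on_disk_bytes = os.path.getsize(path)
reopened = np.memmap(path, dtype=np.float32, mode="r", shape=(n_elements,))
print(on_disk_bytes, reopened[12345])
```

This only helps if the code that consumes the array reads it in slices; an operation that copies the whole array at once would still need the full amount of RAM.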

326reborn · 2023-10-07T07:20:38Z

Hello Morales,
Thanks for your reply! I just used this code to build the space:

import pandas as pd
import numpy as np
import seaborn as sns
import holoviews as hv

import gpmap.src.plot as plot

from gpmap.src.inference import VCregression
from gpmap.src.space import SequenceSpace
from gpmap.src.randwalk import WMWSWalk

data=pd.read_csv('test_data.csv',sep=',',header=0)
space = SequenceSpace(X=data['genotypes'].values, y=data['phenotypes'].values)

test_data.csv

I think the long sequences may have caused the memory overflow.
By the way, some packages can't be imported because my scipy version is 1.10. I tried other available versions of scipy and it still didn't work, so I changed the imports in space.py and other scripts to forms like 'from scipy.sparse import csr_matrix'. I'm not sure whether this will introduce bugs.

Best,
Yu
