summary_identify.csv is missing #3

Open
Edison2021 opened this issue Oct 28, 2021 · 7 comments

Comments

@Edison2021

Edison2021 commented Oct 28, 2021

Hi Alex,
I ran a trial test and got the error below:
Command
python3.7 CRISPRloci_standalone.py -f NC_005230.fasta -output NC_005230.fasta.dir -st dna -cpu 32
Final output
summary crispr Example/NC_005230.fasta.dir/summary_crisp.csv
dirname cas Example/NC_005230.fasta.dir/tmp/output-Casboundary/predictions/
This file does not exist: Example/NC_005230.fasta.dir/summary_identify.csv

Best
Edison

@Edison2021
Author

In addition, there is another error reported:
Error: Unable to access jarfile CRISPRloci_webserver_visualization/CRISPRloci_visualization.jar

I searched the CRISPRloci folder but did not find a CRISPRloci_webserver_visualization folder.

Best
Edison

@niccw

niccw commented Mar 14, 2022

@Edison2021

I also encountered the same error. After checking, one of the biggest problems is that environments.yml is missing packages that are required for CRISPRidentify.

@Alexander-Mitrofanov
Collaborator

Thank you for reporting these problems.
The fixes will be applied at the beginning of April.
For the time being, please try the CRISPRloci web interface.

@JPegorino

In case it is helpful for bug fixes: at the end of April I'm getting what I think are related errors (running on a complete reference genome from NCBI), so I thought I'd copy them here:

cp: cannot stat '/home/ubuntu/volume2/pangenome/MGEs/CRISPR_Cas/test/tmp/output-CRISPRidentify/GCA_000008485/CRISPR*': No such file or directory
cp: cannot stat '/home/ubuntu/volume2/pangenome/MGEs/CRISPR_Cas/test/tmp/output-CRISPRidentify/GCA_000008485/Spacers*': No such file or directory
This file does not exist: /home/ubuntu/volume2/pangenome/MGEs/CRISPR_Cas/test/tmp/output-Casboundary/predictions/
Error: Unable to access jarfile /home/ubuntu/software/CRISPRloci-1.0.0/CRISPRloci_webserver_visualization/CRISPRloci_visualization.jar
summary crispr /home/ubuntu/volume2/pangenome/MGEs/CRISPR_Cas/test/summary_crisp.csv
dirname cas /home/ubuntu/volume2/pangenome/MGEs/CRISPR_Cas/test/tmp/output-Casboundary/predictions/
summary crispr /home/ubuntu/volume2/pangenome/MGEs/CRISPR_Cas/test/summary_crisp.csv
dirname cas /home/ubuntu/volume2/pangenome/MGEs/CRISPR_Cas/test/tmp/output-Casboundary
This file does not exist: /home/ubuntu/volume2/pangenome/MGEs/CRISPR_Cas/test/summary_identify.csv

Many thanks,
Jamie

@Alexander-Mitrofanov
Collaborator

Thank you for the feedback. I'm starting to work on the fixes.

@pjbiggs

pjbiggs commented Jul 4, 2022

Hi,
I want to run the code on hundreds of genomes. I am having a similar issue to those above, but I also see additional problems. When I ran the command for the first time, the 4 .tar.gz files were extracted for Casboundary, but the archives for CRISPRcasIdentifier were not found automatically. I extracted them manually and now have trained_models.tar.gz and HMM_sets.tar.gz in the CRISPRcasIdentifier folder along with their extracted folders. I also have the extracted folders in the root CRISPRloci-1.0.0 folder.
I am running the test command python3.7 CRISPRloci_standalone.py -f Example/NC_005230.fasta -st dna -output test1, which writes into a new results folder. The errors are below:

1. Run initial array detection
2. Refine detected arrays
3. Evaluate candidates
4. Enhance evaluated arrays
5. Complement arrays with additional info
Traceback (most recent call last):
  File "components/module_non_array_computations.py", line 164, in _calculate_strand
    st = StrandComputationNew(list_of_crisprs=self.list_of_crisprs_bona_fide)
  File "components/components_non_array_computations.py", line 111, in __init__
    self._compute_all_strands()
  File "components/components_non_array_computations.py", line 134, in _compute_all_strands
    with open("ResultsStrand/CRISPRstrand_Summary.tsv", "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'ResultsStrand/CRISPRstrand_Summary.tsv'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "CRISPRidentify.py", line 249, in <module>
    run_over_one_file(complete_path_file, folder_result, pickle_folder)
  File "CRISPRidentify.py", line 210, in run_over_one_file
    flag_dev_mode=FLAG_DEVELOPER_MODE)
  File "components/pipeline.py", line 32, in __init__
    self._run_non_crispr_computation()
  File "components/pipeline.py", line 75, in _run_non_crispr_computation
    flag_dev_mode=self.flag_dev_mode)
  File "components/module_non_array_computations.py", line 34, in __init__
    self._calculate_all_non_array_values()
  File "components/module_non_array_computations.py", line 46, in _calculate_all_non_array_values
    self._calculate_strand()
  File "components/module_non_array_computations.py", line 171, in _calculate_strand
    st = StrandComputation(list_of_crisprs=self.list_of_crisprs_bona_fide)
  File "components/components_non_array_computations.py", line 93, in __init__
    self._compute_all_strands()
  File "components/components_non_array_computations.py", line 98, in _compute_all_strands
    strand = get_orientation(consensus)
  File "components/components_non_array_computations.py", line 48, in get_orientation
    f = open("prediction", "r")
FileNotFoundError: [Errno 2] No such file or directory: 'prediction'
Error: Unable to access jarfile /home/pbiggs/software/CRISPRloci-1.0.0/CRISPRloci_webserver_visualization/CRISPRloci_visualization.jar
summary crispr /home/pbiggs/software/CRISPRloci-1.0.0/test1/summary_crisp.csv
dirname cas /home/pbiggs/software/CRISPRloci-1.0.0/test1/tmp/output-Casboundary/predictions/
This file does not exist: /home/pbiggs/software/CRISPRloci-1.0.0/test1/summary_identify.csv

I am running this in WSL2 using Ubuntu 20.04, and I have no issues with conda environments.
Any thoughts on how to solve this, please?
Thanks,
Patrick
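
For anyone who needs to repeat the manual extraction step described above, here is a minimal sketch in Python. It is not part of CRISPRloci: the helper name extract_archives is hypothetical, and it simply assumes the archives (for example trained_models.tar.gz and HMM_sets.tar.gz) sit directly inside the folder you pass in and should be unpacked next to themselves. Whether the pipeline then finds the extracted folders is not something this thread confirms.

```python
import tarfile
from pathlib import Path


def extract_archives(folder):
    """Extract every .tar.gz archive found directly inside `folder`.

    Hypothetical helper, not part of CRISPRloci.
    Returns the names of the archives that were extracted, sorted.
    """
    folder = Path(folder)
    extracted = []
    for archive in sorted(folder.glob("*.tar.gz")):
        with tarfile.open(archive, "r:gz") as tar:
            # Unpack next to the archive, like a manual `tar -xzf` in place.
            tar.extractall(path=folder)
        extracted.append(archive.name)
    return extracted


if __name__ == "__main__":
    # Folder name taken from the comment above; adjust to your checkout.
    print(extract_archives("CRISPRcasIdentifier"))
```

Only run this on archives you trust, since extractall writes whatever paths the archive contains.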

@Old-Green-Man

Old-Green-Man commented Aug 9, 2022

Regarding the Readme example: as pjbiggs mentioned above, the tar archives in CRISPRcasIdentifier are not being auto-extracted as the Readme indicates.

Also, the example:

python3.7 CRISPRloci_standalone.py -f Example/NC_005230_proteins.fasta -st protein

should be

python3.7 CRISPRloci_standalone.py -f Example/NC_005230_proteins.fa -st protein

I'm also not seeing Example/Input3.fa for the virus example.
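
Several of the errors in this thread come down to a file or folder not being where the script expects it. A quick pre-flight check can show which of the paths mentioned here are actually present in a checkout before a long run starts. This is only a sketch: missing_paths is a hypothetical helper, the path list is copied from the comments above, and it assumes you run it from the root of the CRISPRloci folder.

```python
from pathlib import Path


def missing_paths(root, relative_paths):
    """Return the entries of `relative_paths` that do not exist under `root`.

    Hypothetical helper, not part of CRISPRloci.
    """
    root = Path(root)
    return [p for p in relative_paths if not (root / p).exists()]


if __name__ == "__main__":
    # Paths quoted in this thread; "." assumes the current directory is the
    # CRISPRloci checkout.
    expected = [
        "Example/NC_005230.fasta",
        "Example/NC_005230_proteins.fa",
        "Example/Input3.fa",
        "CRISPRloci_webserver_visualization/CRISPRloci_visualization.jar",
    ]
    for path in missing_paths(".", expected):
        print("missing:", path)
```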
