SortMeRNA: update version 4.3.6 #1316

gallardoalba · 2023-06-27T15:51:37Z

Main changes:

- Log parameter removed (it is not available in the new version)
- Merge/unmerge bash scripts replaced by equivalent functionality (--out2)
- New option included: --sout (separate paired and singleton aligned reads)
- indexdb_rna removed (its functionality is included in the main program)
- Added the possibility of using interleaved reads (-paired option)

bernt-matthias · 2023-08-22T14:25:27Z

Cool. One of my users just asked for an update. Can I help here?

gallardoalba · 2023-08-22T14:50:40Z

> Cool. One of my users just asked for an update. Can I help here?

Tomorrow I'll continue working on it; I'll ping you if I find any problem.

gallardoalba · 2023-08-23T13:37:46Z

This is the problem I found, which temporarily stalled the PR, @bernt-matthias: apparently SortMeRNA generates an alignment in the temporary folder, and Galaxy tries to index it without success, generating this error:

I tried to specify the path of this folder in order to provide an adequate extension, but I don't think it is possible.

bernt-matthias · 2023-08-23T13:39:33Z

Can you check if the file is empty?

gallardoalba · 2023-08-23T14:19:49Z

> Can you check if the file is empty?

You are right, this is indeed the problem. I'll try to find a better input file.

bgruening · 2023-08-31T00:37:51Z

. ERROR: Test 1: Found output tag with unknown name [output_fastx], valid names ['aligned', 'aligned_forward', 'aligned_reverse', 'aligned_forward_singleton', 'aligned_reverse_singleton', 'unaligned', 'unaligned_forward', 'unaligned_reverse', 'unaligned_forward_singleton', 'unaligned_reverse_singleton', 'output_bam', 'output_blast', 'output_biom', 'output_de_novo']
.. ERROR: Test 2: Found output tag with unknown name [output_fastx], valid names ['aligned', 'aligned_forward', 'aligned_reverse', 'aligned_forward_singleton', 'aligned_reverse_singleton', 'unaligned', 'unaligned_forward', 'unaligned_reverse', 'unaligned_forward_singleton', 'unaligned_reverse_singleton', 'output_bam', 'output_blast', 'output_biom', 'output_de_novo']
.. ERROR: Test 5: Found output tag with unknown name [aligned_paired], valid names ['aligned', 'aligned_forward', 'aligned_reverse', 'aligned_forward_singleton', 'aligned_reverse_singleton', 'unaligned', 'unaligned_forward', 'unaligned_reverse', 'unaligned_forward_singleton', 'unaligned_reverse_singleton', 'output_bam', 'output_blast', 'output_biom', 'output_de_novo']
.. ERROR: Test 5: Found output tag with unknown name [unaligned_paired], valid names ['aligned', 'aligned_forward', 'aligned_reverse', 'aligned_forward_singleton', 'aligned_reverse_singleton', 'unaligned', 'unaligned_forward', 'unaligned_reverse', 'unaligned_forward_singleton', 'unaligned_reverse_singleton', 'output_bam', 'output_blast', 'output_biom', 'output_de_novo']
.. CHECK: 9 test(s) found.
Applying linter output... CHECK
.. INFO: 14 outputs found.
Applying linter inputs... WARNING
.. WARNING: Param input [num_alignments] 'name' attribute is redundant if argument implies the same name.

For the linting. Sorry for such a messy tool :(
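
The four errors above all have the same shape: a test refers to an output name that the tool does not declare. As a rough illustration of that kind of check (a minimal sketch, not planemo's actual implementation), the following compares the names used inside `<test>` blocks with the names declared under `<outputs>`. The tool XML is a hypothetical, stripped-down stand-in, not the real sortmerna.xml, and `unknown_test_outputs` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal tool definition, for illustration only.
TOOL_XML = """
<tool id="example" name="example" version="0.1">
  <outputs>
    <data name="aligned" format="fastq"/>
    <data name="unaligned" format="fastq"/>
  </outputs>
  <tests>
    <test>
      <output name="aligned"/>
      <output name="output_fastx"/>
    </test>
    <test>
      <output name="unaligned"/>
    </test>
  </tests>
</tool>
"""


def unknown_test_outputs(tool_xml):
    """Return (test_number, output_name) pairs for undeclared outputs."""
    root = ET.fromstring(tool_xml)
    # Names of every element declared directly under <outputs>.
    declared = {el.get("name") for el in root.find("outputs")}
    problems = []
    for number, test in enumerate(root.iter("test"), start=1):
        for el in test:
            if el.tag in ("output", "output_collection"):
                if el.get("name") not in declared:
                    problems.append((number, el.get("name")))
    return problems


for number, name in unknown_test_outputs(TOOL_XML):
    print(f"Test {number}: unknown output name [{name}]")
```

Renaming the output in the test (or restoring the declaration under `<outputs>`) makes the returned list empty.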

gallardoalba · 2023-08-31T06:47:29Z

It should be fine now; some scripts were removed (e.g. merge-paired-reads.sh and unmerge-paired-reads.sh) and replaced by equivalent functionality.

tools/rna_tools/sortmerna/sortmerna.xml

bgruening · 2023-08-31T07:43:28Z

@gallardoalba a profile version enables an own HOME dir for every job.

gallardoalba · 2023-08-31T07:44:09Z

> @gallardoalba a profile version enables an own HOME dir for every job.

Perfect, thanks for including it.

bernt-matthias

Quite a large update. Good work. Here are a few comments from my side.

bernt-matthias · 2023-08-31T07:20:07Z

tools/rna_tools/sortmerna/sortmerna.xml

+ </conditional>
+ <param name="strand_search" value="" />
+ <conditional name="databases_type">
+ <param name="databases_selector" value="history" />


Would be great to have a test for the cached case.

gallardoalba · 2023-08-31

Yes, I need to review it, because the test data tables do not seem to be available in either the previous version or the update.

bernt-matthias · 2023-08-31T07:49:55Z

tools/rna_tools/sortmerna/macros.xml

+ $ref.append('%s' % $db )
+ #end for
+ #else
+ #for $db in $databases_type.input_databases.fields.path.split(",")


So the idea here is that a single entry is selected and split by comma? What does this look like in practice?

I wonder if making the cached case multiple="true" would be an option. It might be more flexible (backward compatibility might be a bit tricky, but not very).
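
To make the question concrete, here is a plain-Python sketch of what the Cheetah loops in this PR appear to do, assuming the `path` field arrives as one comma-joined string. `ref_arguments` and `command_fragment` are hypothetical helper names, the example paths are made up, and the whitespace stripping is an addition for the sketch rather than something the wrapper does.

```python
import shlex


def ref_arguments(path_field):
    """Split a comma-joined path string and emit one --ref per entry."""
    # Assumption: several selected references arrive as "a.fasta,b.fasta".
    refs = [part.strip() for part in path_field.split(",") if part.strip()]
    args = []
    for ref in refs:
        args.extend(["--ref", ref])
    return args


def command_fragment(path_field):
    """Render the arguments as a shell-safe command-line fragment."""
    return " ".join(shlex.quote(arg) for arg in ref_arguments(path_field))


# Hypothetical paths, for illustration only.
print(command_fragment("/db/silva-bac-16s.fasta,/db/silva-arc-16s.fasta"))
```

With a single selected entry the split is a no-op and exactly one `--ref` is produced, which may be why the single-reference case behaves as expected.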

bernt-matthias · 2023-08-31T07:57:30Z

tools/rna_tools/sortmerna/sortmerna.xml

+ #for $i, $reference in enumerate($ref)
+ --ref '$reference'
+ #end for
+ #if str( $databases_type.databases_selector ) != 'cached'


Regarding the cached / cached_to_index question. This seems to happen now in sortmerna.

But it's still not clear how this works in practice, because for both options the user selects from the same data table?

gallardoalba · 2023-08-31

I don't completely understand it either. The reference FASTA and its index can be provided this way: `./sortmerna --ref ./rRNA_databases/silva-bac-16s-id90.fasta,./index/silva-bac-16s-db`

Perhaps @bebatut could have a look, since she wrote the data manager. Otherwise I would propose keeping this section unchanged. I think it should still work properly.

gallardoalba · 2023-09-05T12:50:19Z

Do you think it could be merged @bernt-matthias? I would like to test if it works with the installed indexed genomes.

bernt-matthias · 2023-09-05T14:47:07Z

> I would like to test if it works with the installed indexed genomes.

Would be cool to have a test. Let me know if the PR is ready from your side and I will review and merge.

gallardoalba · 2023-09-21T13:27:35Z

> > I would like to test if it works with the installed indexed genomes.
>
> Would be cool to have a test. Let me know if the PR is ready from your side and I will review and merge.

Hi @bernt-matthias, I'm trying to create the test for the database, but I'm not sure how to create the file structure. According to https://github.com/bgruening/galaxytools/blob/master/data_managers/data_manager_sortmerna_database_downloader/data_manager/data_manager_sortmerna_download.py#L122 it seems to be fine, but I don't know why the tool is not able to recognize it. Would you mind having a look? Thanks a lot!

bernt-matthias · 2023-10-16T12:18:27Z

Hi @gallardoalba what is the state here?

> Would you mind having a look?

What exactly should I look at? Is there a failing test that I could examine?

bernt-matthias · 2023-10-18T14:53:55Z

Will add a test for cached data. Wondering if the loops are correct, i.e. in

galaxytools/tools/rna_tools/sortmerna/sortmerna.xml

Line 70 in dfa4414

#for $db in $databases_type.input_databases.fields.path.split(",")

we loop over a list derived from a comma-separated string, but we actually have a select with multiple="true".

bernt-matthias · 2023-10-18T15:54:10Z

I get the impression that the use of (multiple?) cached references was already wrong in 2.1. But I guess most of the time a single one is used. The docs state

      --ref             STRING,STRING   FASTA reference file, index file                               mandatory
                                         (ex. --ref /path/to/file1.fasta,/path/to/index1)
                                         If passing multiple reference files, separate 
                                         them using the delimiter ':',
                                         (ex. --ref /path/to/file1.fasta,/path/to/index1:/path/to/file2.fasta,path/to/index2)

But Galaxy just executes with --ref REF1,REF2,REF3,....
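
The mismatch can be shown with a short sketch. It follows only the help text quoted above (FASTA and index joined by `,`, pairs joined by `:`), which describes the older command line; whether 4.3.6 still accepts this form is not established here. `format_ref_option` and `naive_join` are hypothetical names, and the paths are the ones from the quoted example.

```python
def format_ref_option(pairs):
    """Build the --ref value from (fasta, index) pairs as the quoted help describes."""
    if not pairs:
        raise ValueError("at least one (fasta, index) pair is required")
    return ":".join(f"{fasta},{index}" for fasta, index in pairs)


def naive_join(paths):
    """What a plain comma join produces: pair boundaries are lost."""
    return ",".join(paths)


pairs = [
    ("/path/to/file1.fasta", "/path/to/index1"),
    ("/path/to/file2.fasta", "path/to/index2"),
]
print("--ref", format_ref_option(pairs))
print("--ref", naive_join([fasta for fasta, _ in pairs]))
```

In the second form nothing distinguishes a second reference from the index of the first, which is consistent with the suspicion that multiple cached references never worked as intended.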

Also the indexdb (actually indexdb_rna) executed with the datamanager is not used anymore. I guess we can / should ignore the indexes created by the data manager.

bernt-matthias · 2023-10-18T18:21:14Z

Hi @bgruening .. I was still fixing bugs and adding tests wrt references. I stopped CI, but feel free to restart it if you need the current state.

I will open a followup PR.

Follow-up on bgruening#1316, which was not deployed:

- Use the same cached data for cached and cached_to_index, i.e. they now differ only in that the latter uses the dbprep macro. If I get it right, the data manager previously precomputed indexes which could be used; this no longer seems possible. I suggest leaving the data manager untouched (then the provided data will also work for old SortMeRNA versions).
- Fix usage of cached data (it did not work for multiple provided values) and add tests.

* sortmerna: finish update (follow-up on #1316, with the description above)
* add missing test file
* cached references are also not optional: selects default to optional="true", which should not apply here. Checkboxes therefore do not work either.
* eliminate cached_to_index option

gallardoalba added 4 commits June 23, 2023 15:05

Update_files

c65dc08

Merge branch 'master' of https://github.com/bgruening/galaxytools

6e8bda6

First commit

6313124

Remove file

603743a

gallardoalba added 3 commits August 24, 2023 17:02

Update files

9601808

Update PR

27d51fb

Update wrapper

dcc2a4f

gallardoalba marked this pull request as ready for review August 30, 2023 17:11

gallardoalba added 3 commits August 30, 2023 19:23

Remove empty files

7c40e16

Change extension

31a15b9

Remove unnecesary file

47d13c8

gallardoalba requested review from bebatut and bernt-matthias August 30, 2023 17:25

gallardoalba added 2 commits August 31, 2023 08:43

Update tests

8ec6e35

Update wrapper

e029e31

bgruening reviewed Aug 31, 2023

gallardoalba and others added 4 commits August 31, 2023 08:52

Remove comments

4207dd3

Update wrapper

c766d44

Include KVDB path

4f0e976

use a profile version

eb99f5d

bernt-matthias reviewed Aug 31, 2023

gallardoalba added 8 commits September 1, 2023 14:17

Modify tests

91df30f

Remove extra output

6834ee7

Add file

634461f

Update test

fcf9c98

Add changes

673f07d

Fix error

d1a2afb

Add ftype

2ea8802

Increse line_diff

7a2786f

bgruening approved these changes Sep 5, 2023

Add_database

996203a

sortmerna: add test for data manager

9036288

bernt-matthias added 3 commits October 18, 2023 16:55

remove yield

0f86f66

fix python linting

c7a0985

add profile to data manager

1de1e69

otherwise tests do not use container

bernt-matthias added 3 commits October 18, 2023 17:57

more linter fixes

847afec

fix bam output

aa8100f

tool writes to `$output_bam`

simplify BAM sorting

d622e47

bgruening approved these changes Oct 18, 2023

bgruening merged commit 0e5cc6f into bgruening:master Oct 18, 2023
9 of 11 checks passed

bernt-matthias mentioned this pull request Oct 19, 2023

sortmerna: finish update #1338

Merged
