How does "merge_kingdoms" mentioned in the "build_db_and_run.sh" script work? #31

ecalfapietra · 2023-03-16T09:09:37Z

Hello,

I'm trying to use the STAT tools to build a database from FASTA sequences, and then use it for metagenomic/taxonomic analyses.
So I'm following the tutorial in the build_db_and_run.sh script.
It says that we can do the identify_tax_ids part in multiple instances, but if we do, we have to use the tool called merge_kingdoms to combine results into a single file.
My problem is that there is no information about how to use this tool.
The tool's help output is: need <tax.parents>
I don't understand what I should put in each argument (except for tax.parents).

Also, I'm using the default parameters:
KMER_LEN=32
DENSE_WINDOW=4 # 1 kmer of 4 for dense db (just for example)
SPARSE_WINDOW=128 # 1 kmer of 128 for sparse db (just for example)
But I don't know whether I really should.

Same question for MAX_KMER_DICTIONARY_SIZE=5000000 # This number should be roughly as max kmers expected * 2.
I don't really know how I could know the maximum number of kmers expected.
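
One rough way to get a bound, offered only as a sketch: a sequence of length L contains at most L - KMER_LEN + 1 k-mers, and keeping about one k-mer per window (as the script comments describe) divides that count by the window size. The function names below are hypothetical and not part of STAT, and this assumes nothing about how STAT actually selects or deduplicates k-mers, or whether the dictionary limit applies to the whole database or to a smaller unit, so the result can only be read as a loose upper bound.

```python
import math


def read_fasta_lengths(lines):
    """Yield the length of each sequence from an iterable of FASTA lines."""
    length = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def kmer_upper_bound(seq_lengths, kmer_len=32, window=4):
    """Upper bound on kept k-mers, assuming about one k-mer per `window` positions."""
    total = 0
    for length in seq_lengths:
        # Sequences shorter than kmer_len contribute no k-mers.
        n_kmers = max(length - kmer_len + 1, 0)
        total += math.ceil(n_kmers / window)
    return total


# Tiny in-memory FASTA used only for illustration.
fasta = [">seq1", "ACGT" * 25, ">seq2", "ACGT" * 10]
lengths = list(read_fasta_lengths(fasta))
bound = kmer_upper_bound(lengths, kmer_len=32, window=4)

# The script comment suggests roughly twice the expected maximum.
suggested = 2 * bound
print(lengths, bound, suggested)
```

For a real input, the same functions could be fed an open file handle instead of the in-memory list, with `window` set to DENSE_WINDOW or SPARSE_WINDOW depending on which database is being built.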

Thank you in advance!

tolot27 · 2024-05-13T08:59:00Z

@ecalfapietra Did you build your db successfully? I'm looking for the same parameters to build a RefSeq k-mer db.

multikengineer self-assigned this May 13, 2024
