JSON output file

martinghunt edited this page Jan 24, 2022 · 2 revisions

This page describes the contents of the file run_info.json, which is written when you run viridian assemble. It has two main sections: run_summary and amplicons.

run_summary

This contains high level information on the run. It looks like this:

"run_summary": {
    "command": "viridian assemble --bam reads.bam illumina ref.fasta amplicons.json outdir",
    "consensus": "AGATCT .... etc",
    "cwd": "/path/where/viridian/was/run",
    "end_time": "2021-12-13T10:54:42",
    "finished_running": true,
    "hostname": "myhost",
    "made_consensus": true,
    "options": {
      "amplicons_json": "amplicons.json",
      ... all other command line options ...
    },
    "run_time": "0:00:20.643091",
    "start_time": "2021-12-13T10:54:21",
    "successful_amplicons": 97,
    "total_amplicons": 98,
    "version": "0.1.0",
    "consensus_length": 30000,
    "consensus_N_count": 42,
    "amplicon_success": {
        "amplicon1": true,
        "amplicon2": false,
        "amplicon3": true,
        ... etc for each amplicon ...
    } 
}

Most of the contents should be self-explanatory. Here is an explanation of some of the entries:

  • finished_running - this is set to false at the start of the run and changed to true when the pipeline reaches the end. If it is still false, then the run did not complete, and there was likely an error message in your terminal.
  • made_consensus - this is true or false, depending on whether a final consensus sequence was successfully made.
  • consensus - this is the final consensus sequence made by Viridian. It is the same as the sequence written to the file consensus.final_assembly.fa.
  • successful_amplicons - the number of amplicons for which Viridian made a consensus sequence, out of the total number of amplicons given by total_amplicons.
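
As an illustration of how these flags might be checked in a script, here is a minimal Python sketch. It assumes that run_summary is a top-level key of run_info.json, as the description above suggests. The function names load_run_summary and describe_run are made up for this example and are not part of Viridian.

```python
import json


def load_run_summary(path):
    """Return the run_summary section of a run_info.json file.

    Assumes "run_summary" is a top-level key of the JSON document.
    """
    with open(path) as f:
        return json.load(f)["run_summary"]


def describe_run(summary):
    """Turn the status flags in run_summary into a one-line verdict."""
    # .get() is used defensively, in case a run that stopped early
    # did not write every key.
    if not summary.get("finished_running", False):
        return "run did not finish; check the terminal output for an error"
    if not summary.get("made_consensus", False):
        return "run finished but no consensus sequence was made"
    good = summary["successful_amplicons"]
    total = summary["total_amplicons"]
    return f"consensus made; {good}/{total} amplicons succeeded"


# Toy input using the values from the example on this page.
example = {
    "finished_running": True,
    "made_consensus": True,
    "successful_amplicons": 97,
    "total_amplicons": 98,
}
print(describe_run(example))  # consensus made; 97/98 amplicons succeeded
```

For a real run, something like describe_run(load_run_summary("outdir/run_info.json")) would do the same check, where the path is whatever output directory was given to viridian assemble.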

amplicons

The amplicons section of the JSON output is a list of dictionaries - one dictionary for each amplicon. One amplicon entry looks like this:

{
  "assemble_success": true,
  "start": 31,
  "end": 410,
  "left_primer_length": 24,
  "right_primer_length": 25,
  "name": "nCoV-2019_1_pool1",
  "polish_data": {
    "Comments": [],
    "Coverage for polishing": 151.43,
    "Polish success": true,
    "Reads matching": 1278,
    "Reads matching forward strand": 640,
    "Reads matching reverse strand": 638,
    "Reads used": 192
  },
  "polished_seq": "ACCAAC ... etc",
  "polished_masked_seq": "ACCAAN ... etc"
},

The important entries (the rest are really for debugging) are:

  • assemble_success - true if a consensus sequence was successfully made for this amplicon.
  • polished_masked_seq - this is the final consensus sequence for this amplicon, as used in the assembly pipeline. The polished masked sequences from all amplicons are stitched together to make the final consensus sequence.
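
The two entries above are enough for a quick per-amplicon check. The Python sketch below takes the amplicons list (already loaded from the JSON) and reports which amplicons failed and how many masked bases each successful one contains. It assumes that masked positions appear as N, as in the example entry, and it allows for polished_masked_seq being absent from a failed amplicon. The function name amplicon_report is made up for this example.

```python
def amplicon_report(amplicons):
    """Summarise a list of amplicon dictionaries.

    Returns a tuple: (names of failed amplicons,
    {name: number of N bases in polished_masked_seq}).
    """
    failed = []
    n_counts = {}
    for amp in amplicons:
        name = amp["name"]
        if not amp["assemble_success"]:
            failed.append(name)
            continue
        # Assumption: masked positions are written as "N".
        seq = amp.get("polished_masked_seq") or ""
        n_counts[name] = seq.upper().count("N")
    return failed, n_counts


# Toy input: the names and sequences are invented for illustration.
toy_amplicons = [
    {"name": "amp1", "assemble_success": True, "polished_masked_seq": "ACCAAN"},
    {"name": "amp2", "assemble_success": False},
]
print(amplicon_report(toy_amplicons))  # (['amp2'], {'amp1': 1})
```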