📚 update docs for HLA T1k
dmiller15 committed Jun 11, 2024
1 parent 3620e9b commit 68683a0
Showing 2 changed files with 14 additions and 1 deletion.
12 changes: 11 additions & 1 deletion README.md
@@ -210,6 +210,13 @@ Kids First favors setting/overriding defaults with "arriba-heavy" specified in [
rmats_ram: {type: 'int?', doc: "GB of RAM to allocate to RMATs."}
```
### T1k
```yaml
run_t1k: { type: 'boolean?', default: true, doc: "Set to false to disable T1k HLA typing" }
hla_rna_ref_seqs: { type: 'File?', doc: "FASTA file containing the HLA allele reference sequences for RNA." }
hla_rna_gene_coords: { type: 'File?', doc: "FASTA file containing the coordinates of the HLA genes for RNA." }
```
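As a rough illustration of how these three inputs might be supplied, the sketch below builds the T1k portion of a CWL job-inputs object in Python and prints it as JSON. The helper name `t1k_job_inputs` and the `ref_dir` location are hypothetical. The file names are the ones given for `hla_rna_ref_seqs` and `hla_rna_gene_coords` in this README. Leaving the two optional files out when `run_t1k` is false is an assumption, not documented behavior.

```python
import json
from pathlib import PurePosixPath


def t1k_job_inputs(ref_dir, run_t1k=True):
    """Build the T1k-related entries of a CWL job-inputs object.

    Hypothetical helper: ref_dir is wherever the reference files were staged.
    """
    inputs = {"run_t1k": run_t1k}
    if run_t1k:
        ref = PurePosixPath(ref_dir)
        # File names as listed in the reference notes of this README.
        inputs["hla_rna_ref_seqs"] = {
            "class": "File",
            "path": str(ref / "hla_v3.43.0_gencode_v39_rna_seq.fa"),
        }
        inputs["hla_rna_gene_coords"] = {
            "class": "File",
            "path": str(ref / "hla_v3.43.0_gencode_v39_rna_coord.fa"),
        }
    # Assumption: with run_t1k false, the optional File inputs are simply omitted.
    return inputs


if __name__ == "__main__":
    print(json.dumps(t1k_job_inputs("/path/to/references"), indent=2))
```

The resulting entries would be merged with the rest of the workflow inputs before the job file is handed to a CWL runner.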
### Run:
1) Reads inputs:
@@ -252,6 +259,8 @@ groups"`. See the STAR documentation on `outSAMattrRGline` for complete details.
- `RSEMgenome`: RSEM_GENCODE39.tar.gz, built using the `reference_fasta` and `gtf_anno`, following `GENCODE` instructions from [here](https://deweylab.github.io/RSEM/README.html), then creating a tar ball of the results.
- `STARgenome`: STAR_2.7.10a_GENCODE39.tar.gz, created using the star_2.7.10a_genome_generate.cwl tool, using the `reference_fasta`, `gtf_anno`, and setting `sjdbOverhang` to 100
- `kallisto_idx`: RSEM_GENCODE39.transcripts.kallisto.idx, built from the RSEM GENCODE 39 transcript FASTA in the `RSEMgenome` tar ball, following instructions from [here](https://pachterlab.github.io/kallisto/manual)
- `hla_rna_ref_seqs`: hla_v3.43.0_gencode_v39_rna_seq.fa, created using https://github.com/mourisl/T1K/blob/master/t1k-build.pl with [hla.dat v3.43.0](http://ftp.ebi.ac.uk/pub/databases/ipd/imgt/hla/hla.dat) and [GENCODE v39 primary assembly GTF](https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_39/gencode.v39.primary_assembly.annotation.gtf.gz)
- `hla_rna_gene_coords`: hla_v3.43.0_gencode_v39_rna_coord.fa, created using https://github.com/mourisl/T1K/blob/master/t1k-build.pl with [hla.dat v3.43.0](http://ftp.ebi.ac.uk/pub/databases/ipd/imgt/hla/hla.dat) and [GENCODE v39 primary assembly GTF](https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_39/gencode.v39.primary_assembly.annotation.gtf.gz)

6) rMATS requires the length of the reads in the sample. This workflow estimates the read length by polling a sample of reads. To override the estimate, set `read_length_median` to the desired read length. There is also an `rmats_variable_read_length` boolean for samples whose reads are not uniform in length; the workflow sets it to true if it observes multiple read lengths while polling. As with read length, a user-provided value overrides the estimate.
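The read-length polling described in step 6 can be sketched roughly as follows. This is not the workflow's actual implementation: the function names, the FASTQ input, and the 10,000-read sample size are assumptions made for illustration. Only `read_length_median` and `rmats_variable_read_length` come from this README.

```python
import gzip
import itertools
import statistics


def poll_read_lengths(fastq_path, max_reads=10000):
    """Return the lengths of up to max_reads reads from a FASTQ file (hypothetical helper)."""
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        # FASTQ records span four lines; the sequence is the second line of each.
        seq_lines = itertools.islice(handle, 1, max_reads * 4, 4)
        return [len(line.strip()) for line in seq_lines]


def estimate_read_settings(lengths, read_length_median=None,
                           rmats_variable_read_length=None):
    """Estimate the median read length and variability; user-provided values win."""
    if not lengths:
        raise ValueError("no reads were polled")
    estimated_median = int(statistics.median(lengths))
    estimated_variable = len(set(lengths)) > 1
    median = read_length_median if read_length_median is not None else estimated_median
    variable = (rmats_variable_read_length
                if rmats_variable_read_length is not None else estimated_variable)
    return median, variable
```

For example, `estimate_read_settings(poll_read_lengths("sample_R1.fastq.gz"))` would return the estimated pair, while passing `read_length_median=150` would keep the user's value and only estimate the variable-length flag. The file name here is a placeholder.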

@@ -293,6 +302,7 @@ groups"`. See the STAR documentation on `outSAMattrRGline` for complete details.
doc: "Retained introns JC.txt output from RMATs containing only those calls with 10 or more junction spanning read counts of support"}
rmats_filtered_skipped_exons_jc: {type: 'File', outputSource: rmats/filtered_skipped_exons_jc,
doc: "Skipped exons JC.txt output from RMATs containing only those calls with 10 or more junction spanning read counts of support"}
t1k_genotype_tsv: {type: 'File?', outputSource: t1k/genotype_tsv, doc: "Genotyping results from T1k" }
```

## Reference build notes:
@@ -306,4 +316,4 @@ groups"`. See the STAR documentation on `outSAMattrRGline` for complete details.
# [Kids First STAR Diploid Beta](docs/STAR_2.7.11b_DIPLOID.md)
This is an alternative alignment and quantification method currently in beta phase.
It uses a patient's DNA variant calls to create a "personal genome" for improved alignment.
See doc linked in section header.
See doc linked in section header.
3 changes: 3 additions & 0 deletions workflow/kfdrc_RNAseq_workflow.cwl
@@ -705,6 +705,9 @@ steps:
gene_coordinates: hla_rna_gene_coords
preset:
valueFrom: "hla"
output_basename:
source: output_basename
valueFrom: $(self).t1k_hla
skip_post_analysis:
valueFrom: $(1 == 1)
out: [genotype_tsv]