ASE for Synonymous + Non-synonymous Variants #93

JPFinnigan · 2018-03-09T19:34:12Z

The current implementation outputs VAR and REF read counts for non-synonymous variants only. It would be great, as a user, to have the option to output read-support counts for all variants. I've used Varlens to get around this current limitation in Isovar, but that route has its own limitations, which I'll discuss below.

Per a conversation w/ Alex:

Hey John,

I looked a little bit and found that on line 67 of isovar.effect_prediction I'm doing the following:

nonsynonymous_coding_effects = effects.drop_silent_and_noncoding()

Do you want me to make this optional for the purposes of counting variant reads and assembling variant sequences?

If so, can you file an issue on the repo? https://github.com/openvax/isovar/issues

Eliminating the hard filter for non-synonymous variants affords the user a bit of added flexibility, but would necessitate additional descriptors for each variant to enable filtering to variant classes of interest. I think two additional columns, "Effect_Class" and "Effect" would solve the filtering problem and make working with the isovar output relatively easy.

I believe two columns may be required largely because of my experience working with Varlens. The Varlens output has an "effect" column that describes the specific coding effect of a variant (e.g. "p.G12D"). However, I've found this difficult to work with in practice because, AFAIK, there is no easy way to tell apart non-synonymous SNVs ("p.G12D"), in-frame indels ("p.HDVPS811del") and frameshifts ("p.A117fs") from the string alone. It may be better to have a column for the effect class ("Exon, non-synonymous") that is separate from the descriptor of the specific effect ("p.G12D").
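To illustrate the parsing problem, here is a minimal sketch of the kind of pattern matching a user currently has to write to recover a class from the effect string. The function name, the labels, and the regular expressions are all hypothetical; they are not part of Varlens or Isovar, and they only cover the three example strings mentioned above.

```python
import re

# Hypothetical patterns for illustration only. Real effect descriptions
# come in more forms than the three handled here.
_EFFECT_PATTERNS = [
    ("frameshift", re.compile(r"^p\.[A-Z*]+\d+fs$")),
    ("in-frame deletion", re.compile(r"^p\.[A-Z]+\d+del$")),
    ("substitution", re.compile(r"^p\.[A-Z]\d+[A-Z]$")),
]


def classify_effect(description):
    """Return a coarse label for an effect string such as 'p.G12D'."""
    for label, pattern in _EFFECT_PATTERNS:
        if pattern.match(description):
            return label
    return "other"


for example in ["p.G12D", "p.HDVPS811del", "p.A117fs"]:
    print(example, "->", classify_effect(example))
```

Every new kind of effect needs another pattern, which is why a dedicated effect-class column would be easier to filter on than the descriptor itself.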

Ideally an effect class column would provide the same filtering as the current hard-coded isovar filters, or use the standard Ensembl classes.

3' UTR
5' UTR
exonic-splice-site
Incomplete
Intergenic
Intragenic
Intronic
intronic-splice-site
non-coding-transcript
Silent
splice-acceptor
Splice-donor
Stop-loss
Stop-gain
Exon, Non-synonymous

The specific use case I have in mind is counting the number of variants, the number of variants with RNA read support, and finally how the latter category breaks down by variant type (e.g. SNV, SNV w/ coding effect, indel, etc.).
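As a rough sketch of that use case, assuming the output carried an effect-class value and an alt-read count per variant, the summary could be computed as follows. The record layout, the field names ("effect_class", "alt_reads"), and the example rows are made up for illustration and do not reflect the actual Isovar output.

```python
from collections import Counter

# Hypothetical records: field names and values are invented for illustration.
records = [
    {"variant": "chr1:100 A>T", "effect_class": "Exon, Non-synonymous", "alt_reads": 12},
    {"variant": "chr1:250 G>A", "effect_class": "Silent", "alt_reads": 3},
    {"variant": "chr2:400 C>T", "effect_class": "Intronic", "alt_reads": 0},
    {"variant": "chr3:75 delAG", "effect_class": "Exon, Non-synonymous", "alt_reads": 5},
]


def summarize(records, min_alt_reads=1):
    """Count all variants, those with RNA support, and supported ones per class."""
    supported = [r for r in records if r["alt_reads"] >= min_alt_reads]
    by_class = Counter(r["effect_class"] for r in supported)
    return len(records), len(supported), by_class


total, n_supported, by_class = summarize(records)
print("variants:", total)
print("with RNA read support:", n_supported)
for effect_class, count in sorted(by_class.items()):
    print(" ", effect_class, count)
```

The min_alt_reads threshold is also an assumption; what counts as "read support" would be up to the user.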

iskandr · 2018-03-12T16:56:28Z

Hey @JPFinnigan,

This could work how you'd like with very few changes. Do you, by any chance, have a test dataset of a few variants and their supporting RNA reads, along with expected counts and annotations? If not, I can make that myself but it would speed things up a little bit.

JPFinnigan · 2018-03-14T03:40:42Z

Hey @iskandr, that shouldn't be a problem. I'll send the materials to you tomorrow morning.

Take care! And thank you for working on this

iskandr · 2018-05-01T21:20:22Z

Hey @JPFinnigan -- sorry it took me a long time to get to this, starting to look at it now.

iskandr mentioned this issue May 16, 2019

Major rewrite to enable use of newer pysam, fix several bugs, and add filtering options #104

Merged
