-
Notifications
You must be signed in to change notification settings - Fork 1
/
CHANGES
174 lines (127 loc) · 7.44 KB
/
CHANGES
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
1.9 - ANOVA and excess variation tests in Shiny test app.
Loading counts.csv in R now generates an RDS file for faster reloading.
Options to clip tails at a certain length or require adaptor bases.
Options to clip reads where the density of high quality bases drops below a certain level. Gs are ignored in this, as two-color sequencing of Gs is dark on both channels and reads may end end with a run of high quality Gs actually representing running off the end of the fragment.
Tail length weight calibration now uses "well knotted" splines.
Option to do differential tails on detrended samples.
Color palette can be changed for tail distribution.
/1 stripped from read names during clipping, as STAR removes /1 from read names if present.
Option to change number of As to call a tail from default of 4.
Option to clip reads based on UMI and barcode (encoded in read name) rather than fixed adaptor sequence.
UMI counting (assuming UMI encoded in read name).
1.8 - End-shift test extra filtering to ensure no NA coefficients.
Fix bug in bigwig production when pysam present.
1.7 - R namespace bugfix.
1.6 - tidyr::unnest_ disappeared, use tidyr::unnest.
Shiny test app allow y limits to be set.
Add sample option min_match.
Bigwigs now omit introns (cover not span).
Some clean up of tidyverse code (_-suffix functions are now deprecated).
Updated install instructions.
weitrix based tests.
1.5 - Shiny app supports differential tail length and differential expression.
R code to produce matrix and weights for shift, tail length.
Option to produce peak pairs including non-utr peaks.
Reduced memory usage in call-peaks.
do_fragile option for pipeline, to avoid crashes on weird datasets.
1.4 - Use STAR aligner, only annotated introns are used. (bowtie2 still supported also.)
1.3 - Long standing bug in extend-sam-basespace did not extend over lower case letters in reference, fixed.
Allow up to 60% mismatch on A extension in extend-sam-basespace.
Filter out peaks with short average tail length.
Use --sensitive-local mode in bowtie2.
1.2 - Allow raw poly(A) lengthh and templated width download from mPAT shiny report.
Fix bug setting three prime bounds for cumulative read distribution plots (missing dplyr:: for lead and lag)
1.1 - Remove deduplication.
poly(A) calling in reads no longer uses quality information,
a sequence of 10 "A"s and/or adaptor sequence is called as poly(A) tail.
0.43 - Clip Illumina reads to a region containing 90% bases with quality at
least 20. Poly(A) tails are only called within this region, this
means we see less long poly(A) tails, so this changes the tail length statistics
quite a lot.
Add shiny reports, end shift test in R.
Bugfixes to plot-comparison, plot-pooled.
make-ensembl-reference skips genes with transcripts on inconsistent strands.
Control of clipping in pipeline.
peak-polya option in pipeline.
0.42 - When extended, don't extend over another exon on the same strand
(previously: another CDS on the same strand).
Another excess memory usage bug removed from aggregate-tail-counts.
0.41 - Add Ensembl genome downloader.
Read counting can be restricted to exons of genes.
Relation between peaks and genes includes more informative information.
Add "primer-gff" tool.
Use Varistran for moderated log transformation.
analyse-polya-sample deletes some unneeded files after running.
Re-use pickles for dedup counts.
call-utrs and compare-peaks use output of relate-peaks-to-genes when determining which are in 3'UTR
0.40 - Hack to reduce parallelism on tail_count, reduce memory usage.
Fix very silly overuse of memory.
Other memory improvements in aggregate-tail-counts.
0.39 - Bugfix with groups.
More documentation of pipeline output.
Include design matrix in report from "test:".
0.38 - Allow more parallelism for tail-count, memory usage doesn't seem to be severe.
Check for duplicate sample name or duplicate read filename in "analyse-polya-batch:".
0.37 - Use Fitnoise 2 for testing.
0.36 - Prevent aggregate-tail-counts from running in parallel in pipeline.
Call primary peaks as part of pipeline.
Output a JSON file for Andrew and Jack's survival curver plotter.
0.35 - Options for sample weighting and empirical controls in "test:".
0.34 - .json files accidentally omitted from report.
0.33 - Add average expression / average tail length column to "test:" degusts.
0.32 - Add "AD" property to BAM files, number of adaptor sequence bases observed.
Don't put a lower limit on number of reads required to call proportion, tail.
Improved tail length testing.
Dropped Chi-Sq testing from "compare-peaks:", "test:" hopefully makes this obsolete.
"compare-peaks:" outputs "NA"s rather than "None"s in peak-pair file.
Cleaned up reports.
0.31 - Remove collapse-features from call-peaks, not needed and causes later stage to fail.
0.30 - Bugfix in report generation for "analyse-polya-batch:",
correctly link to genewise-dedup/peakwise-dedup
0.29 - Bugfix to "make-tt-reference:".
0.28 - Bugfix to "make-tt-reference:".
0.27 - Peaks were being incorrectly extended in pipeline, fixed.
Added facility to perform a set of tests to analyse-polya-batch.
"compare-peaks:" option to output longer pre-peak and inter-peak sequences.
0.26 - Show p.value in "test:".
0.25 - Bugfix in "test:" tool.
0.24 - Add "test:" tool.
0.23 - Add --peak-min-depth option to workflow.
Default raised from 10 to 50.
compare-peaks output table lists relevant peaks.
Gene viewer shows relevant peaks.
Gene viewer shows 3' UTR position.
Don't nuke pooled.csv.
0.22 - "compare-peaks:" outputs utr before first peak as well as inter-peak sequence.
Fix bugs in workflow.
0.21 - Major refactoring
- define reference annotation to be GFF3
- UCSC genome downloader
0.20 - "nodedup" -> "dup"
Changes to reports.
0.19 - Fix setup.py
0.18 - Basespace read support.
"call-peaks:" pipeline.
"compare-peaks:" tool.
"geneview-webapp:" gene expression viewer.
0.17 - Assignment of reads to features made more deterministic in "tail-lengths:"
0.16 - Thread tail statistics through outputs.
"tail_length:" now only ever counts reads once,
uses only the feature with the greatest overlap of a read
rather than all features with overlaps.
"compare-peaks:" now ranks by "interestingness".
0.15 - Added "compare-peaks:" tool.
0.14 - Extend_sam now requires 4 bases of tail rather than 3 to be marked AA,
to be consistent with tail length tools (and observation that this is a better cutoff).
0.13 - Don't call file igv-plots.tar.gz.tar.gz,
a single .tar.gz is sufficient.
0.12 - Bugfixes
0.11 - Update to use newer nesoni features, such as sample tags
0.10 - use distribute
0.9 - R+ doesn't like trailing newlines in text in plots sometimes, apparently
0.8 - use report from correct log files when using consensus=False
0.7 - don't re-run everything if number of cores changes
0.6 - analysis is now performed both with and without deduplication
0.5 - add "tail-stats:" tool
extend_sam: fix out by one error for AA:i: attribute on reverse strand
0.4 - plot-pooled now produces a spreadsheet