Effects of 454 Sequencing Error Correction on HIV-1 Diversity
M. Cristina Rodríguez1, Maria Casadellà1, Christian Pou1, Eloisa Yuste3, Víctor Sánchez-Merino3, Bonaventura Clotet1,2, Roger Paredes1,2, Marc Noguera-Julian1
1Institut de Recerca de la SIDA IrsiCaixa-HIVACAT, 2Unitat VIH, Hospital Universitari Germans Trias i Pujol, Badalona, Spain. 3Hospital Clinic Barcelona, Spain.
Presented at the 10th EU. Meeting on HIV & Hepatitis, 28 - 30 March 2012, Barcelona, Spain
Background
454 sequencing (454 Life Sciences/Roche Diagnostics) can be used to detect low-frequency variants within an HIV-1 viral population, but its sensitivity is compromised by sequencing error.
Several bioinformatic tools are available to correct sequencing errors in 454 output using clustering-based algorithms: Amplicon Variant Analyzer (AVA) software (454 Life Sciences) and AmpliconNoise produce a set of consensus reads, whereas ShoRAH yields a set of individually corrected reads.
Materials & Methods
• We generated a low-diversity viral mix by cloning the full-length env gene from HIV-1 strain AC10 into a pNL4.3 context and generating a library of randomly mutated AC10 envelopes with a PCR-based method.
• We used three amplicons: amp_1, amp_3 and amp_4; amplicons 3 and 4 overlap between positions 1595 and 1620.
• Single-amplicon (425 bp) sequences were obtained by 454 sequencing of single-strain AC10 and mutated AC10 (AC10-Mut).
Amplicon Design
[Figure: amplicons Amp_1, Amp_3 and Amp_4]
[Flowchart of the data processing workflow; box labels: 454 SFF output, contamination filter (pNL4.3), demultiplexing and trimming, AmpliconNoise, AVA software, variants, in-house scripts, FASTA, alignment (Mosaik, BWA), CLEAN(FF), CLEAN(SEQ), ShoRAH, output]
FILTERING (CHREM)
• Homology ≥ 80% relative to AC10
• Read length within ±15% of the reference
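The two filtering criteria named in the workflow (homology ≥ 80% relative to AC10, read length within 15% of the reference) could be expressed roughly as follows. This is only an illustrative sketch, not the in-house scripts used in the study: the function names are hypothetical, "homology" is assumed here to mean percent identity over ungapped alignment columns, and each read is assumed to be supplied already aligned to the reference as a gapped string of equal length.

```python
def percent_identity(aligned_read: str, aligned_ref: str) -> float:
    """Percent of ungapped alignment columns where read and reference agree.

    Assumption: both inputs are gapped strings ('-' for gaps) of equal length.
    """
    if len(aligned_read) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap.
    pairs = [
        (a, b)
        for a, b in zip(aligned_read.upper(), aligned_ref.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def passes_filter(aligned_read: str, aligned_ref: str,
                  min_identity: float = 80.0,
                  length_tolerance: float = 0.15) -> bool:
    """Apply the identity and read-length criteria to one aligned read."""
    # Ungapped lengths of the read and of the reference.
    read_len = len(aligned_read.replace("-", ""))
    ref_len = len(aligned_ref.replace("-", ""))
    length_ok = abs(read_len - ref_len) <= length_tolerance * ref_len
    return length_ok and percent_identity(aligned_read, aligned_ref) >= min_identity
```

For example, `passes_filter("ACGTAC----", "ACGTACGTAC")` rejects the read because its ungapped length (6) differs from the reference length (10) by more than 15%, even though every aligned base matches.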
• Only substitution variants were considered.
• The mutation rate in unprocessed data did not exceed 3%.
• The frequency of all non-reference nucleotide variants was measured on a per-position basis.
• Raw signal was divided into five categories (<0.5, <1.0, <1.5, <2, ≥2), and the percentage of signal reduction was measured for each category.
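As an illustration of the per-position measurement described above, the sketch below computes the percentage of reads carrying a non-reference base at each position. It is not the authors' code: the function name is hypothetical, reads are assumed to be aligned and padded to the reference length, positions are 0-based string indices, and gaps or ambiguous bases are skipped so that only substitutions contribute.

```python
from collections import Counter


def variant_frequencies(aligned_reads, reference):
    """Return {position: percent of reads with a non-reference base}.

    Assumptions: every read has the same length as `reference`;
    '-' (gap) and non-ACGT symbols are ignored, so only substitutions count.
    """
    reference = reference.upper()
    freqs = {}
    for pos, ref_base in enumerate(reference):
        # Count the called bases covering this position.
        counts = Counter(
            read[pos].upper()
            for read in aligned_reads
            if read[pos].upper() in "ACGT"
        )
        depth = sum(counts.values())
        if depth == 0:
            continue  # no coverage at this position
        non_ref = depth - counts[ref_base]
        freqs[pos] = 100.0 * non_ref / depth
    return freqs
```

With the reference `"ACGT"` and reads `["ACGT", "ACGT", "ATGT", "A-GT"]`, position 1 has three called bases, one of which differs from the reference, giving a frequency of about 33.3%; the gapped read does not count toward the depth at that position.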
RESULTS
AC10 Raw data
[Figure: percentage of nucleotide variants (y-axis, 0 to 5) by position (x-axis, 254 to 1939) for Ampl_1 and Ampl_3_4]
Nucleotide variant frequency from AC10 raw sequence data.
Random Mutagenesis
AC10_mut Raw data
[Figure: percentage of nucleotide variants (y-axis, 0 to 5) by position (x-axis, 254 to 1939) for Ampl_1 and Ampl_3_4]
Nucleotide variant frequency corresponding to raw sequence data was higher in AC10-Mut than in AC10.
When AC10 data were processed, error was efficiently removed by AVA, ShoRAH, AmpliconNoise(S) and AmpliconNoise(F).
Signal reduction plots for each raw signal category
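One way such a per-category summary might be computed is sketched below. Several choices here are assumptions rather than details given in the presentation: signal reduction is taken to be the relative drop from raw to corrected frequency at a position, the category thresholds are interpreted as percentages, each category is summarised by its mean, and all names are hypothetical.

```python
# Category labels from the methods; the bounds are assumed to be percentages.
CATEGORIES = ["<0.5", "<1.0", "<1.5", "<2", ">=2"]
UPPER_BOUNDS = [0.5, 1.0, 1.5, 2.0]


def raw_category(raw_freq: float) -> str:
    """Assign a raw variant frequency to one of the five categories."""
    for label, bound in zip(CATEGORIES, UPPER_BOUNDS):
        if raw_freq < bound:
            return label
    return CATEGORIES[-1]


def signal_reduction_by_category(raw: dict, corrected: dict) -> dict:
    """Mean percent signal reduction per raw-signal category.

    `raw` and `corrected` map position -> variant frequency in percent.
    Assumed definition: reduction = 100 * (raw - corrected) / raw.
    Positions absent from `corrected` are treated as fully removed signal.
    """
    grouped = {label: [] for label in CATEGORIES}
    for pos, raw_freq in raw.items():
        if raw_freq <= 0:
            continue  # no raw signal, so reduction is undefined
        reduction = 100.0 * (raw_freq - corrected.get(pos, 0.0)) / raw_freq
        grouped[raw_category(raw_freq)].append(reduction)
    # Report only the categories that contain at least one position.
    return {label: sum(vals) / len(vals) for label, vals in grouped.items() if vals}
```

A position whose raw frequency of 0.8 falls to 0.4 after correction would land in the `<1.0` category with a 50% reduction, while a position left unchanged at 2.5 would contribute 0% to the `>=2` category.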
Conclusions
• While data cleaning methods are needed to efficiently correct for sequencing error, applying these tools to real sequence data may also result in the loss of true diversity.
• Low-diversity positive controls would help fine-tune error correction tools to selectively remove sequencing error while preserving real low-level signal.
Acknowledgements
Molecular Epidemiology group, IrsiCaixa.