Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study...

35
Exact Amplicon Sequence Variants BIOME 2018

Transcript of Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study...

Page 1: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Exact Amplicon Sequence Variants

BIOME 2018

Page 2: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Diversity of Life

Image: Norman Pace

Page 3: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

A Molecular CensusMetabarcoding or Marker-gene Sequencing

Page 4: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

A Microbial Census

ATTAACGAGATTATAACCAGAGTACGAATACCGAACCACGATTCACAAGGTACCACAAGGTAACATAGCTCCATTAACCCCTTATAACCAGAGTACGAATACCGAACAATTAACGAGATTATAACCAGAGAGAGAATACCGAACCACGATTCTTGTGGTACCACAAGGTAACATAGCTCCCACGATTCACAAGGTACCACAAGGTAACATAGCTCCGGGAACTACAATCTCTAAGGTGAAGTCTCAGTCTATATTAACGAGATTATAACCAGAGTACGAATACCGAACCACGATTCACAAGGTACCACAAGGTAACATAGCTCCATTAACGAGATTATAACCAGAGTACGAATACCGAAC

Metabarcoding or Marker-gene Sequencing

Page 5: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

A Microbial Census

ATTAACGAGATTATAACCAGAGTACGAATACCGAACCACGATTCACAAGGTACCACAAGGTAACATAGCTCCATTAACCCCTTATAACCAGAGTACGAATACCGAACAATTAACGAGATTATAACCAGAGAGAGAATACCGAACCACGATTCTTGTGGTACCACAAGGTAACATAGCTCCCACGATTCACAAGGTACCACAAGGTAACATAGCTCCGGGAACTACAATCTCTAAGGTGAAGTCTCAGTCTATATTAACGAGATTATAACCAGAGTACGAATACCGAACCACGATTCACAAGGTACCACAAGGTAACATAGCTCCATTAACGAGATTATAACCAGAGTACGAATACCGAAC

Lactobacillus crispatus 1300 5 0 882 596

Ureaplasma urealytica 15 0 220 0 0

Gardnerella vaginalis 22 0 1 0 412

Prevotella intermedia 0 0 8 12 0

... ... ... ... ... ...

Metabarcoding or Marker-gene Sequencing

Page 6: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

A Microbial Census

ATTAACGAGATTATAACCAGAGTACGAATACCGAACCACGATTCACAAGGTACCACAAGGTAACATAGCTCCATTAACCCCTTATAACCAGAGTACGAATACCGAACAATTAACGAGATTATAACCAGAGAGAGAATACCGAACCACGATTCTTGTGGTACCACAAGGTAACATAGCTCCCACGATTCACAAGGTACCACAAGGTAACATAGCTCCGGGAACTACAATCTCTAAGGTGAAGTCTCAGTCTATATTAACGAGATTATAACCAGAGTACGAATACCGAACCACGATTCACAAGGTACCACAAGGTAACATAGCTCCATTAACGAGATTATAACCAGAGTACGAATACCGAAC

Lactobacillus crispatus 1300 5 0 882 596

Ureaplasma urealytica 15 0 220 0 0

Gardnerella vaginalis 22 0 1 0 412

Prevotella intermedia 0 0 8 12 0

... ... ... ... ... ...

Inference

ExplorationVisualization

Metabarcoding or Marker-gene Sequencing

Page 7: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

The MGS Microscope

Page 8: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

The MGS Microscope

• sample collection • sample storage • DNA extraction • library preparation • PCR • sequencing instrument • bioinformatics

Page 9: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Marker-gene Sequencing2,000,000 bp300 bp

Page 10: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

2,000,000 bp300 bp

1

5

10

15

12,000,005

12,000,010

Marker-gene Sequencing

Page 11: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

2,000,000 bp300 bp

Thousands to Billions of “Barcodes”

1

5

10

15

12,000,005

12,000,010

Marker-gene Sequencing

Page 12: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Callahan, et al. Nature Methods, 2016.

PCR/sequencing

SampleSequences

AmpliconReads

Page 13: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Callahan, et al. Nature Methods, 2016.

PCR/sequencing

SampleSequences

AmpliconReads

Pick OTUs

OperationalTaxonomic Units

Page 14: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Callahan, et al. Nature Methods, 2016.

PCR/sequencing

SampleSequences

AmpliconReads

Page 15: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

PCR/sequencing

SampleSequences

AmpliconReads

Callahan, et al. Nature Methods, 2016.

Page 16: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

PCR/sequencing

SampleSequences

AmpliconReads

Callahan, et al. Nature Methods, 2016.

Pick OTUs

Uh oh!

Different OperationalTaxonomic Units

Page 17: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

OTU85 is not a consistent label

Exact Sequence Variants…

OTU85 is predictive of a disease? Not in future data!

OTU85 is associated w/ X and Y? Can’t be tested!

OTU85 is in this community? OTUs don’t exist in nature!

McLaren & Callahan, mBio, 2018.

Page 18: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Exact Sequence Variants…

Page 19: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Different OperationalTaxonomic Units

PCR/sequencing

SampleSequences

AmpliconReads

Callahan, et al. Nature Methods, 2016.

Pick OTUs

Page 20: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Different OperationalTaxonomic Units

PCR/sequencing

SampleSequences

AmpliconReads

Callahan, et al. Nature Methods, 2016.

Pick OTUs

Infer Sample Sequences

Page 21: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Exact Sequence Variants…

OTU85 is not a consistent label, but…

ATTAACGAGATTATAACCAGAGTACGAATA…

is consistent!

Page 22: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Sequence Tables

Study 1

Study 2

Study 3

Cross-study comparisonmerge

Eliminates need for joint reprocessing of raw data.

Callahan, et al. ISMEJ, 2017.

…the sequence is the label

Page 23: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Sequence Tables

Study 1

Study 2

Study 3

Cross-study comparisonmerge

Eliminates need for joint reprocessing of raw data.Continuous data integration. Unlimited dataset size.

Callahan, et al. ISMEJ, 2017.

…the sequence is the label

Page 24: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Sequence Tables

Study 1

Study 2

Study 3

Cross-study comparisonmerge

Eliminates need for joint reprocessing of raw data.Continuous data integration. Unlimited dataset size.You in 2020 can work directly with you today.

Callahan, et al. ISMEJ, 2017.

…the sequence is the label

Page 25: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

…the sequence is the labelD

ata

Your dataset

Biology

In re

fere

nce

data

base

De Novo Closed-Reference Exact SVs

Valid

Invalid

Callahan, et al. ISMEJ, 2017.

“Replacing OTUs with ASVs makes marker-gene sequencing more precise, reusable, reproducible and comprehensive.”

Page 26: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

A rose by another name…

Exact Sequence Variants (ESVs) - Callahan et al. 2017 (by accident)

Amplicon Sequence Variants (ASVs) - Needham et al. 2017 - Callahan et al. 2017

sub-OTUs (sOTUs) - Amir et al. 2017

Zero radius OTUs (zOTUs) - Edgar 2017

Haplotypes, oligotypes, …

Page 27: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

A rose by another name…

Exact Sequence Variants (ESVs) - Callahan et al. 2017 (by accident)

Amplicon Sequence Variants (ASVs) - Needham et al. 2017 - Callahan et al. 2017

sub-OTUs (sOTUs) - Amir et al. 2017

Zero radius OTUs (zOTUs) - Edgar 2017

All the same thing! All the same (qualitative) benefits!

Haplotypes, oligotypes, …

Page 28: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Exact Sequence Variants…

PreciseTractableReproducibleComprehensiveCallahan, et al. ISMEJ, 2017.

Page 29: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Exact Sequence Variants…

PreciseTractableReproducibleComprehensiveCallahan, et al. ISMEJ, 2017.

Page 30: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

… are replacing OTUs

PreciseTractableReproducibleComprehensive

“the use of exact sequences instead of clustered operational taxonomic units, enable bacterial and archaeal ribosomal RNA gene sequences to be followed across multiple studies and allow us to explore patterns of diversity at an unprecedented scale…for many sample types, especially plant-associated and free-living communities, one-third of reads or more could not be mapped to [closed-reference OTUs]…Because exact sequences are stable identifiers, unlike OTUs, they can be compared to any 16S rRNA or genomic database now and in the future, thereby promoting reusability [22]… An advantage of using exact sequences is that they enable us to observe and analyse microbial distribution patterns at finer resolution than is possible with traditional OTUs…”

Callahan, et al. ISMEJ, 2017.

Page 31: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Preterm birth is the leading cause of child mortality in the US and worldwide.

Intrauterine infection is the most common mechanism, estimated to occur in 25-40% of cases. This may be an underestimate.

Bacterial vaginosis is a known risk factor (2-4x). Perhaps periodontitis as well.

Potential routes of intrauterine infection

Ascendingroute, nativemicrobiota

Resolution: Preterm Birth

Goldberg, et al. Lancet, 2006.

Page 32: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Resolution: Preterm Birth

Gardnerella

0.01

1.00

0.01 1.00

P−Value (Stanford)

P−V

alu

e (

UA

B)

Resolution

Genus/OTU

Sequence Variant

Frequency

0.1%

1%

10%

Callahan et al. PNAS, 2017.

Page 33: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Resolution: Preterm Birth

G1

G2

G3

Gardnerella

0.01

1.00

0.01 1.00

P−Value (Stanford)

P−V

alu

e (

UA

B)

Resolution

Genus/OTU

Sequence Variant

Frequency

0.1%

1%

10%

Callahan et al. PNAS, 2017.

Page 34: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

Resolution: Preterm Birth

00703C2mash

00703Bmash

5515241V 1400E

HMP9231

0288E

14019

75712

315−A

101

00703Dmash6119V5409−055−1

AMD

6420B

G1

G2

G3

Tree from whole genomes

Callahan et al. PNAS, 2017.

Page 35: Exact Amplicon Sequence Variants - QCBS · 2018. 12. 31. · Sequence Tables Study 1 Study 2 Study 3 Cross-study comparison e Eliminates need for joint reprocessing of raw data. Continuous

=? Reproducible Measurements