Next Generation Sequencing

Next Generation Sequencing
Technology and applications
10/1/2015
Jeroen Van Houdt - Genomics Core - KU Leuven - UZ Leuven
Landmarks in DNA sequencing
• 1953 Discovery of DNA double helix structure
• 1977
  – A Maxam and W Gilbert "DNA seq by chemical degradation"
  – F Sanger "DNA sequencing with chain-terminating inhibitors"
• 1984 DNA sequence of the Epstein-Barr virus, 170 kb
• 1987 Applied Biosystems - first automated sequencer
• 1991 Sequencing of human genome in Venter's lab
• 1996 P. Nyrén and M Ronaghi - pyrosequencing
• 2001 A draft sequence of the human genome
• 2003 human genome completed
• 2004 454 Life Sciences markets first NGS machine
Massive parallel sequencing
Landmarks in NGS
[Figure: timeline of NGS platform launches, 2005-2010, with genome-size markers for E. coli (5 Mb) and Arabidopsis thaliana (157 Mb)]
• 2005 Roche 454: 200 K reads, 120 bp
• 2006 Solexa/Illumina: 30M reads, 35 bp
• 2007 SOLiD: 100M reads, 35 bp
• Added to the timeline by 2010: Ion torrent, PacBio RS
DNA Sequencing – the next generation
NGS refers to non-Sanger-based high-throughput DNA sequencing technologies. Millions or billions of DNA strands can be sequenced in parallel.
• NGS technologies constitute various strategies that rely on a combination of
  – Library/template preparation
  – Parallel sequencing
DNA Sequencing – the next generation
Sample prep
Clonal Amplification
Parallel sequencing
Roche GS FLX 454 & Roche Junior
454 SEQUENCING
454 sequencing
Life Technologies SOLiD 5500 Genetic Analyzer
SOLID SEQUENCING
SOLiD sequencing
Life Technologies: Ion Proton & Ion PGM
ION TORRENT SEQUENCING
Ion Torrent Sequencing
Illumina HiSeq & NextSeq & MiSeq
ILLUMINA (SOLEXA) SEQUENCING
Illumina sequencing Library
• All sample preparation protocols, regardless of the application, end with the same product:
  – Double-stranded DNA with the insert to be sequenced flanked by adapters
Illumina library prep
Illumina Sequencing
Helicos BioSciences: November 15, 2012, bankrupt
HELISCOPE SEQUENCING
DNA Sequencing – the next generation
Sample prep
Clonal Amplification
Parallel sequencing
Heliscope sequencing
Oxford Nanopore Technologies: GridION & MinION
NANOPORE SEQUENCING
Pacific Biosciences PacBio RS II
SMRT SEQUENCING
PacBio history
• 2010 - PacBio seduced investors with a promise of technology revolution
  – A whole human genome for $100
  – in about 15 minutes
• 2011 - GC applies for funding for third generation sequencer
• 2012 - None of those predictions came true
  – Few scientists bought the one-ton instrument.
  – PacBio
    • market valuation of less than $70 million
    • technology value of $0.
    • $600 million of cash down the toilet.
• 2012 - GC gets funding for PacBio!
• Oxford Nanopore announced at AGBT
• 2012 - New CEO Mike Hunkapiller @ PacBio
• 2013 - GC installs PacBio
  – PacBio improved and has a niche
    • ability to detect structural genetic variations
    • creating high-quality genomes of small organisms like bacteria, viruses, and worms.
  – PacBio's deal with Roche to develop technology for the diagnostic market
Single Molecule, Real-Time (SMRT®) DNA Sequencing
[Figure: SMRT® sequencing components and workflow]
Components: SMRTbell™ template, SMRT® Cells, PacBio® RS II
Workflow:
• Template Preparation
• Run Design
• Polymerase Binding
• Instrument Run
• Primary Analysis
• Secondary Analysis
• Tertiary Analysis
Template preparation steps:
• DNA Sample
• Fragment DNA
• Damage Repair/End Repair
• Ligate adapters
• Purify DNA
SMRTbell™ template preparation can be used to create libraries of various insert sizes, from 250 bp to 20,000 bp, depending on the needs of the application.
Advantages of SMRTbell™ Templates
Key Advantages:
• Structurally linear
• Topologically circular
• Provides sequences of both forward and reverse strands in the same trace
Base Modification: Discover the Epigenome
Directly observe base modifications using the kinetics of the polymerization reaction during normal sequencing
Signal Processing and Base Calling
Converting pulses of light into DNA bases and kinetic measures
Understanding Accuracy in SMRT® Sequencing
• Single-pass error rate ~11% (predominantly deletions or insertions)
• Single Molecule, Real-Time (SMRT®) DNA sequencing achieves highly accurate sequencing results, exceeding 99.999% (Q50)
• How is this possible, given that a single-pass sequence has about 1 mistake every 10 nucleotides?
• Single-pass errors are distributed randomly, which means that they wash out very rapidly upon building consensus.
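The "wash out" argument can be made concrete with a small calculation. The sketch below is an illustration only, not PacBio's consensus algorithm. It assumes that each pass errs independently at a given position with probability 0.11, and that every erroneous pass votes against the true base, which is pessimistic. The function names `phred` and `majority_vote_error` are hypothetical.

```python
import math


def phred(p_error: float) -> float:
    """Convert an error probability to a Phred quality score (Q = -10 log10 p)."""
    return -10.0 * math.log10(p_error)


def majority_vote_error(p_error: float, passes: int) -> float:
    """Probability that more than half of `passes` independent reads are wrong.

    Assumption: errors are independent between passes, and the number of
    passes is odd so that a majority always exists.
    """
    if passes % 2 == 0:
        raise ValueError("use an odd number of passes to avoid ties")
    first_losing = passes // 2 + 1  # smallest number of wrong passes that outvotes the truth
    return sum(
        math.comb(passes, k) * p_error**k * (1.0 - p_error) ** (passes - k)
        for k in range(first_losing, passes + 1)
    )


if __name__ == "__main__":
    single_pass = 0.11  # single-pass error rate quoted on the slide
    for n in (1, 3, 5, 9, 15, 25):
        err = majority_vote_error(single_pass, n)
        print(f"{n:2d} passes: consensus error {err:.2e}, about Q{phred(err):.1f}")
```

Under these assumptions the consensus error falls steeply as passes are added. A real consensus caller also has to place insertions and deletions through alignment, which this per-position model ignores.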
SMRT® Sequencing Accuracy
Perspective: Understanding SMRT Sequencing Accuracy
Data generated with P4-C2 chemistry on PacBio® RS II; analyzed using Quiver with SMRT® Analysis 2.0.1
The PacBio® RS Helps Resolve Genetically Complex Problems
• Targeted Sequencing: Comprehensively Characterize Genomic Variation
• De Novo Assembly: Generate Finished Assemblies
• Base Modification Detection: Automatically detect DNA base modifications
NGS time line
[Figure: timeline of NGS platform launches, 2005-2016, with genome-size markers for E. coli (5 Mb) and Arabidopsis thaliana (157 Mb)]
• 2005 Roche 454: 200 K reads, 120 bp
• 2006 Illumina: 30M reads, 35 bp
• 2007 SOLiD: 100M reads, 35 bp
• Later platforms shown: Ion torrent, PacBio RS, HiSeq2500, HiSeq X ten, HiSeq 4000, PB Sequel
NGS Technology: conclusions
Summary
NGS terminology
NGS as a tool for studying Genome variation and regulation
NGS APPLICATIONS
DNA SEQUENCING
WHOLE GENOME SEQUENCING
Copy Number Variations
Structural Variations
Whole genome sequencing
ì Copy number variation analysis ì Sequencing a genome at 0.1-­‐0.3x ì Sequencing a genome at 1-­‐3x ì Structural variation analysis ì Sequencing a genome at 5-­‐10x ì Whole genome re-­‐sequencing ì Sequencing a genome at >30x ì yeast, fruit fly, bacterial genomes, human…
DNA SEQUENCING
TARGETED RE-SEQUENCING
Sequencing - the beginning
• Random genome sequencing
• ???
• ???
• Sanger sequencing
  – Targeted
  – 700-1000 bp
Target enrichment strategies
• Random genome sequencing
• Hybrid Capture
• PCR based
• Sanger sequencing
Rapid expression profiling, transcriptome sequencing and small RNAs
RNA SEQUENCING
RNA-seq
RNAseq: Gene Expression through sequencing
• Supports discovery, screening, and profiling
• Does not require prior gene knowledge or annotation
• Unique combination of qualitative and quantitative measurement
• Digital counts vs analog intensities
• Increased dynamic range and sensitivity
• No probes or primers
• Any species - even when reference genome not available
• Analyze gene expression
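"Digital counts" means that expression is measured as the number of reads assigned to each gene. To compare samples sequenced to different depths, counts are commonly rescaled, for example to counts per million (CPM). The sketch below shows that rescaling. The gene names and counts are made up for illustration, and `counts_per_million` is a hypothetical helper, not part of any specific RNA-seq package.

```python
from typing import Dict


def counts_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Rescale raw per-gene read counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads were counted")
    return {gene: 1_000_000 * n / total for gene, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical raw counts for two samples sequenced to different depths.
    sample_a = {"geneA": 500, "geneB": 1_500, "geneC": 8_000}
    sample_b = {"geneA": 1_000, "geneB": 3_000, "geneC": 16_000}
    for name, sample in (("A", sample_a), ("B", sample_b)):
        cpm = counts_per_million(sample)
        print(name, {gene: round(value) for gene, value in cpm.items()})
```

The two samples differ twofold in raw counts but have identical CPM values, which shows why normalization comes before any comparison. CPM does not correct for gene length or composition effects.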
RNAseq: summary
• Counting or Profiling
  – 10 million total reads of 35 bp length from poly-A selected RNA will give performance better than any microarray
• Studying Alternative Splicing or quantifying cSNPs for most transcripts
  – Deeper profiling of 50 to 100 million reads, with read lengths of 50 to 100 bp, from poly-A selected RNA using mRNA-Seq assay
• Complete Annotation of an entirely New Transcriptome
  – ~500 Million reads of 100 bp read length from multiple tissues
  – Normalized stranded mRNA-Seq & ncRNAs
  – Small RNA-Seq for microRNAs
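To compare the three experimental designs, it helps to express each as total sequenced bases (reads × read length). The sketch below does this arithmetic for the read counts and lengths quoted in the summary. The function name `yield_gb` and the design labels are hypothetical.

```python
def yield_gb(n_reads: int, read_length_bp: int) -> float:
    """Total sequenced bases in gigabases (1 Gb = 1e9 bases)."""
    return n_reads * read_length_bp / 1e9


if __name__ == "__main__":
    # (label, number of reads, read length in bp), taken from the summary.
    designs = [
        ("Counting or profiling", 10_000_000, 35),
        ("Splicing / cSNPs, lower bound", 50_000_000, 50),
        ("Splicing / cSNPs, upper bound", 100_000_000, 100),
        ("New transcriptome annotation", 500_000_000, 100),
    ]
    for label, n_reads, read_length in designs:
        print(f"{label}: {yield_gb(n_reads, read_length):.2f} Gb")
```

The designs span roughly two orders of magnitude in sequencing output, from 0.35 Gb for simple counting to 50 Gb for annotating a new transcriptome.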