
Next Generation Sequencing

What to Sequence and Why? Structure → Function

  • De novo whole genome sequencing: requires de novo whole genome assembly

  • Polymorphism discovery (distinct from genotyping!): Targeted approaches, Whole genome, SNPs, copy number variations, insertions, deletions, etc.

  • Expressed sequence discovery: ESTs, cDNAs, miRNAs, etc.

  • Functional genomics: ChIP, Expression profiling, Nucleosome positioning

In the original Sanger method, each of the four ddNTPs is used in a separate reaction tube, and the four reactions are run in separate lanes. This works, but running everything in one lane saves space and time and is more efficient. To make that possible, each ddNTP is labelled with a fluorescent dye of a different color, so the terminating base of every fragment can be identified in a single lane.
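The single-lane idea can be illustrated with a toy model. This is a minimal sketch, not a description of any real instrument: the dye colors and the function names `terminated_fragments` and `read_lane` are assumptions made for illustration. Every possible chain-terminated fragment carries the dye of its last base, and sorting the fragments by length (what electrophoresis does) recovers the sequence.

```python
# Toy model of four-color dideoxy (Sanger) sequencing in a single lane.
# Dye colors are illustrative assumptions, not a specific vendor's chemistry.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
COLOR_TO_BASE = {color: base for base, color in DYE.items()}


def terminated_fragments(new_strand):
    """Return (length, dye color) for every possible chain-terminated fragment.

    A fragment of length n ends in the ddNTP matching base n of the newly
    synthesized strand, so it carries that base's dye.
    """
    return [(n, DYE[base]) for n, base in enumerate(new_strand, start=1)]


def read_lane(fragments):
    """Sort fragments by length, as electrophoresis does, and read the colors."""
    return "".join(COLOR_TO_BASE[color] for _, color in sorted(fragments))


fragments = terminated_fragments("GATTACA")
fragments.reverse()  # fragments are mixed in one tube; input order is irrelevant
print(read_lane(fragments))
```

Because each fragment is identified by color rather than by lane, one lane is enough to read all four bases.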

Next Generation Sequencing (NGS) is a modern, high-throughput DNA sequencing technology. It is parallel and rapid, and it has developed with decreasing price, time, workflow complexity, and error rate, and with increasing data quantity and quality, read length, data storage capacity, and repertoire of bioinformatics tools. It has a wide range of applications. Third Generation Sequencing adds single-molecule, real-time, and in situ approaches.

Starting material:

  • DNA (DNA-seq)

  • RNA (RNA-seq)

  • DNA fragments bound to a selected protein, used to analyse the sequences of the DNA-binding sites of the protein of interest or the localisation of histone modifications (ChIP-seq)

Workflow

  1. Library preparation. In emulsion PCR, the fragments are attached to beads that carry sequences complementary to the adapters. Each bead is enclosed in a water droplet (in oil) that contains all the reagents needed for PCR, and the reaction takes place inside the droplet. The result is a bead carrying many identical copies of one fragment, which is then loaded into the flow cell for sequencing (Roche).

  2. Sequencing. Principles: sequencing by synthesis, such as

    a. Sanger/dideoxy chain termination (Life Technologies, Applied Biosystems),

    b. Pyrosequencing (Roche/454). This method is based on detecting the pyrophosphate released when a dNTP is added to the growing strand.

      1. The primer is hybridized to the single-stranded template, which has been amplified by PCR, and incubated with the enzymes DNA polymerase, ATP sulfurylase, luciferase, and apyrase, and with the substrates adenosine 5'-phosphosulfate (APS) and luciferin.

      2. One of the four dNTPs is added to the reaction. DNA polymerase incorporates the base only if it is complementary to the template residue. When it does, inorganic pyrophosphate (PPi) is released in an amount equimolar to the incorporated nucleotide.

      3. ATP sulfurylase converts the PPi into ATP, using APS as a substrate. The ATP drives the luciferase-catalyzed conversion of luciferin to oxyluciferin, which produces a light signal whose intensity is proportional to the amount of ATP. The light is detected by a charge-coupled device (CCD) camera.

      4. Apyrase degrades the dNTP that has not been incorporated, and the excess ATP. Only when the degradation is complete is the next dNTP added to continue the polymerization reaction (returning to step 2).

      5. The four dNTPs are added cyclically, one at a time, until the sequence is determined. As the process continues, the complementary DNA strand is synthesized, and the light signal produced by the luciferase reaction at each addition is recorded in a "pyrogram", whose peaks give the nucleotide sequence. The signal is proportional to the ATP produced and therefore to the number of nucleotides incorporated: a peak of double intensity, for example, shows that two dNTPs were incorporated in the same cycle (the same base repeated on the template). Conversely, a null signal indicates that the dNTP added in that cycle is not complementary. Note that dATP itself cannot be used, because it is a natural substrate of luciferase: it would be impossible to tell whether a signal came from correct incorporation of the nucleotide or from the direct action of dATP on luciferase. Deoxyadenosine α-thio-triphosphate (dATPαS) is used instead; DNA polymerase uses it efficiently, but luciferase does not recognize it.

    c. Reversible terminator (Illumina),

    d. Ion Torrent (Life Technologies),

    e. Zero-mode waveguide (Pacific Biosciences), third-generation sequencing.

    Sequencing by oligo ligation detection, such as

    f. SOLiD (Applied Biosystems).

    Direct reading of the DNA sequence:

    g. Nanopore sequencing, third-generation sequencing.
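The pyrosequencing readout described above can be sketched as a small simulation. This is a simplified model under stated assumptions: the dispensation order "ACGT", the function names `pyrogram` and `decode`, and the idealized integer signals are all illustrative, and the "A" dispensation stands for dATPαS. The template is given in the order the polymerase reads it, and each dispensation yields a signal equal to the number of bases incorporated.

```python
# Toy pyrogram: one dNTP is dispensed per step; signal = bases incorporated.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def pyrogram(template, order="ACGT", max_dispensations=200):
    """Return [(dispensed base, signal)] for a template in polymerase read order.

    The signal is idealized: 0 if the base is not complementary, 1 for a single
    incorporation, 2 for a repeated base on the template, and so on.
    """
    signals = []
    pos = 0
    step = 0
    while pos < len(template) and step < max_dispensations:
        base = order[step % len(order)]
        run = 0
        # Incorporate the dispensed base as long as it pairs with the template.
        while pos < len(template) and COMPLEMENT[template[pos]] == base:
            run += 1
            pos += 1
        signals.append((base, run))
        step += 1
    return signals


def decode(signals):
    """Rebuild the synthesized strand from the peak heights."""
    return "".join(base * run for base, run in signals)


signals = pyrogram("TTCA")
print(signals)
print(decode(signals))
```

For the template `TTCA`, the first dispensation gives a double peak (two A's incorporated), the C dispensation gives a null signal, and the decoded strand is the complement of the template.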

Third-generation sequencing

Single-molecule real-time (SMRT) sequencing from Pacific Biosciences (PacBio). Template fragments are processed and ligated to hairpin adapters at each end, resulting in a circular DNA molecule with constant single-stranded DNA (ssDNA) regions at each end with the double-stranded DNA (dsDNA) template in the middle. The resulting SMRTbell template undergoes a size-selection protocol in which fragments that are too large or too small are removed to ensure efficient sequencing. Primers and an efficient φ29 DNA polymerase are attached to the ssDNA regions of the SMRTbell. The prepared library is then added to the zero-mode waveguide (ZMW) SMRT cell, where sequencing can take place. To visualize sequencing, a mixture of labelled nucleotides is added; as the polymerase-bound DNA library sits in one of the wells in the SMRT cell, the polymerase incorporates a fluorophore-labelled nucleotide into an elongating DNA strand. During incorporation, the nucleotide momentarily pauses through the activity of the polymerase at the bottom of the ZMW, which is being monitored by a camera.
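One consequence of the circular SMRTbell layout can be sketched in code. This is a toy model under stated assumptions: the adapter sequence `ttt`, the number of laps, and all function names are hypothetical, and the model tracks only which strand of the insert is covered, not template-versus-product complementarity. It assumes the polymerase keeps going around the circle, so the read alternates between the two strands of the insert, separated by adapter sequence.

```python
# Toy model of a polymerase read around a circular SMRTbell template.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
ADAPTER = "ttt"  # hypothetical hairpin adapter; lowercase to stand out


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def smrtbell_circle(insert):
    """Write the circle as a string: one strand, hairpin, other strand, hairpin."""
    return insert + ADAPTER + reverse_complement(insert) + ADAPTER


def polymerase_read(insert, laps=2):
    """Assume the polymerase travels around the circle `laps` times."""
    return smrtbell_circle(insert) * laps


def subreads(read):
    """Cut the long read at the adapters to get individual passes of the insert."""
    return [s for s in read.split(ADAPTER) if s]


passes = subreads(polymerase_read("ACCGT", laps=2))
# Every second pass covers the opposite strand; flip it back for comparison.
oriented = [s if i % 2 == 0 else reverse_complement(s) for i, s in enumerate(passes)]
print(passes)
print(oriented)
```

In this sketch each lap yields two observations of the same insert, one per strand, which is why a circular template allows repeated reads of a single molecule.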

“The MinION has been used to successfully read the genome of a lambda bacteriophage, which has 48,500-ish base pairs, twice during one pass. That's impressive, because reading 100,000 base pairs during a single DNA capture has never been managed before using traditional sequencing techniques. The operational life of the MinION is only about six hours, but during that time it can read more than 150 million base pairs. That's somewhat short of the larger human chromosomes (which contain up to 250 million base pairs), but Oxford Nanopore has also introduced GridION -- a platform where multiple cartridges can be clustered together. The company reckon that a 20-node GridION setup can sequence a complete human genome in just 15 minutes.”
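The throughput implied by the quoted figures can be worked out with simple arithmetic. This is a rough check only: the human genome size of about 3.2 billion base pairs is an assumption not stated in the quote, and the calculation assumes a single pass over the genome with no redundancy.

```python
# Figures taken from the quote above.
MINION_BASES = 150e6      # "more than 150 million base pairs"
MINION_HOURS = 6          # operational life of one MinION
GRIDION_NODES = 20
GRIDION_MINUTES = 15

# Assumption (not in the quote): approximate human genome size in base pairs.
HUMAN_GENOME_BP = 3.2e9

# Average MinION throughput in base pairs per second.
minion_rate = MINION_BASES / (MINION_HOURS * 3600)

# Per-node throughput needed to cover the genome once in the quoted time.
gridion_rate_per_node = HUMAN_GENOME_BP / (GRIDION_NODES * GRIDION_MINUTES * 60)

ratio = gridion_rate_per_node / minion_rate

print(f"MinION: about {minion_rate:.0f} bp/s")
print(f"GridION node (implied): about {gridion_rate_per_node:.0f} bp/s")
print(f"Implied ratio per node: {ratio:.1f}x")
```

Under these assumptions, each GridION node would need to read roughly 25 times faster than the quoted MinION average.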