KK

Next Gen Sequencing Notes

High-Throughput Sequencing (HTS) Technologies
  • Definition: HTS technologies, commonly known as Next Generation Sequencing (NGS), allow for the rapid and simultaneous analysis of millions of DNA sequences. This technology has revolutionized genomics by providing comprehensive insights into the genetic makeup of organisms which can be applied in various fields including medicine, evolutionary biology, and biotechnology.

  • Common Platforms:

    • Illumina: This platform is widely used for DNA sequencing due to its high throughput, providing millions of reads in a single run, making it cost-effective for population-level studies and large-scale projects.

    • Ion Torrent: Utilizes semiconductor technology for sequencing, which allows for rapid turnaround times and lower costs, particularly beneficial for targeted sequencing applications.

    • Pacific Biosciences (PacBio): Known for its long-read sequencing capabilities, PacBio can sequence full-length genes and complex regions challenging for short-read technologies. Its applications are significant in structural variant detection and de novo genome assembly.

    • Oxford Nanopore Technologies: Offers real-time sequencing capabilities with portable devices like the MinION. This allows for quick field deployments and immediate analysis, particularly useful in outbreak investigations and environmental samples.

NGS Platforms Overview
  • NGS platforms can be categorized into:

    • 2nd Generation: These technologies primarily focus on short-read sequencing. They typically produce reads ranging from 50 bp to 300 bp, excelling in accuracy and throughput, ideal for applications such as whole-genome sequencing and targeted resequencing.

    • Examples: Illumina and Ion Torrent.

    • 3rd Generation: Designed for long-read sequencing, these platforms can generate reads that exceed 10 kb. They are crucial for resolving repetitive regions, structural variants, and accurately determining haplotypes.

    • Examples: PacBio and Oxford Nanopore.

  • Key Features of NGS:

    • Allows massive parallel sequencing of small DNA fragments, leading to high throughput and the capacity to generate large amounts of data efficiently.

    • Utilizes cutting-edge bioinformatics approaches to map reads against a reference genome, providing insights into genetic variations, which can have implications in health, disease, and evolution.

Comparison of NGS Platforms

Platform

Cost (USD)

Reads per run

Average Read Length

Error Rate (%)

Advantages

Disadvantages

Illumina HiSeq

4126/45.84

690,000

2 x 150 bp

1

Low cost per GB; high output; reliable results.

Errors due to short read length; alignment difficulties in complex genomes.

Ion Torrent

1000/20.41

224,000

175 bp

2

Low cost; speed of sequencing.

Higher error rates particularly in homopolymeric regions.

PacBio RS II

300 million

0.03 million

14,000 bp

10

Long reads facilitate complete transcript sequencing; lower bias in sequence representation.

High cost per run; generally lower throughput than other platforms.

Oxford Nanopore

900/1000

100

9000 bp

7

Portability; direct RNA and cDNA sequencing; real-time data acquisition.

Higher error rates; challenges in data analysis due to signal variability.

NGS Approaches
  • DNA Sequencing Methods:

    • By Synthesis: Such as Illumina sequencing, where nucleotides are sequentially added to a growing DNA strand, with each incorporation resulting in a specific fluorescent signal that is recorded.

    • Single Molecule Sequencing: This method enables the sequencing of long fragments without the necessity for amplification, reducing biases that can arise during PCR processes.

  • Deep Sequencing: This refers to sequencing with high coverage, which means a base is read several times. This not only augments detection of low-frequency variants but also enhances accuracy in SNP discovery and minimizes errors due to systematic biases.

The Illumina NGS Workflow
  1. Extraction: Isolate genomic DNA, following proper protocols to eliminate contaminants.

  2. Library Fragmentation: The DNA is broken into smaller, manageable fragments, preparing them for sequencing.

  3. Adapter Ligation: Specific DNA adapters that facilitate amplification and sequencing are ligated to both ends of the fragmented DNA.

  4. Cluster Amplification: The ligated fragments are then amplified in a flow cell, where millions of copies are generated to enhance the signal during sequencing.

  5. Sequencing: Employing the sequencing by synthesis technique, each cycle captures the incorporation of fluorophore-labeled nucleotides, building a sequence.

  6. Data Analysis: Using advanced bioinformatics pipelines, the resultant sequence data is aligned, analyzed, and interpreted to derive meaningful biological insights.

Sequencing by Synthesis
  • Specifically utilizes fluorescently labeled nucleotides in each reaction cycle, which enables the identification of the nucleotide as it is added to the growing DNA strand through emitted signals captured by high-resolution cameras.

Ion Torrent Sequencing Workflow
  • Library Preparation: DNA is fragmented and size-selected, followed by the ligation of adapters to prepare for sequencing.

  • Amplification: Emulsion PCR is utilized to amplify sequences captured on beads, leading to improved sequence quality and yield.

  • Signal Detection: This platform detects pH changes associated with nucleotide incorporation, translating that signal into sequence information without the need for fluorescence.

Nanopore Sequencing
  • Basic Principle: As single strands of DNA pass through nanopores, they create electrical changes that are unique to each nucleotide, enabling real-time analysis of long stretches of genetic material.

  • MinION Device: A revolutionary portable sequencer, capable of performing rapid DNA and RNA sequencing directly in the field, aiding in infectious disease monitoring and environmental assessments.

Importance of Sequencing Depth
  • Coverage Depth: Refers to the average number of reads per base in a sequenced genome. Higher coverage depth enhances the reliability and reproducibility of SNP and variant detection across diverse genomes.

  • Error Reduction: The approach assumes that sequencing errors are random; thus, with sufficient depth, consensus sequences can emerge that accurately reflect the true nucleotide composition of a sample.

Conclusion
  • Continuous advancements in NGS technologies and related methodologies are expanding our capabilities within genomics. These innovations enhance applications not only in medicine, such as personalized genomics and disease diagnosis but also in other fields like ecology and evolutionary biology. A robust understanding of these techniques is essential for scientists and professionals engaged in molecular biology and genetic research, as they play a pivotal role in the future of genetic science and healthcare advancements.