Next Generation Sequencing

Learning Objectives

Describe the basis of the term “Next Generation Sequencing” with respect to Sanger sequencing.
Describe the core principles of the Illumina NGS process, including detailed knowledge of the four-step process:
- Library construction
- Cluster generation
- Sequencing by synthesis
- Data analysis
Describe in detail the principles of some applications of NGS, including Whole Exome Sequencing (WES) and RNA-seq.

Basis of "Next Generation Sequencing" (NGS) Compared to Sanger Sequencing

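"Next generation" refers to the massively parallel sequencing technologies that followed Sanger (first-generation) sequencing. Instead of reading one template per reaction, NGS reads millions of DNA fragments at once, giving far higher throughput at a lower cost per base. The two approaches compare as follows: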

Throughput:
- Sanger sequencing reads one DNA sequence at a time (1 reaction = 1 sequence).
- NGS can sequence millions of DNA fragments simultaneously.
Scale and Cost:
- Sanger: Costly for large-scale sequencing; suited for short sequences or specific regions (e.g., small genes).
- NGS: Economical for large-scale projects, enabling whole-genome or transcriptome sequencing.
Speed:
- Sanger requires a separate, time-intensive electrophoresis run for each reaction.
- NGS processes millions of fragments in parallel, generating results for a whole genome or transcriptome within days.
Applications:
- Sanger: Gold standard for confirming specific mutations or small-scale sequencing.
- NGS: Ideal for genomic research, clinical diagnostics, and large-scale studies like GWAS.
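
To make the throughput difference concrete, the sketch below estimates how many reads are needed to cover a genome at a given depth. The genome size (3 Gb), read lengths (800 bp for Sanger, 150 bp for NGS), and 30x depth are round illustrative assumptions, not figures from these notes, and `reads_needed` is a hypothetical helper.

```python
import math

def reads_needed(genome_size_bp: int, read_length_bp: int, depth: float) -> int:
    """Reads required so that total sequenced bases = depth * genome size."""
    return math.ceil(depth * genome_size_bp / read_length_bp)

GENOME_BP = 3_000_000_000  # assumed round figure for a human-sized genome
DEPTH = 30                 # assumed target coverage

sanger_reactions = reads_needed(GENOME_BP, 800, DEPTH)  # assumed ~800 bp per Sanger read
ngs_reads = reads_needed(GENOME_BP, 150, DEPTH)         # assumed 150 bp per NGS read

print(f"Sanger reactions: {sanger_reactions:,}")  # 112,500,000
print(f"NGS reads:        {ngs_reads:,}")         # 600,000,000
```

Both numbers are large. The difference is that every Sanger read is its own reaction, whereas the NGS reads are produced in parallel within a small number of runs.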

Core Principles of Illumina NGS

The Illumina sequencing platform relies on sequencing by synthesis (SBS). The process consists of four key steps:

1. Library Construction

Objective: Prepare DNA for sequencing by breaking it into small fragments and adding adapters for compatibility with the sequencing platform.
Steps:
1. Fragmentation: Shear DNA into 200–300 bp fragments using sonication or enzymes.
2. End Repair: Blunt and 5'-phosphorylate the fragment ends (typically followed by adding a single A overhang) so that adapters can be ligated.
3. Adapter Ligation: Add adapter sequences to each fragment. Adapters include:
  - Sequencing primer binding sites.
  - Indices (barcodes) for sample identification in multiplex sequencing.
  - Flow cell binding sequences, complementary to the oligonucleotides on the flow cell, for attachment during cluster generation.
4. Size Selection: Select and purify fragments of the desired size using beads or gel electrophoresis.
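
The role of the index sequences can be illustrated with a toy demultiplexing step: after sequencing, each read is assigned to a sample by comparing its index to a sample sheet. This is a minimal sketch, not how any particular Illumina tool implements it. The sample sheet, the index sequences, and the one-mismatch tolerance are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample sheet: index sequence -> sample name.
SAMPLE_SHEET = {"ACGTAC": "patient_A", "TGCATG": "patient_B"}

def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def demultiplex(reads, sample_sheet, max_mismatches=1):
    """Group (index, insert) pairs by sample.

    A read is assigned only if exactly one index in the sample sheet lies
    within max_mismatches of the observed index; otherwise it goes to
    the "undetermined" bin.
    """
    bins = defaultdict(list)
    for index_seq, insert_seq in reads:
        hits = [
            sample
            for idx, sample in sample_sheet.items()
            if len(idx) == len(index_seq) and hamming(idx, index_seq) <= max_mismatches
        ]
        sample = hits[0] if len(hits) == 1 else "undetermined"
        bins[sample].append(insert_seq)
    return dict(bins)

reads = [
    ("ACGTAC", "GGATTC"),  # exact match to patient_A
    ("TGCATG", "CCGTAA"),  # exact match to patient_B
    ("ACGTAA", "TTGACC"),  # one mismatch from patient_A's index
    ("GGGGGG", "AAACCC"),  # close to no index
]
result = demultiplex(reads, SAMPLE_SHEET)
print(result)
```

Here the third read is still assigned to `patient_A` despite a sequencing error in its index, while the fourth is left undetermined.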

2. Cluster Generation

Objective: Amplify DNA fragments to detectable levels by creating clusters of identical sequences on a solid surface (flow cell).
Steps:
1. Hybridization: Single-stranded DNA library fragments bind to complementary oligonucleotides on the flow cell.
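2. Bridge Amplification (bridge PCR): Each bound strand bends over to hybridize with a neighboring oligonucleotide on the flow cell, forming a bridge, and is copied by a polymerase. Repeated cycles on the flow cell surface build a cluster of identical copies around each original fragment.
3. Denaturation: Double-stranded bridges are denatured, leaving single-stranded DNA templates for sequencing. In the standard workflow the reverse strands are then cleaved and washed away, so each cluster is read in one orientation.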

3. Sequencing by Synthesis (SBS)

Objective: Determine the nucleotide sequence of each DNA fragment by synthesizing complementary strands base-by-base.
Steps:
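1. Nucleotide Incorporation: All four fluorescently labeled, reversible terminator nucleotides (A, T, G, and C) are supplied together. The DNA polymerase incorporates the one complementary to the template, and the terminator group blocks further extension, so exactly one base is added per cycle.
2. Imaging: After each incorporation, the flow cell is imaged with a high-resolution camera; the fluorescence emitted by each cluster identifies the base just added.
3. Cleavage: The fluorescent tag and terminator group are removed, enabling the addition of the next nucleotide.
4. Iteration: The cycle is repeated once per base, so the number of cycles sets the read length (typically 50–300 bp per read).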
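
The incorporate, image, cleave loop can be sketched as a short simulation. This is a conceptual toy under simplifying assumptions: a single error-free template, strand orientation ignored, and a hypothetical function name `sequence_by_synthesis`. Real instruments image millions of clusters per cycle.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def sequence_by_synthesis(template: str, cycles: int) -> str:
    """Simulate SBS on one template, producing one base call per cycle.

    The template is given in the order the polymerase traverses it, so the
    returned read is the base-by-base complement of the template.
    """
    read = []
    for cycle in range(min(cycles, len(template))):
        # 1. Incorporation: only the complementary terminator nucleotide is
        #    added, and its blocking group stops extension after one base.
        incorporated = COMPLEMENT[template[cycle]]
        # 2. Imaging: the dye on the incorporated base identifies it.
        observed = incorporated
        read.append(observed)
        # 3. Cleavage: dye and terminator are removed, so the next loop
        #    iteration can incorporate the following base.
    return "".join(read)

read = sequence_by_synthesis("TACGGA", cycles=4)
print(read)  # ATGC
```

Running four cycles on a six-base template yields a four-base read, which is the sense in which the cycle count sets the read length.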

4. Data Analysis

Objective: Convert raw sequencing signals into biologically meaningful data.
Steps:
1. Base Calling: Fluorescent signals are translated into nucleotide sequences, each base with a quality score.
2. Alignment: Reads are aligned to a reference genome so that differences from the reference can be detected.
3. Quantification: For RNA-seq, quantify gene expression levels from the number of reads per gene.
4. Variant Calling and Annotation: Identify and classify variants (e.g., SNPs, insertions, deletions).
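
The first, second, and fourth steps can be sketched end to end on toy data. This is a minimal illustration, not a production pipeline: the channel order, the intensity values, and the helper names are assumptions, and the aligner is a brute-force ungapped search rather than the indexed algorithms real aligners use. The Phred formula, Q = -10 log10(P_error), is the standard definition of a base quality score.

```python
import math

CHANNELS = ("A", "C", "G", "T")  # assumed channel order for this toy example

def call_bases(intensity_cycles):
    """Toy base caller: in each cycle, call the channel with the highest intensity."""
    calls = []
    for cycle in intensity_cycles:
        brightest = max(range(len(CHANNELS)), key=lambda i: cycle[i])
        calls.append(CHANNELS[brightest])
    return "".join(calls)

def phred_quality(p_error: float) -> float:
    """Phred score for a given probability that the base call is wrong."""
    return -10 * math.log10(p_error)

def best_ungapped_alignment(read: str, reference: str):
    """Return (offset, mismatches) for the placement with the fewest mismatches.

    Each mismatch is (reference_position, reference_base, read_base).
    """
    best = None
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = [
            (offset + i, ref_base, read_base)
            for i, (ref_base, read_base) in enumerate(zip(window, read))
            if ref_base != read_base
        ]
        if best is None or len(mismatches) < len(best[1]):
            best = (offset, mismatches)
    return best

# One tuple of (A, C, G, T) intensities per sequencing cycle.
intensities = [
    (0.1, 0.0, 0.2, 0.9),
    (0.8, 0.1, 0.1, 0.0),
    (0.1, 0.2, 0.7, 0.1),
    (0.9, 0.1, 0.0, 0.1),
]
read = call_bases(intensities)
offset, variants = best_ungapped_alignment(read, "GATTACAGGC")
print(read, offset, variants)  # TAGA 3 [(5, 'C', 'G')]
```

The read aligns at reference position 3 with a single C-to-G difference at position 5, which a variant caller would report as a candidate SNP. An error probability of 0.001 corresponds to a Phred score of 30.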

Applications of NGS

1. Whole Exome Sequencing (WES)

Objective: Sequence all protein-coding regions (exons) in the genome (~1–2% of the genome).
Steps:
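1. Library Preparation: Fragment genomic DNA and ligate adapters, as in standard library construction.
2. Target Enrichment: Hybridize the library to probes complementary to exon sequences and capture the bound fragments, discarding the rest.
3. NGS Sequencing: Sequence the captured regions.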
4. Analysis:
  - Align reads to a reference genome.
  - Identify variants (e.g., pathogenic mutations in disease-related genes).
Advantages:
- Focuses on regions where most known disease-causing mutations occur.
- Faster and cheaper than whole-genome sequencing, because far less DNA is sequenced per sample.
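
One common check on how well target enrichment worked is the fraction of aligned reads that fall on the targeted exons. The sketch below computes it for reads on a single chromosome. The coordinates are invented and the helper names are hypothetical; real pipelines work from BAM and BED files rather than Python tuples.

```python
def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end

def on_target_fraction(read_intervals, target_intervals) -> float:
    """Fraction of reads overlapping at least one target interval."""
    if not read_intervals:
        return 0.0
    on_target = sum(
        any(overlaps(r_start, r_end, t_start, t_end) for t_start, t_end in target_intervals)
        for r_start, r_end in read_intervals
    )
    return on_target / len(read_intervals)

exons = [(100, 200), (500, 650)]  # hypothetical capture targets
reads = [(90, 140), (150, 200), (300, 350), (640, 690), (700, 750)]

fraction = on_target_fraction(reads, exons)
print(f"On-target fraction: {fraction:.2f}")  # 0.60
```

Three of the five reads overlap an exon, so 60% of this toy dataset is on target. The linear scan over targets is fine for a sketch but would be replaced by an interval index at exome scale.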

2. RNA-seq

Objective: Analyze the transcriptome to measure gene expression levels and identify alternative splicing or fusion transcripts.
Steps:
1. RNA Extraction: Extract total RNA or mRNA.
2. cDNA Synthesis: Convert RNA into complementary DNA (cDNA) using reverse transcription.
3. Library Construction: Prepare cDNA for sequencing by adding adapters.
4. Sequencing: Perform NGS using SBS.
5. Analysis:
  - Map reads to the reference transcriptome.
  - Quantify expression levels based on read counts.
  - Identify differentially expressed genes.
Applications:
- Compare gene expression between conditions (e.g., healthy vs. diseased tissue).
- Discover novel transcripts and isoforms.
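
Quantification from read counts can be sketched with transcripts per million (TPM), a widely used normalization that corrects for gene length and sequencing depth, followed by a log2 fold change between two conditions. The gene names, counts, and lengths are invented, and the pseudocount of 1 is an assumption. A real differential expression analysis would use replicates and a statistical model rather than a single ratio.

```python
import math

def tpm(counts: dict, lengths_bp: dict) -> dict:
    """Transcripts per million: length-normalized counts scaled to sum to 1e6."""
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total = sum(rates.values())
    return {gene: rate / total * 1_000_000 for gene, rate in rates.items()}

def log2_fold_change(baseline: float, treated: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of treated to baseline; the pseudocount avoids division by zero."""
    return math.log2((treated + pseudocount) / (baseline + pseudocount))

lengths = {"geneA": 1000, "geneB": 2000, "geneC": 500}  # hypothetical gene lengths
healthy = {"geneA": 100, "geneB": 200, "geneC": 50}     # hypothetical read counts
diseased = {"geneA": 300, "geneB": 200, "geneC": 50}

tpm_healthy = tpm(healthy, lengths)
tpm_diseased = tpm(diseased, lengths)

for gene in lengths:
    lfc = log2_fold_change(tpm_healthy[gene], tpm_diseased[gene])
    print(f"{gene}: {tpm_healthy[gene]:.0f} -> {tpm_diseased[gene]:.0f} TPM, log2FC = {lfc:+.2f}")
```

In the healthy sample all three genes have the same TPM even though their raw counts differ, because the counts are proportional to gene length. Note that `geneB` and `geneC` drop in TPM in the diseased sample although their raw counts are unchanged, a reminder that TPM is a relative measure.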

Summary

NGS technologies like Illumina sequencing have transformed genomics by offering far greater throughput and scale, at a lower cost per base, than traditional Sanger sequencing, which remains the gold standard for confirming individual variants. Core applications include:

- WES: Efficiently identifies disease-causing mutations in exons.
- RNA-seq: Captures comprehensive transcriptome data for expression analysis.

Because of its scalability and cost-efficiency, NGS is increasingly used in research and clinical diagnostics.
