Microbial Genomics

Introduction to Microbiology BIOL-2026 E

  • Instructor: Professor Omri

  • Exclusive copyright for enrolled students for Fall term 2025 at Laurentian University.

  • Unauthorized use of lectures and materials is prohibited.

Course Overview

  • Course: BIOL-2026 E: Introduction to Microbiology

  • Focus: Microbial Genomics

  • Date: November 24, 2025

Genomics

Definition of Genome

  • The genome refers to the entire complement of genetic information, which includes:

    • Genes: sequences that encode proteins or functional RNA.

    • Regulatory Sequences: regions controlling the expression of genes, such as promoters and enhancers.

    • Non-Coding Sequences: parts of DNA that do not code for proteins but may have structural or regulatory functions.

Definition of Genomics

  • Genomics is the discipline concerned with:

    • Mapping: locating genes and important regions within the genome.

    • Sequencing: determining the precise order of DNA bases (A, T, C, G).

    • Analyzing and Comparing: interpreting genetic data for studying similarities, differences, and evolutionary relationships.
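
As a small illustration of the comparison step, the sketch below computes percent identity between two sequences. It assumes the sequences are already aligned and of equal length; the function name and the example sequences are hypothetical, and real comparisons would normally start from an alignment tool.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions carrying the same base in two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Count positions where both sequences have the same base (case-insensitive).
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Two made-up 8-base sequences that differ at 2 positions.
print(percent_identity("ATGCCGTA", "ATGACGTT"))  # 75.0
```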

Functional Genomics

  • Functional genomics aims to assign functions to unknown genes by analyzing the biochemical and physiological effects of various mutants.

  • The field necessitates:

    • Improved DNA sequencing techniques

    • Storage formats suited to large datasets

    • Tools for the analysis of vast datasets

DNA Sequencing Techniques

Sanger/Dideoxy DNA Sequencing

Process
  1. Cloning: The DNA fragment of interest is cloned.

  2. Synthesis: DNA is synthesized in the presence of chain-terminating nucleotides (ddNTPs), which stop extension wherever they are incorporated.

  3. Electrophoresis: Fragments are separated by size to determine the sequence.

Improvements
  • Fluorescent dyes now replace radioactive labels, enhancing safety and speed.

  • Primer walking extends sequencing to longer DNA regions: each new primer is designed from the end of the sequence just read.

Output
  • The result is a DNA sequencing chromatogram:

    • Each peak represents a nucleotide.

    • Color coding:

      • Red peaks = Thymine (T)

      • Green peaks = Adenine (A)

      • Blue peaks = Cytosine (C)

      • Black peaks = Guanine (G)
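
The color code can be read as a simple lookup table. The sketch below is a toy base caller that assumes the peaks have already been detected and reduced to an ordered list of color names. The names `PEAK_COLOR_TO_BASE` and `call_bases` are hypothetical, and real base-calling software works from signal intensities rather than color labels.

```python
# Color-to-base mapping taken from the color coding listed above.
PEAK_COLOR_TO_BASE = {"red": "T", "green": "A", "blue": "C", "black": "G"}


def call_bases(peak_colors):
    """Translate an ordered list of peak colors into a DNA sequence."""
    return "".join(PEAK_COLOR_TO_BASE[color.lower()] for color in peak_colors)


print(call_bases(["green", "red", "black", "blue"]))  # ATGC
```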

Reaction Components

  • Starting Materials Needed:

    • Template DNA strand

    • DNA primer complementary to the sequence

    • Regular DNA nucleotides (deoxyribonucleotide triphosphates, dNTPs)

    • Dideoxynucleotides (ddNTPs)

    • DNA polymerase enzyme

Chain Termination Mechanism
  • DNA polymerase extends the primer using regular nucleotides.

  • When a ddNTP is incorporated, elongation stops because the ddNTP lacks the 3'-OH group needed to form the next phosphodiester bond.

  • This results in fragments of varying lengths, each ending in a specific ddNTP.
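
To make the idea of "fragments of varying lengths" concrete, here is a minimal sketch that enumerates every product chain termination could yield for a short template. It assumes the template is written 3'→5' starting just after the primer, and it counts only the nucleotides added to the primer. The function name and the example template are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def termination_fragments(template_3to5: str):
    """Return (length, terminal base) for each possible terminated product.

    The new strand is built 5'->3' as the base-by-base complement of the
    template read 3'->5'. A ddNTP can be incorporated at any position,
    so every prefix of the new strand is a possible fragment.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5.upper())
    return [(i + 1, base) for i, base in enumerate(new_strand)]


print(termination_fragments("TACG"))
# [(1, 'A'), (2, 'T'), (3, 'G'), (4, 'C')]
```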

Gel Electrophoresis

  • Fragments are separated on a polyacrylamide gel based on size:

    • Shorter fragments migrate faster, longer fragments move slower.

    • Each band marks a position where chain termination occurred. Reading the bands from the bottom (shortest) to the top (longest) gives the newly synthesized strand in the 5'→3' direction.
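
Reading a gel amounts to sorting fragments by length and noting the terminal base of each. The sketch below assumes each band has already been reduced to a (fragment length, terminal base) pair with one band per length; `read_gel` and the example bands are hypothetical.

```python
def read_gel(bands):
    """Reconstruct the synthesized strand (5'->3') from gel bands.

    bands: iterable of (fragment_length, terminal_base) pairs in any order.
    The shortest fragment travels farthest, so sorting by length
    corresponds to reading the gel from the bottom up.
    """
    ordered = sorted(bands, key=lambda band: band[0])
    return "".join(base for _, base in ordered)


# Bands listed in arbitrary order, as they might be recorded lane by lane.
bands = [(3, "G"), (1, "A"), (4, "C"), (2, "T")]
print(read_gel(bands))  # ATGC
```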

Modern Advances in Sequencing
  • Automated methods utilize:

    • Fluorescently labeled ddNTPs (varying colors)

    • One-tube reactions

    • Automated detection and computational analysis of results, leading to rapid sequencing.

  • Sanger sequencing remains significant for small-scale projects due to its accuracy.

Next-Generation Sequencing (NGS)

Definition

  • NGS refers to sequencing technologies that enable rapid, large-scale sequencing of DNA or RNA.

Key Concepts

  1. High Throughput:

    • Ability to process millions to billions of DNA molecules simultaneously.

    • Significantly reduces time and costs in sequencing.

  2. Massively Parallel Techniques:

    • Multiple DNA or RNA fragments sequenced simultaneously, increasing efficiency and speed.

NGS Platforms/Approaches

  1. Pyrosequencing:

    • Detects nucleotide incorporation in real-time by measuring light emitted when pyrophosphate is released.

    • Mechanism: When DNA polymerase incorporates a nucleotide, pyrophosphate (PPi) is released. ATP sulfurylase converts the PPi to ATP, and the ATP drives a luciferase reaction that produces measurable light.

    • Applications: Short reads and region-specific sequencing.

  2. Semiconductor Sequencing (Ion Torrent):

    • Utilizes semiconductor technology to detect hydrogen ions released during nucleotide incorporation.

    • Mechanism: The release of hydrogen ions alters pH, which is detected by a semiconductor sensor.

    • Advantages: Cost-effective, real-time data acquisition.

    • Applications: Targeted gene sequencing and microbial genomics.

  3. Sequencing by Synthesis (SBS; Illumina):

    • Widely used NGS method.

    • Mechanism: Fluorescently labeled nucleotides are added, with detection through laser excitation after each incorporation.

    • Advantages: High accuracy and very high throughput; individual reads are short (150-300 bp).

    • Applications: Suitable for whole-genome and transcriptomic sequencing.
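
Pyrosequencing and semiconductor sequencing both add one nucleotide type at a time and record a signal (light or pH change) whose size grows with the number of identical bases incorporated in that flow. The sketch below simulates such a flow record for a known strand. The flow order "TACG", the function names, and the `max_flows` cap are assumptions made for illustration, and the signal is idealized as an exact count with no noise. Illumina SBS works differently (one reversibly terminated base per cycle) and is not modeled here.

```python
def flowgram(new_strand: str, flow_order: str = "TACG", max_flows: int = 100):
    """Simulate idealized flow signals for the strand being synthesized (5'->3').

    Each flow offers one nucleotide type. The signal equals the number of
    consecutive bases of that type incorporated (0 if the next base differs).
    """
    signals = []
    pos = 0
    flow = 0
    while pos < len(new_strand) and flow < max_flows:
        nucleotide = flow_order[flow % len(flow_order)]
        run = 0
        # Incorporate every consecutive matching base in this single flow.
        while pos < len(new_strand) and new_strand[pos] == nucleotide:
            run += 1
            pos += 1
        signals.append((nucleotide, run))
        flow += 1
    return signals


def decode(signals):
    """Turn flow signals back into a sequence."""
    return "".join(nucleotide * run for nucleotide, run in signals)


flows = flowgram("TTAGGGC")
print(flows)
# [('T', 2), ('A', 1), ('C', 0), ('G', 3), ('T', 0), ('A', 0), ('C', 1)]
print(decode(flows))  # TTAGGGC
```

The run of three G bases produces a single signal of size 3, which is why long homopolymer runs are a common source of error on these platforms: the instrument must tell a signal of, say, 7 from 8.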

Comparative Analysis of Sequencing Methods

| Feature | Sanger Sequencing | Illumina (NGS) | Ion Torrent | PacBio |
|---|---|---|---|---|
| Basic Method | Chain termination with ddNTPs | Sequencing by synthesis with fluorescent nucleotides | Semiconductor detection of H+ release | Single-molecule real-time sequencing |
| Read Length | 500-1000 bp | 150-300 bp | 200-400 bp | >10,000 bp (up to 100,000 bp) |
| Accuracy | >99.99% | 99.9% | 98-99% | 87-92% single pass; >99% with multiple passes |
| Throughput | Low (1 sample at a time) | Very High (billions of reads) | High (millions of reads) | Medium (hundreds of thousands) |
| Time per Run | 2-3 hours | 1-3 days | 2-4 hours | 0.5-4 hours |
| Cost per Gb | $500-1000 | $5-10 | $10-20 | $50-100 |
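
The accuracy and read-length figures in the comparison can be combined into a rough "expected errors per read" estimate: read length times the per-base error rate. The sketch below assumes errors are independent and uniformly spread along the read, which is a simplification; the function name and the chosen table values are illustrative.

```python
def expected_errors(read_length_bp: int, accuracy_percent: float) -> float:
    """Expected number of miscalled bases in one read.

    Assumes each base is wrong independently with probability
    (1 - accuracy), which real platforms only approximate.
    """
    return read_length_bp * (1.0 - accuracy_percent / 100.0)


# Values picked from the ranges in the comparison above.
examples = {
    "Sanger (1000 bp, 99.99%)": (1000, 99.99),
    "Illumina (150 bp, 99.9%)": (150, 99.9),
    "PacBio single pass (10,000 bp, 90%)": (10_000, 90.0),
}
for label, (length, accuracy) in examples.items():
    print(f"{label}: {expected_errors(length, accuracy):.2f} expected errors per read")
```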

Considerations for Method Choice

  1. Project scale and scope.

  2. Budget constraints.

  3. Required accuracy levels.

  4. Time requirements for completion.

  5. Available computational resources.
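
Scale and budget can be estimated before choosing a method. A common back-of-the-envelope relation is coverage = (number of reads × read length) / genome size. The sketch below applies it using the Illumina figures from the comparison (150 bp reads, $5-10 per Gb). The 4.6 Mb genome (roughly the size of an E. coli K-12 genome) and the 30x coverage target are assumptions chosen for illustration, and the cost covers sequencing output only, not library preparation or labor.

```python
import math


def reads_needed(genome_size_bp: int, read_length_bp: int, coverage: float) -> int:
    """Number of reads required to reach the target average coverage."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)


def cost_range(genome_size_bp: int, coverage: float,
               cost_per_gb_low: float, cost_per_gb_high: float):
    """Low and high cost estimates for the sequenced bases, in dollars."""
    gigabases = coverage * genome_size_bp / 1e9
    return gigabases * cost_per_gb_low, gigabases * cost_per_gb_high


genome = 4_600_000   # assumed genome size in bp
target = 30          # assumed target coverage (30x)

print(reads_needed(genome, 150, target))  # 920000
low, high = cost_range(genome, target, 5, 10)
print(f"${low:.2f} to ${high:.2f}")
```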

Bioinformatics

Definition

  • Bioinformatics bridges biology and technology, utilizing computers and software tools to analyze, assemble, and store biological data, including DNA and protein sequences.

Key Concepts

  • Genome Annotation: Identifying important genomic features:

    • Includes recognizing Open Reading Frames (ORFs): stretches of DNA that run from a start codon to an in-frame stop codon and may therefore encode a protein.

  • While sequencing generates raw data, functional genomics seeks to elucidate gene functions.
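
A toy ORF finder shows what this part of annotation involves. The sketch below scans the three forward reading frames for an ATG followed by an in-frame stop codon. It is deliberately simplified: it ignores the reverse strand and alternative start codons, and the `min_codons` threshold and example sequence are hypothetical. Real annotation tools use far more evidence than this.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3):
    """Find ORFs on the forward strand.

    Returns (start, end, sequence) tuples with 0-based start and
    exclusive end; the sequence includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return orfs


print(find_orfs("CCATGAAATGGTAACC"))
# [(2, 14, 'ATGAAATGGTAA')]
```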

Related Fields

Transcriptomics

  • Study of the transcriptome: the complete set of RNA transcripts in a cell, most often studied as the mRNA transcribed from DNA.

  • A complementary DNA (cDNA) library is generated by converting mRNA to cDNA using reverse transcriptase.

  • Advances in sequencing methods have greatly accelerated transcriptomics.
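
At the sequence level, reverse transcription produces a DNA strand complementary to the mRNA. The sketch below derives the first-strand cDNA, written 5'→3', as the reverse complement of the mRNA with T in place of U. The function name and the example mRNA are hypothetical, and the enzyme's need for a primer is not modeled.

```python
# Base pairing between an RNA template and the DNA strand copied from it.
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """Return the first-strand cDNA (5'->3') for an mRNA written 5'->3'."""
    # The strands are antiparallel, so walk the mRNA from its 3' end.
    return "".join(RNA_TO_CDNA[base] for base in reversed(mrna.upper()))


print(first_strand_cdna("AUGGCC"))  # GGCCAT
```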

Proteomics

  • Study of the proteome, or complete set of proteins expressed by a cell.

  • Proteins are harder to sequence than DNA because they cannot be amplified the way DNA can, so proteomics relies on specialized analytical tools such as mass spectrometry.

Metagenomics

Definition
  • Metagenomics is the study of genetic material recovered directly from environmental samples (e.g., soil, water), without first culturing the organisms.

Steps in Metagenomics
  1. Extract DNA from environmental samples.

  2. Sequence using NGS techniques.

  3. Analyze for:

    • Discovery of new genes.

    • Understanding microbial community metabolic potentials.
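
One simple signal used during the analysis step is sequence composition: contigs from the same organism tend to have similar GC content, so composition can help group (bin) sequences from a mixed sample. The sketch below is a deliberately crude version that assumes a single fixed threshold. The contig names, sequences, and the 0.5 cutoff are made up for illustration, and real binning tools combine richer features such as k-mer frequencies and read depth.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def bin_by_gc(contigs: dict, threshold: float = 0.5) -> dict:
    """Split contigs into low-GC and high-GC groups around a threshold."""
    bins = {"low_gc": [], "high_gc": []}
    for name, seq in contigs.items():
        key = "high_gc" if gc_content(seq) >= threshold else "low_gc"
        bins[key].append(name)
    return bins


# Toy contigs standing in for assembled environmental sequences.
contigs = {
    "contig_1": "ATATTTAAGC",
    "contig_2": "GCGCCGGATC",
    "contig_3": "ATGCATAT",
}
print(bin_by_gc(contigs))
# {'low_gc': ['contig_1', 'contig_3'], 'high_gc': ['contig_2']}
```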

Impact
  • Metagenomics has significantly changed our understanding of microbial life by revealing organisms that had gone undetected because they cannot be grown in culture.

  • Potential applications in biotechnology and nanotechnology.

Applications of Genomics

  1. Medicine: Developing personalized treatments, oncology genomics, and identifying disease-related genes.

  2. Agriculture: Enhancing crop yield and resistance via genetic engineering.

  3. Environmental Science: Investigating microbial ecosystems and advancing bioremediation strategies.

  4. Biotechnology: Utilizing genes identified through metagenomics in industry-related applications.

Conclusion

  • Genomics integrates biological understanding with technological advances to explore the complexity of life, driving progress in evolutionary theory, medical science, and molecular biology.

  • Continued advancements in sequencing technologies, bioinformatics, and metagenomics broaden scientific frontiers, promoting innovations across diverse fields.