Genomics and Transcriptomics – Comprehensive Study Notes

Learning Outcomes

• Define and contrast genetics, genomics and transcriptomics.
• Describe the three sub-fields of genomics: Functional, Structural, Comparative.
• Explain the key technologies (e.g. microarrays, SAGE, RNA-Seq, ChIP-Seq, X-ray crystallography) that enable these fields.
• Illustrate how these tools illuminate gene/protein function, disease mechanisms, drug discovery and personalised medicine.

Core Definitions & Conceptual Distinctions

Genetics – traditionally focuses on single genes and their individual roles in heredity.

Genomics – coined by Tom Roderick (Jackson Laboratory, 1986); the holistic study of the entire genome (DNA sequence, organisation, expression and inheritance) at DNA, mRNA, protein, cellular and tissue levels.

Transcriptomics – examination of the complete set of RNA transcripts (the transcriptome) produced by a cell/tissue under specific conditions, often called expression profiling.

Genomics: Scope & Rationale

• Central goal: determine the full sequence of genomic molecules and link sequence to biological function.
• A mere raw sequence is insufficient – annotation (gene finding, regulatory elements) requires sophisticated bioinformatics.
• Genomic knowledge underpins evolutionary studies, disease gene discovery, therapeutic target identification and synthetic biology.

Sub-fields of Genomics

Functional Genomics

Goal: Clarify the function and regulation of DNA, RNA and proteins at a system-wide scale.

Key attributes:
• Relies on high-throughput assays to quantify gene/RNA/protein abundance and interactions across many conditions.
• Generates dynamic datasets capturing regulation in response to stimuli, development or environment.

Principal Technologies
  1. Microarrays – glass chips bearing 10^3–10^6 synthetic oligonucleotide probes.
    • Sample DNA/mRNA is fragmented, labelled and hybridised; fluorescence intensity ∝ expression level.
    • Higher signal intensity than in a control sample ⇒ up-regulated gene.
    • Types
    – Expression-profiling microarrays (≤50-bp probes from cDNA/exons).
    – Tiling microarrays (overlapping ≤50-bp probes across Mbp regions; map transcription-factor binding or epigenetic marks).

  2. Serial Analysis of Gene Expression (SAGE)
    • Uses 10–17-bp sequence tags derived from poly-A mRNA.
    • Tags are concatenated and sequenced; counts give unbiased transcript abundance without prior gene knowledge.

  3. High-Throughput Sequencing (HTS)
    • RNA-Seq – isolates RNAs → cDNA → deep sequencing; quantifies, discovers and profiles transcripts.
    • ChIP-Seq – combines chromatin immunoprecipitation with sequencing to pinpoint protein–DNA interaction sites.
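
The tag-counting idea behind SAGE can be illustrated with a minimal Python sketch. It rests on assumptions not stated in these notes: the anchoring site is taken to be CATG (the NlaIII site commonly used in SAGE), the tag length is fixed at 10 bp (the lower end of the 10–17 bp range), and the transcript sequences are made up. Ditag formation, concatenation and sequencing errors are ignored.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # assumed anchoring-enzyme site (NlaIII)
TAG_LENGTH = 10       # assumed tag length; the notes give a 10-17 bp range


def extract_tag(transcript, tag_length=TAG_LENGTH):
    """Return the tag immediately 3' of the last anchor site, or None."""
    pos = transcript.rfind(ANCHOR_SITE)
    if pos == -1:
        return None  # no anchor site, so no tag can be produced
    start = pos + len(ANCHOR_SITE)
    tag = transcript[start:start + tag_length]
    # Discard tags that run off the end of the transcript
    return tag if len(tag) == tag_length else None


def count_tags(transcripts):
    """Count how often each tag occurs across a pool of transcripts."""
    tags = (extract_tag(t) for t in transcripts)
    return Counter(t for t in tags if t is not None)


# Toy, made-up transcript sequences (illustrative only)
transcripts = [
    "GGCATGAAACCCGGGTTTAA",
    "TTCATGAAACCCGGGTTTAA",
    "AACATGTTTTGGGGCCCCAA",
    "ACGTACGT",  # lacks the anchor site
]
tag_counts = count_tags(transcripts)
for tag, n in tag_counts.most_common():
    print(tag, n)
```

The count for each tag stands in for the abundance of the transcript it came from, which is why no probe set or prior gene list is needed.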

Structural Genomics

Objective: Determine 3-D structures of all protein isoforms encoded by a genome.

Rationale & Significance:
• Structure provides direct clues to protein function, catalytic sites, interaction surfaces and evolutionary relationships.
• Informs drug design, protein engineering, and understanding of folding dynamics/misfolding diseases.

Structural Determination Strategies
  1. De novo (Experimental) – Clone every ORF, express, purify, crystallise; solve structure via X-ray crystallography or Nuclear Magnetic Resonance (NMR).

  2. Ab initio Modelling – Predict 3-D fold solely from amino-acid sequence and physicochemical principles.

  3. Sequence-based (Homology) Modelling – Align unknown protein to known structures; build model based on homologues.

  4. Threading (Fold Recognition) – Fit sequence onto all known structural folds to find best structural template.
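
Template choice in homology modelling is often guided by sequence identity between the target and candidate structures. The sketch below is a simplified illustration under stated assumptions: the sequences are already aligned (gaps written as "-"), the template names and sequences are hypothetical, and identity is the only criterion. Real pipelines also weigh coverage, structure resolution and alignment quality.

```python
def percent_identity(aligned_query, aligned_template):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for q, t in zip(aligned_query, aligned_template):
        if q == "-" or t == "-":
            continue  # skip gapped columns
        compared += 1
        if q == t:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def best_template(alignments):
    """Pick the template id with the highest percent identity.

    alignments maps template id -> (aligned_query, aligned_template).
    """
    return max(alignments, key=lambda tid: percent_identity(*alignments[tid]))


# Hypothetical pre-aligned target/template pairs (made-up sequences)
alignments = {
    "template_A": ("MKTAYIAKQR", "MKTAYLAKQR"),
    "template_B": ("MKTAYIAKQR", "MRSGY-AHNR"),
}
for tid, pair in alignments.items():
    print(tid, round(percent_identity(*pair), 1))
print("chosen:", best_template(alignments))
```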

Comparative Genomics

Approach: Computer-aided comparison of complete genomes across species/individuals.

Applications:
• Identify conserved vs. divergent regions to infer essential functional elements and evolutionary pressures.
• Compare genome size, gene count and chromosome number across species.
• Homology analysis aligns orthologous sequences (e.g., human gene vs. mouse) to map conserved motifs.
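
Pairwise alignment of orthologous sequences can be sketched with the classic Needleman–Wunsch global alignment. This is a teaching-sized version, not what genome-scale tools use: the scoring values (match +1, mismatch -1, gap -2) are arbitrary assumptions, and the "human" and "mouse" sequences are made up.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


# Made-up toy sequences standing in for a human gene and its mouse orthologue
human_seq = "ATGGCCAAGT"
mouse_seq = "ATGGCAAGT"
nw_score, aligned_human, aligned_mouse = needleman_wunsch(human_seq, mouse_seq)
print(nw_score)
print(aligned_human)
print(aligned_mouse)
```

Columns where the two aligned strings agree mark candidate conserved positions; runs of such columns are where a conserved motif would be sought.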

Medical relevance:
Genomic medicine / personalised medicine – Integrates next-generation sequencing (NGS) data to tailor care to a patient’s genomic profile.
Cancer genomics – Detects tumour-specific mutations; guides design of mutation-targeted therapeutics.

Transcriptomics

What is the Transcriptome?

The full complement of RNAs (mRNA, rRNA, tRNA, long/short non-coding RNA) produced by a cell, tissue or organism at a specific stage or physiological state.

Aims of Transcriptomics

• Quantify gene expression patterns.
• Compare normal vs. diseased or treated vs. untreated conditions to uncover pathways of pathogenesis or drug response.
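
A common first step in comparing two conditions is the log2 fold change per gene. The sketch below is illustrative only: the gene names and expression values are made up, the pseudocount of 1 and the threshold of one log2 unit (a two-fold change) are arbitrary assumptions, and a real analysis would use replicates and a statistical test rather than a bare threshold.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of treated to control; the pseudocount avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def classify(expr_control, expr_treated, threshold=1.0):
    """Label each gene 'up', 'down' or 'unchanged' by its log2 fold change."""
    calls = {}
    for gene in expr_control:
        lfc = log2_fold_change(expr_treated[gene], expr_control[gene])
        if lfc >= threshold:
            calls[gene] = "up"
        elif lfc <= -threshold:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls


# Made-up expression values for three hypothetical genes
expr_control = {"geneA": 7, "geneB": 31, "geneC": 10}
expr_treated = {"geneA": 31, "geneB": 7, "geneC": 11}
calls = classify(expr_control, expr_treated)
print(calls)
```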

Experimental Platforms

  1. Hybridisation-based (Microarrays)
    – Fluorescently labelled cDNA hybridised to custom or commercial oligo arrays.

  2. Sequencing-based (RNA-Seq & derivatives)
    – Sanger sequencing of cDNA libraries (historic), SAGE, Massively Parallel Signature Sequencing (MPSS), modern NGS RNA-Seq.
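
Raw RNA-Seq read counts depend on both transcript length and sequencing depth, so they are usually normalised before comparison. One widely used unit is transcripts per million (TPM). The sketch below assumes made-up read counts and transcript lengths for two hypothetical genes; it is an illustration of the arithmetic, not a replacement for a quantification tool.

```python
def tpm(read_counts, lengths_kb):
    """Transcripts per million from read counts and transcript lengths (kb).

    Step 1: divide each count by transcript length (reads per kilobase).
    Step 2: scale the rates so they sum to one million.
    """
    rates = {gene: read_counts[gene] / lengths_kb[gene] for gene in read_counts}
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in rates}
    return {gene: 1e6 * rate / total for gene, rate in rates.items()}


# Made-up values: geneB has three times the reads but is three times as long
read_counts = {"geneA": 100, "geneB": 300}
lengths_kb = {"geneA": 1.0, "geneB": 3.0}
tpm_values = tpm(read_counts, lengths_kb)
print(tpm_values)
```

In this toy case both genes receive the same TPM, because the extra reads for the longer gene are explained by its length alone.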

Medical Applications

Diagnostics & Disease Profiling – RNA-Seq detects disease-associated SNPs, allele-specific expression, gene fusions.
Human & Pathogen Transcriptomes – Quantifies virulence-factor expression, predicts antibiotic resistance, maps host–pathogen interactions; drives infection-control strategies and individualised therapy design.

Integrated View: Technological Synergy

• Genomic sequencing informs probe design for microarrays and provides reference genomes for RNA-Seq alignment.
• Transcriptomic data feed back into functional genomics by revealing condition-specific expression modules.
• Structural data validate functional hypotheses and enable rational drug design against genomic & transcriptomic targets.
• Bioinformatics frameworks unify these datasets, enabling multi-omics insights.

Ethical, Philosophical & Practical Considerations

Data Privacy – Whole-genome and transcriptome profiles are personally identifying; secure storage and informed consent are critical.
Equitable Access – Personalised medicine could widen health-care disparities if expensive genomic tests/treatments are unequally distributed.
Evolutionary Insight vs. Species Difference – While conservation highlights essential biology, over-extrapolation across species can mislead; functional validation remains necessary.

Numerical & Statistical Highlights

• Microarray chips routinely interrogate 10^4–10^6 loci in parallel.
• SAGE tag length: 10–17 bp, typically long enough to identify the source transcript.
• RNA-Seq depth can exceed 10^8 reads/sample, enabling detection of rare transcripts (<1 copy per cell).
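
The tag-length figure can be put in context with a little arithmetic: a tag of length n over the four-letter DNA alphabet has 4^n possible sequences. The short sketch below only computes that count; whether a given tag is actually unique in a real transcriptome is an empirical question it does not answer.

```python
def tag_space(tag_length, alphabet_size=4):
    """Number of distinct sequences of a given length over the DNA alphabet."""
    return alphabet_size ** tag_length


for length in (10, 17):
    print(f"{length}-bp tags: {tag_space(length):,} possible sequences")
```

A 10-bp tag therefore has about 10^6 possible sequences and a 17-bp tag about 1.7 × 10^10, which is why longer tags map to their source transcripts less ambiguously.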

Lecture Recap & Key Take-Home Messages

  1. Genomics subdivides into Functional, Structural, Comparative branches, each with distinct goals and toolkits.

  2. Technologies such as microarrays, SAGE, RNA-Seq, ChIP-Seq are foundational for large-scale functional insight.

  3. Transcriptomics focuses on RNA dynamics, advancing diagnostics, pathogen biology and therapeutic target discovery.

  4. Cross-integration of genomic, transcriptomic and structural data propels personalised medicine and novel drug development.

  5. Bioinformatics is indispensable for transforming raw sequence data into actionable biological knowledge.