Week 11 A - Transcriptomics Notes

Molecular Genetics: Transcriptomics

Learning Outcomes

  • Understand basic principles of gene expression analysis.
  • Gain insight into techniques frequently used for transcriptome analysis: RT-PCR, RNAseq, microarray.
  • Explain the use of these techniques in medicine or research by describing examples.

Biological Systems Multi-Omics

  • Genomic level:
    • SNP (Single Nucleotide Polymorphism)
    • CNV (Copy Number Variation)
    • LOH (Loss of Heterozygosity)
    • Genomic rearrangement
    • Rare variant
  • Epigenome level:
    • DNA methylation
    • Histone modification
    • Chromatin accessibility
    • TF binding (Transcription Factor binding)
    • miRNA
  • Transcriptome level:
    • Gene expression
    • Alternative splicing
    • Long non-coding RNA
    • Small RNA
  • Proteome level:
    • Protein expression
    • Post-translational modification
    • Cytokine array
  • Metabolome level:
    • Metabolite profiling in serum, plasma, urine, CSF, etc.
  • Phenome level:
    • Cancer
    • Metabolic syndrome
    • Psychiatric disease

Transcriptome

  • Defined as the complete set of transcripts in a cell and their quantity for a specific developmental stage or physiological condition.
  • The full range of RNA molecules (messenger RNAs and a variety of non-coding RNAs) expressed by the organism, tissue, or cell type.
  • The transcriptome is dynamic: it changes with developmental stage and physiological condition.

Transcriptomics

  • The study of the transcriptome – the complete set of RNA transcripts produced by the genome under specific circumstances or in a specific cell – using high-throughput methods.
  • Comparison of transcriptomes allows the identification of genes that are differentially expressed in distinct cell populations or in response to different treatments.
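
A minimal sketch of such a comparison: compute a log2 fold change per gene between two conditions and keep the genes that change at least two-fold. The gene names, read counts, pseudocount, and cut-off are illustrative assumptions, not values from these notes; a real analysis would use replicates and a statistical test.

```python
import math

def log2_fold_changes(control, treated, pseudocount=1.0):
    """Return {gene: log2(treated / control)} for genes measured in both samples.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes with no reads in one condition.
    """
    return {
        gene: math.log2((treated[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in control
        if gene in treated
    }

def differentially_expressed(fold_changes, threshold=1.0):
    """Keep genes whose absolute log2 fold change is at least `threshold`."""
    return {gene: fc for gene, fc in fold_changes.items() if abs(fc) >= threshold}

# Hypothetical expression values for three genes in two conditions.
control = {"geneA": 99, "geneB": 49, "geneC": 199}
treated = {"geneA": 399, "geneB": 49, "geneC": 49}

fc = log2_fold_changes(control, treated)
hits = differentially_expressed(fc)
print(hits)  # geneA is up four-fold, geneC is down four-fold, geneB is unchanged
```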

Before Transcriptomics: Measuring Gene Expression

  • Northern blotting
  • EST sequencing (Expressed Sequence Tags): Sanger sequencing of random individual transcripts from cDNA libraries generated by reverse transcriptase.
  • RT-PCR/Q-PCR

Northern Blot

  1. RNA isolation.
  2. Separation of RNA by size on a denaturing gel (formaldehyde agarose; denaturation eliminates RNA secondary structures).
  3. Transfer of the size-separated RNA onto a membrane.
  4. Hybridization with a labeled DNA probe.

RT-qPCR

  • cDNA synthesis using:
    • Oligo(dT)s: 5'-TTTTTTTT-3'
    • Random Primers: 5'-NNNNNN-3'
    • Sequence-specific Primers: 5'-ACTTCGAAG-3'
      3'-UGAAGCUUC-5'
  • qPCR with sequence-specific primers.
  • Detection with fluorescently labeled, sequence-specific probes (e.g., TaqMan probes).
  • OR detection with fluorescent dyes that intercalate into the double-stranded DNA helix (the product of the PCR).
  • The more template is present for the PCR, the earlier the fluorescence is detectable.
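
The link between template amount and detection time is usually captured by the threshold cycle (Ct), the cycle at which fluorescence crosses a set threshold. The sketch below shows the commonly used 2^(-ΔΔCt) calculation for relative expression. It is not described in these notes; it assumes the product doubles every cycle, and the Ct values and the reference gene are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in a sample relative to a control.

    Each Ct is first normalized to a reference (housekeeping) gene, then the
    sample is compared with the control. Assumes 100% PCR efficiency.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)

# Hypothetical Ct values: the target crosses the threshold 3 cycles earlier
# in the sample than in the control, while the reference gene is unchanged.
fold = relative_expression(
    ct_target_sample=22.0, ct_ref_sample=18.0,
    ct_target_control=25.0, ct_ref_control=18.0,
)
print(fold)  # 3 cycles earlier corresponds to 2**3 times more template
```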

Transcriptomics Technologies

  • DNA Microarray
  • RNA-Seq

Transcriptomics Technologies Comparison

Technology | Advantages | Limitations
qPCR Array | Low-cost; simple | Only testing limited number of genes of interest in specific pathways
Microarray | Low-cost; ability to process a large number of samples; high-throughput | Low sensitivity for very lowly or very highly expressed genes; high background; difficult to detect novel transcripts
RNA-seq | High accuracy; high sensitivity and dynamic range; low background/noise signal; high-throughput; identify novel transcripts, splice junctions, SNPs, and non-coding RNAs | High-cost; high data storage
  • SNPs: single nucleotide polymorphisms.

Microarray

  • Uses chips that contain the microarray.
  • Microarrays comprise known DNA sequences (specific for the examined genome) spotted or synthesized on a small chip (the arrays of many tiny DNA oligonucleotide samples).
  • mRNA samples are reverse transcribed with labeled nucleotides.
  • Microarrays are based on competitive hybridization of the differently labeled cDNAs to the chip oligonucleotides.
  • Nucleic acid hybridization: using a known DNA fragment as a probe to find a complementary sequence.
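
As a rough in-silico analogy to hybridization, the sketch below searches a target strand for positions where a probe can base-pair perfectly, that is, where the target contains the probe's reverse complement. The target sequence is made up for illustration, and real hybridization also tolerates mismatches depending on stringency, which this sketch ignores.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return dna.translate(COMPLEMENT)[::-1]

def find_hybridization_sites(target, probe):
    """Return 0-based start positions on `target` (5'->3') where `probe`
    can pair perfectly, i.e. where the probe's reverse complement occurs."""
    site = reverse_complement(probe)
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions

# Hypothetical target containing two copies of ACTTCGAAG.
target = "GGATCCACTTCGAAGTTTACTTCGAAG"
probe = "CTTCGAAGT"  # reverse complement of ACTTCGAAG

sites = find_hybridization_sites(target, probe)
print(sites)
```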

Experiment Example: Microarray

  • Question: Can variation in gene expression, detected by microarrays, be used to predict the recurrence of breast cancer?
  • Methods:
    1. Microarray chip with DNA probes.
    2. Each spot consists of a different DNA probe fixed to a solid support, such as a nylon membrane or glass slide.
    3. Cancer and noncancer cells were removed from 78 women with breast cancer.
    4. Messenger RNA from the cells…
    5. …is converted into cDNA and labeled with red (cancer cells) or green (noncancer cells) fluorescent nucleotides.
    6. The cDNAs are mixed…
    7. …and hybridized to DNA probes on a chip.
    8. The chip is scanned spot by spot.
  • Results:
    • Yellow fluorescence (red + green) indicates equal expression of the gene in both types of cells; red indicates more expression in cancer cells; and green indicates more expression in noncancer cells.
    • Tumors above the solid yellow line came primarily from patients who remained cancer-free for at least 5 years.
    • Tumors below the solid yellow line came primarily from patients in whom the cancer spread within 5 years of diagnosis.
  • Conclusion: Seventy genes were identified whose expression patterns accurately predicted the recurrence of breast cancer within 5 years of treatment.
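
The color readout in the results can be sketched numerically as a log2 ratio of red to green intensity per spot. The intensities, gene names, and the two-fold threshold below are illustrative assumptions, not data from the study; the sketch also assumes background-corrected, positive intensities.

```python
import math

def classify_spot(red, green, threshold=1.0):
    """Classify one spot from its red (cancer) and green (noncancer) intensity.

    Returns "red" for higher expression in cancer cells, "green" for higher
    expression in noncancer cells, and "yellow" for roughly equal expression.
    """
    log_ratio = math.log2(red / green)
    if log_ratio >= threshold:
        return "red"
    if log_ratio <= -threshold:
        return "green"
    return "yellow"

# Hypothetical (red, green) intensities for three spots.
spots = {
    "gene1": (800.0, 200.0),
    "gene2": (150.0, 600.0),
    "gene3": (500.0, 450.0),
}
colors = {gene: classify_spot(red, green) for gene, (red, green) in spots.items()}
print(colors)
```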

Microarray Example: miRNAs

  • Expression of miRNAs was compared in normal cells and cancer cells.
  • The color represents the degree of expression:
    • red indicates overexpression in cancer cells
    • green indicates underexpression
  • Some miRNAs were overexpressed in cancer cells compared to normal cells, while other miRNAs were underexpressed.

Use of DNA Microarrays

  • Genome-wide transcription analysis is performed using labeled cDNA from experimental samples hybridized to a microarray containing sequences from all ORFs of the organism being used.
  • SNP arrays permit genome-wide genotyping of single-nucleotide polymorphisms.
  • Array comparative genome hybridization (array-CGH) allows the detection of copy number changes in any DNA sequence compared between two samples.

DNA Microarrays in Medicine

  • Microarrays are used in oncology and in cardiovascular, inflammatory, and infectious diseases, as well as psychiatric disorders.
  • Discovery of target genes: the microarray is used to compare diseased tissues/cells with healthy tissues/cells to find the characteristics of a particular disease. This helps in finding the genes responsible for that disease.
  • Diagnostics: the microarray is used to determine the state of disease, the type of tumor, and other factors important for the patient. It is used to diagnose a number of diseases and infections, most notably cancer.
  • Drug discovery and pharmacogenomics: after the target has been discovered, microarrays can be used to screen potential compounds and to identify the toxicity of the lead compound. This helps in choosing the proper medication for the patient and enables therapy based on genetic makeup (personalized treatment).

Limitations of Microarrays in Transcriptomics

  • Prior knowledge of gene sequences is required.
  • Cross-hybridization artifact: similar sequences hybridize to the same probe.
  • Limited ability to quantify the degree of gene expression.
  • Solution: Rapid, low-cost next-generation sequencing of cDNA obtained from RNA – RNA sequencing.

RNA-Seq

  • RNA-Seq is an approach to transcriptome profiling that uses deep-sequencing technologies.
  • RNA-Seq also provides a far more precise measurement of levels of transcripts and their isoforms than other methods.

RNA-Seq Process

  1. Total cellular RNA is isolated from cells.
  2. The RNA of interest (e.g., mRNA) is isolated.
  3. The enzyme reverse transcriptase is used to make complementary DNA (cDNA) from mRNA.
  4. The cDNA is broken into overlapping fragments.
  5. Adapters with sequences for amplification and sequencing are added to the ends of the fragments.
  6. The fragments are amplified with PCR and sequenced using next-generation sequencing.
  7. The sequence reads are assembled into RNA transcripts.
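
Once reads have been assigned to transcripts, expression levels are typically derived from read counts. The sketch below uses transcripts per million (TPM), one common normalization that corrects for transcript length and sequencing depth. The notes do not name a normalization method; the transcript names, counts, and lengths are hypothetical.

```python
def transcripts_per_million(read_counts, lengths_kb):
    """Convert read counts to TPM.

    read_counts: {transcript: number of reads assigned to it}
    lengths_kb:  {transcript: transcript length in kilobases}
    """
    # Reads per kilobase: longer transcripts yield more fragments, so divide by length.
    rates = {t: read_counts[t] / lengths_kb[t] for t in read_counts}
    total = sum(rates.values())
    # Scale so that the values of all transcripts sum to one million.
    return {t: rate * 1_000_000 / total for t, rate in rates.items()}

# Hypothetical counts and lengths for three transcripts.
read_counts = {"txA": 100, "txB": 200, "txC": 300}
lengths_kb = {"txA": 1.0, "txB": 2.0, "txC": 1.0}

tpm = transcripts_per_million(read_counts, lengths_kb)
print(tpm)  # txA and txB get the same TPM: txB has twice the reads but is twice as long
```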

RNA-Seq: Isolating Specific RNA Species

  • Poly-A selection
  • Ribo-depletion
  • Size selection

Splicing Quantitative Trait Locus (sQTL) Analysis

  • Alternative splicing is quantified using RNA-seq data.
  • Quantitative profiles of alternative splicing are treated as traits and tested for association with genotypes.
  • sQTLs are genetic variants that are associated with changes in the splicing ratios of transcripts.
  • Why?: to discover genetic variants that are associated with alternative splicing.
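
A minimal sketch of the idea, under stated assumptions: splicing of one exon is summarized per individual as percent spliced in (PSI, the fraction of reads that include the exon), and PSI is then correlated with the number of alternative alleles (0, 1, or 2) at one variant. The read counts and genotypes are invented, and a real sQTL analysis would use a proper regression model with covariates and multiple-testing correction.

```python
from statistics import mean

def psi(inclusion_reads, exclusion_reads):
    """Percent spliced in: fraction of reads supporting exon inclusion."""
    return inclusion_reads / (inclusion_reads + exclusion_reads)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical individuals: (alternative allele count, inclusion reads, exclusion reads).
samples = [
    (0, 90, 10), (0, 85, 15),
    (1, 60, 40), (1, 65, 35),
    (2, 30, 70), (2, 35, 65),
]

dosages = [dosage for dosage, _, _ in samples]
psi_values = [psi(inc, exc) for _, inc, exc in samples]

r = pearson_r(dosages, psi_values)
print(round(r, 3))  # strongly negative: more alternative alleles, less exon inclusion
```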

Spatial Transcriptomics

  • Why?: To understand where in a tissue certain genes (often only a few) are expressed.
  • RNA sequencing that retains information on the tissue context of the cells.

Spatial Transcriptomics - Technology

  • Each barcoded spot covers a 100 μm area, containing 2 million oligonucleotides.
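
The role of the spot barcode can be sketched as a lookup: each read carries the barcode of the spot that captured its RNA, and the barcode maps back to a position on the slide. The barcode sequences, grid coordinates, and gene names below are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical barcode -> (row, column) position of the spot on the slide.
SPOT_COORDINATES = {
    "AAGT": (0, 0),
    "CCTA": (0, 1),
    "GGAC": (1, 0),
}

def counts_per_spot(reads, spot_coordinates):
    """Count reads per gene at each spot position.

    reads: iterable of (barcode, gene) pairs.
    Reads whose barcode is not on the slide are skipped.
    """
    counts = defaultdict(Counter)
    for barcode, gene in reads:
        position = spot_coordinates.get(barcode)
        if position is None:
            continue  # unknown barcode, e.g. a sequencing error
        counts[position][gene] += 1
    return counts

reads = [
    ("AAGT", "geneA"), ("AAGT", "geneA"),
    ("CCTA", "geneB"),
    ("TTTT", "geneA"),  # barcode not present on the slide
    ("GGAC", "geneA"),
]

spot_counts = counts_per_spot(reads, SPOT_COORDINATES)
for position, gene_counts in sorted(spot_counts.items()):
    print(position, dict(gene_counts))
```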

Single-Cell Transcriptome Analyses

  • Types of analyses:
    • Single-cell RNA-seq
    • Expression profile clustering
  • Within cell type:
    • Stochasticity, variability of transcription
    • Regulatory network inference
    • Allelic expression patterns
  • Between cell types:
    • Identify biomarkers
    • (Post)-transcriptional differences
  • Between tissues:
    • Cell-type compositions
    • Altered transcription in matched cell types
    • Scaling laws of transcription

Somatic Mosaicism in Normal Tissues

  • Some mutations found in the DNA can be detected in the corresponding RNA, depending on the mutation allele fraction and sequence coverage.
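
The dependence on allele fraction and coverage can be illustrated with a simple binomial model: if each read covering the site independently carries the mutant allele with probability equal to the allele fraction, the chance of seeing enough mutant reads rises with coverage. The model, the minimum of three supporting reads, and the example numbers are assumptions of this sketch; in RNA, the effective allele fraction also depends on how strongly each allele is expressed.

```python
from math import comb

def detection_probability(coverage, allele_fraction, min_alt_reads=3):
    """Probability of observing at least `min_alt_reads` mutant reads.

    coverage: number of reads covering the site.
    allele_fraction: fraction of molecules carrying the mutation (0 to 1).
    Uses a binomial model with independent reads.
    """
    p_too_few = sum(
        comb(coverage, k)
        * allele_fraction ** k
        * (1.0 - allele_fraction) ** (coverage - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_too_few

# Hypothetical mutation present in 5% of molecules, at two coverage levels.
low_coverage = detection_probability(coverage=20, allele_fraction=0.05)
high_coverage = detection_probability(coverage=200, allele_fraction=0.05)
print(low_coverage, high_coverage)  # detection is far more likely at higher coverage
```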

RNA-Seq in Personalized Medicine

  • Use in diagnosis, prognosis, treatment, and monitoring of cancer through biopsy and liquid biopsy.

Summary - Transcriptomics

  • Transcriptome
  • Transcriptomics
  • Different approaches to transcriptome analysis: RT-PCR, microarray, and RNA-seq
  • Use of transcriptomics