RNA-Seq Sample Preparation & Library Construction

Full RNA-seq Workflow Overview

  • Entire analytical pipeline
    • Begin with a clearly stated, directional, testable hypothesis involving $\ge 2$ variables.
    • Execute biological experiment according to that hypothesis.
    • Collect samples → isolate & purify RNA → quantify RNA.
    • Wet-lab finishes with library preparation and high-throughput sequencing.
    • Bio-informatics (covered in later course sections):
      • Raw read quality control.
      • Alignment to a reference genome.
      • Quantification of gene expression per sample.
      • Differential-expression testing between conditions.
      • Biological interpretation & visualization.

Experimental Design & Planning

  • Core considerations (apply to all assays):
    • Sample type, size, collection method, preservation strategy.
    • Assay choice dictates statistical power requirements.
      • RNA-seq often achieves significance with $n \approx 4$ replicates per group.
      • Bone µCT might need $n = 10\text{–}12$.
    • Think through variables & controls (organism, tissue, genotype, environmental perturbation, etc.)
    • Plan for result interpretation before you start.
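
The link between assay choice and replicate number can be illustrated with a textbook normal-approximation sample-size formula, $n = 2\,(z_{1-\alpha/2} + z_{\text{power}})^2 / d^2$. This is only a rough sketch: real RNA-seq power analysis models count data and multiple testing, and the function name `n_per_group` and the effect sizes below are hypothetical values chosen for illustration, not figures from these notes.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate replicates per group for a two-sided, two-group comparison.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2,
    where d is the standardized effect size (difference in means / SD).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Hypothetical standardized effect sizes (assumptions, not measured values)
print(n_per_group(2.0))  # large effect  -> 4 per group
print(n_per_group(1.2))  # smaller effect -> 11 per group
```

Under these assumed effect sizes, a large effect needs about 4 replicates per group while a smaller one needs about 11, which is the same kind of gap as the RNA-seq and µCT figures above.
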
Spaceflight-specific nuances
  • Timing post-perturbation (e.g. microgravity, radiation):
    • Splash-down to lab can take $\approx 72\text{ h}$ → gene expression drifts back toward $1\,g$ baseline.
    • Best practice: dissect/collect on-orbit if true microgravity signature is desired.
  • Timing post-euthanasia:
    • RNases rapidly degrade RNA once physiological safeguards stop.
    • Delays between euthanasia and tissue freezing → dramatic RIN drop.
Sample Preservation Choices
  • Snap-freeze in liquid N$_2$ vs. RNA-stabilizing reagents (e.g. RNAlater).
  • Storage duration (e.g. months on ISS) must be reconciled with RNA stability.

RNA Integrity & RNase Avoidance

  • RNase ubiquity: endogenous & on surfaces.
  • Best practices:
    • Wipe benches/tools with RNase-Away.
    • Consistent glove use; change often.
    • Keep samples on ice (≈ $0^\circ$C); final isolated RNA at $-80^\circ$C.
  • High-quality input RNA ⇒ high-quality sequencing output.

RNA Extraction Methodologies

  • Classical organic separation (TRIzol/phenol-chloroform):
    • Homogenize tissue → lyse → centrifuge → three-phase separation.
      • Bottom organic (proteins, pink hue).
      • Interphase (DNA/chromatin, white disk).
      • Top aqueous (RNA, clear).
    • Carefully aspirate RNA; treat with DNase to remove carry-over.
  • Column-based kits:
    • Silica membrane binds RNA; DNase treat on-column; elute with RNase-free buffer.
    • Tissue-specific optimization mandatory.
Homogenization Tools
  • Mechanical homogenizers (bones, hard tissues).
  • Direct lysis in chaotropic solvents (cell culture).

RNA Quality Control (post-extraction)

  • Agarose gel visualization:
    • Distinct 28S and 18S rRNA bands (plus faint 5S).
  • Bioanalyzer / TapeStation electropherograms:
    • Generate RNA Integrity Number (RIN) $\in [1,10]$.
    • Criteria:
      • $\text{RIN} \ge 7$ required for most mRNA-seq.
      • Partially degraded: shallow 28S peak, extra low-MW peaks.
      • Severely degraded: no 28S/18S peaks, RIN $\le 3$.
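
The RIN criteria above can be sketched as a small triage function. The notes give explicit cutoffs only for the top band (RIN ≥ 7) and the bottom band (RIN ≤ 3); treating everything in between as "partially degraded" is an assumption made here for illustration, and `classify_rin` is a hypothetical helper name.

```python
def classify_rin(rin):
    """Map an RNA Integrity Number (1-10) to a coarse quality label."""
    if not 1 <= rin <= 10:
        raise ValueError("RIN is defined on the interval [1, 10]")
    if rin >= 7:
        return "suitable for most mRNA-seq"
    if rin <= 3:
        return "severely degraded"
    # Assumption: the notes give no RIN range for this class,
    # so the band between 3 and 7 is labelled partially degraded.
    return "partially degraded"


for value in (9.2, 7, 5.5, 3, 1.8):
    print(value, "->", classify_rin(value))
```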

Cellular RNA Composition

  • $\sim 90\%$ ribosomal RNA.
  • Minor fractions: tRNA, snoRNA, miRNA, siRNA, lncRNA.
  • mRNA (poly-adenylated, coding) = $1\text{–}5\%$ → library prep must enrich it.
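
A quick back-of-the-envelope calculation shows why enrichment matters. The sketch below assumes a hypothetical sequencing depth of 30 million reads and, as a simplification, treats the mRNA fraction (1–5%) and the exonic-content figures quoted in the library-preparation section (≈78% for poly-A selection, ≈48% for ribo-depletion) as the fraction of reads that are informative.

```python
TOTAL_READS = 30_000_000  # hypothetical depth, not a value from the notes


def informative_reads(total_reads, fraction):
    """Number of reads expected to come from the fraction of interest."""
    return round(total_reads * fraction)


scenarios = {
    "no enrichment (mRNA = 1%)": 0.01,
    "no enrichment (mRNA = 5%)": 0.05,
    "ribo-depletion (~48% exonic)": 0.48,
    "poly-A selection (~78% exonic)": 0.78,
}

for label, fraction in scenarios.items():
    print(f"{label:32s} {informative_reads(TOTAL_READS, fraction):>12,d}")
```

Under these assumptions an unenriched library would yield only 0.3–1.5 million informative reads, compared with roughly 14 and 23 million after ribo-depletion and poly-A selection.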

Library Preparation – Step-by-Step

  1. Enrich or deplete input RNA
    • Poly-A selection (preferred):
      • Beads with oligo-dT probes capture poly-A tails under magnetic field.
      • Result: $\approx 78\%$ exonic/coding content.
    • Ribo-depletion:
      • Probes hybridize rRNA; bound fraction discarded.
      • Exonic content $\approx 48\%$ but retains non-poly-A species.
  2. Random fragmentation of RNA (heat/chemical):
    • Generate unbiased size distribution.
    • Fragment RNA, not cDNA, to avoid restriction-site bias.
  3. First- & second-strand cDNA synthesis
    • Use random hexamer primers (6-mers) + reverse transcriptase → ds-cDNA matching original fragments.
  4. Size selection (magnetic beads):
    • Remove fragments $< 200\text{ bp}$ to improve cluster generation & read quality.
    • Poor-quality/degraded RNA loses many fragments here.
  5. A-tailing & adapter ligation
    • Add a single-stranded A overhang → ligate Illumina P5/P7 adapters.
    • Watch for adapter dimers (≈ $100\text{–}125\text{ bp}$). Must be minimized.
  6. Library amplification (PCR, ≈ 15 cycles):
    • Introduces flow-cell binding sequences, sample indices.
    • Over-amplification → coverage bias; use minimal cycles.
  7. Final QC of library
    • Fragment size distribution: target $200\text{–}500\text{ bp}$, mean $\approx 250\text{ bp}$.
    • Concentration quantification: qPCR, Qubit, etc.
    • Adapter dimer proportion: keep $< 5\%$ ideally.
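
The final-QC criteria in step 7 can be sketched as a simple check over a list of fragment lengths. In practice a Bioanalyzer or TapeStation reports a trace rather than individual fragments, so the input format, the function name `library_qc`, and the example lengths are all assumptions made for illustration; the thresholds (200–500 bp target, 100–125 bp dimers, dimers below 5%) are the ones listed above.

```python
from statistics import mean


def library_qc(fragment_lengths_bp, size_range=(200, 500),
               dimer_range=(100, 125), max_dimer_fraction=0.05):
    """Summarize a library against the size and adapter-dimer criteria."""
    if not fragment_lengths_bp:
        raise ValueError("need at least one fragment length")
    n = len(fragment_lengths_bp)
    in_target = sum(size_range[0] <= x <= size_range[1] for x in fragment_lengths_bp)
    dimers = sum(dimer_range[0] <= x <= dimer_range[1] for x in fragment_lengths_bp)
    dimer_fraction = dimers / n
    return {
        "mean_bp": mean(fragment_lengths_bp),
        "fraction_in_target": in_target / n,
        "dimer_fraction": dimer_fraction,
        "dimer_ok": dimer_fraction < max_dimer_fraction,
    }


# Hypothetical fragment lengths (bp); the 110 bp entry mimics an adapter dimer
example = [110, 210, 230, 250, 250, 260, 270, 280, 300, 340]
print(library_qc(example))
```

In this made-up example the mean is 250 bp, but one fragment in ten falls in the dimer range, so the library would fail the dimer criterion and need another bead clean-up.
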
Adapter Geometry Recap
  • Final molecule = $\text{[P5]} + \text{[Index 1]} + \text{Insert} + \text{[Index 2]} + \text{[P7]}$.
  • If using paired-end 100 ($2\times100$) sequencing and adapter lengths $\approx 65$ bp each:
    • Desired library size $L$:
      $L = 2\times100\,\text{(reads)} + 65 + 65\,\text{(adapters)} - \text{overlap} \approx 300\,\text{bp}$ (allowing an overlap of $\approx 30\text{–}40$ bp).
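
The library-size arithmetic above generalizes to other read lengths. The sketch below simply encodes the same formula; `library_size_bp` is a hypothetical helper, and the 65 bp adapter length is the approximate figure from these notes rather than a value for any specific kit.

```python
def library_size_bp(read_length, adapter_length=65, overlap=0, paired=True):
    """Target library length: reads + two adapters - allowed read overlap."""
    reads_per_fragment = 2 if paired else 1
    return reads_per_fragment * read_length + 2 * adapter_length - overlap


# Paired-end 100 with the 30-40 bp overlap allowed in the notes
print(library_size_bp(100, overlap=30))  # → 300
print(library_size_bp(100, overlap=40))  # → 290
```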

Avoiding Bias Across Workflow

  • Random fragmentation + random primers → uniform transcript representation.
  • Remove small/degraded fragments → consistent insert size.
  • Limit PCR cycles → avoid amplification bias & chimera formation.
  • Continual QC checkpoints guard against data loss and misinterpretation.
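
Why limiting PCR cycles reduces bias can be seen with a toy model: if two fragments start at equal abundance but amplify with slightly different per-cycle efficiencies, the difference compounds exponentially. The efficiencies used below (0.95 and 0.85) are hypothetical, and the model ignores plateau effects and chimera formation.

```python
def relative_skew(eff_a, eff_b, cycles):
    """Ratio of fragment A to fragment B copies after PCR.

    Each cycle multiplies copy number by (1 + efficiency); both fragments
    are assumed to start at the same abundance.
    """
    return ((1 + eff_a) / (1 + eff_b)) ** cycles


# Hypothetical per-cycle efficiencies for an "easy" and a "hard" fragment
for cycles in (8, 15, 25):
    skew = relative_skew(0.95, 0.85, cycles)
    print(f"{cycles:2d} cycles: A is over-represented {skew:.2f}x relative to B")
```

The skew grows with every added cycle, which is the reasoning behind using the minimum number of cycles that yields enough library.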

Key Reagents, Enzymes & Tools

  • Reverse Transcriptase, DNA ligase, RNase-free DNase.
  • PCR polymerase, primers.
  • Magnetic beads for both purification and size selection.
  • Indexing adapters (single or dual indices).
  • Restriction enzymes (limited role; avoided for fragmentation).

Practical / Ethical / Philosophical Notes

  • Ethical responsibility to design statistically sound experiments, minimizing animal use while ensuring rigorous conclusions.
  • In spaceflight studies, logistics of animal handling intersect with the ethics of both humane care and scientific validity.
  • Proper sample stewardship (RNase control, storage) protects irreplaceable biological data—especially when retrieval opportunities (e.g. ISS down-mass) are scarce.

Connections to Previous Lectures

  • Builds on earlier discussion of gel electrophoresis principles for nucleic-acid QC.
  • Sets stage for upcoming module on sequencing chemistry & flow-cell clustering.

Recap Checklist

  • [ ] Hypothesis framed & variables defined.
  • [ ] Adequate sample size (e.g. $n \ge 4$ for RNA-seq).
  • [ ] Collection timing & preservation optimized.
  • [ ] RNase-free extraction completed; RIN $\ge 7$ confirmed.
  • [ ] Chosen enrichment method (poly-A or ribo-depletion) executed.
  • [ ] Fragmentation, cDNA synthesis, size selection performed without bias.
  • [ ] Adapters ligated; dimers removed.
  • [ ] Library amplified within $\le 15$ cycles.
  • [ ] Final QC: fragment $\approx 250$ bp, minimal adapter dimers, quantified for flow-cell loading.
  • Ready for next lecture: sequencing principles & data generation.