RNA-Seq Sample Preparation & Library Construction
Full RNA-seq Workflow Overview
- Entire analytical pipeline
- Begin with a clearly stated, directional, testable hypothesis involving ≥2 variables.
- Execute biological experiment according to that hypothesis.
- Collect samples → isolate & purify RNA → quantify RNA.
- Wet-lab finishes with library preparation and high-throughput sequencing.
- Bioinformatics (covered in later course sections):
- Raw read quality control.
- Alignment to a reference genome.
- Quantification of gene expression per sample.
- Differential-expression testing between conditions.
- Biological interpretation & visualization.
Experimental Design & Planning
- Core considerations (apply to all assays):
- Sample type, size, collection method, preservation strategy.
- Assay choice dictates statistical power requirements.
- RNA-seq often achieves adequate statistical power with n ≈ 4 biological replicates per group.
- Bone µCT might need n=10–12.
- Think through variables & controls (organism, tissue, genotype, environmental perturbation, etc.)
- Plan for result interpretation before you start.
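The link between assay choice and replicate number can be illustrated with a textbook sample-size approximation. This is a minimal sketch, not the method used in the course: it assumes a two-sided, two-group comparison under a normal approximation, and the effect sizes (d = 2.0 and d = 1.2) are hypothetical values chosen only for illustration. Real RNA-seq power also depends on dispersion, sequencing depth, and multiple-testing correction.

```python
import math
from statistics import NormalDist


def replicates_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate replicates per group for a two-sided, two-sample comparison.

    Normal approximation: n = 2 * (z_(1-alpha/2) + z_power)^2 / d^2,
    where d is the standardized effect size (mean difference / SD).
    """
    if effect_size_d <= 0:
        raise ValueError("effect_size_d must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size_d ** 2)


# Hypothetical effect sizes, for illustration only.
for label, d in [("large effect, d = 2.0", 2.0), ("moderate effect, d = 1.2", 1.2)]:
    print(f"{label}: {replicates_per_group(d)} replicates per group")
```

The normal approximation tends to underestimate n when groups are very small, so treat the output as a planning floor rather than a final design.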
Spaceflight-specific nuances
- Timing post-perturbation (e.g. microgravity, radiation):
- Splash-down to lab can take ≈72 h → gene expression drifts back toward 1g baseline.
- Best practice: dissect/collect on-orbit if true microgravity signature is desired.
- Timing post-euthanasia:
- RNases rapidly degrade RNA once physiological safeguards stop.
- Delays between euthanasia and tissue freezing → dramatic RIN drop.
Sample Preservation Choices
- Snap-freeze in liquid N₂ vs. RNA-stabilizing reagents (e.g. RNAlater).
- Storage duration (e.g. months on ISS) must be reconciled with RNA stability.
RNA Integrity & RNase Avoidance
- RNase ubiquity: endogenous & on surfaces.
- Best practices:
- Wipe benches/tools with RNase-Away.
- Consistent glove use; change often.
- Keep samples on ice (≈0 °C); store final isolated RNA at −80 °C.
- High-quality input RNA ⇒ high-quality sequencing output.
RNA Extraction Methods
- Classical organic separation (TRIzol/phenol-chloroform):
- Homogenize tissue → lyse → add chloroform → centrifuge → three-phase separation.
- Bottom organic (proteins, pink hue).
- Interphase (DNA/chromatin, white disk).
- Top aqueous (RNA, clear).
- Carefully aspirate RNA; treat with DNase to remove carry-over.
- Column-based kits:
- Silica membrane binds RNA; DNase treat on-column; elute with RNase-free buffer.
- Tissue-specific optimization mandatory.
- Mechanical homogenizers (bones, hard tissues).
- Direct lysis in chaotropic solvents (cell culture).
RNA Quality Assessment
- Agarose gel visualization:
- Distinct 28S and 18S rRNA bands (plus faint 5S).
- Bioanalyzer / TapeStation electropherograms:
- Generate an RNA Integrity Number (RIN) on a scale from 1 (fully degraded) to 10 (intact).
- Criteria:
- RIN ≥ 7 required for most mRNA-seq.
- Partially degraded: diminished 28S peak, extra low-molecular-weight peaks.
- Severely degraded: no 28S/18S peaks, RIN ≤ 3.
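The RIN criteria above can be written as a small triage function. This is a sketch: the thresholds (≥ 7 and ≤ 3) come from the notes, but labelling everything in between as "partially degraded" is an assumption, since the notes describe that state only qualitatively. The function name and labels are hypothetical.

```python
def classify_rin(rin):
    """Map an RNA Integrity Number (1-10) to a coarse quality label."""
    if not 1 <= rin <= 10:
        raise ValueError("RIN must lie between 1 and 10")
    if rin >= 7:
        return "acceptable for most mRNA-seq"
    if rin <= 3:
        return "severely degraded"
    # Assumption: anything between the two stated thresholds is treated
    # as partially degraded.
    return "partially degraded"


for value in (9.2, 5.5, 2.1):
    print(value, "->", classify_rin(value))
```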
Cellular RNA Composition
- ∼90% ribosomal RNA.
- Minor fractions: tRNA, snoRNA, miRNA, siRNA, lncRNA.
- mRNA (poly-adenylated, coding) = 1–5% → library prep must enrich it.
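To see why enrichment matters, it helps to turn the 1–5% figure into a mass. The sketch below assumes a hypothetical input of 1 µg (1000 ng) total RNA; the function name is illustrative only.

```python
def expected_mrna_ng(total_rna_ng, mrna_fraction_range=(0.01, 0.05)):
    """Return the (low, high) mass of mRNA expected in a total-RNA sample."""
    if total_rna_ng < 0:
        raise ValueError("total_rna_ng must be non-negative")
    low_fraction, high_fraction = mrna_fraction_range
    return total_rna_ng * low_fraction, total_rna_ng * high_fraction


# Hypothetical input amount: 1 µg of total RNA.
low, high = expected_mrna_ng(1000)
print(f"Expected mRNA: {low:.0f} to {high:.0f} ng out of 1000 ng total RNA")
```

Most of the remaining mass is rRNA, which would otherwise dominate the sequencing reads.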
Library Preparation – Step-by-Step
- Enrich or deplete input RNA
- Poly-A selection (preferred):
- Magnetic beads coated with oligo-dT probes hybridize to poly-A tails; a magnet then pulls the bead-bound mRNA out of solution.
- Result: ≈78% exonic/coding content.
- Ribo-depletion:
- Probes hybridize rRNA; bound fraction discarded.
- Exonic content ≈48% but retains non-poly-A species.
- Random fragmentation of RNA (heat/chemical):
- Generate unbiased size distribution.
- Fragment RNA, not cDNA, to avoid restriction-site bias.
- First- & second-strand cDNA synthesis
- Use random hexamer primers (6-mers) + reverse transcriptase → ds-cDNA matching original fragments.
- Size selection (magnetic beads):
- Remove fragments <200 bp to improve cluster generation & read quality.
- Poor-quality/degraded RNA loses many fragments here.
- A-tailing & adapter ligation
- Add a single-nucleotide 3′ A overhang → ligate Illumina P5/P7 adapters.
- Watch for adapter dimers (≈100–125 bp); these must be minimized.
- Library amplification (PCR, ≈15 cycles):
- Introduces flow-cell binding sequences, sample indices.
- Over-amplification → coverage bias; use minimal cycles.
- Final QC of library
- Fragment size distribution: target 200–500 bp, mean ≈250 bp.
- Concentration quantification: qPCR, Qubit, etc.
- Adapter dimer proportion: keep <5% ideally.
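The final QC targets above can be checked programmatically. This is a minimal sketch under stated assumptions: the size window (200–500 bp) and dimer limit (<5%) come from the notes, while the function names are hypothetical. The molarity helper uses the common approximation of about 660 g/mol per base pair of double-stranded DNA, which is not stated in the notes.

```python
def library_molarity_nm(conc_ng_per_ul, mean_size_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA concentration (ng/µL) to nanomolar.

    Assumes roughly 660 g/mol per base pair (a common approximation).
    """
    if conc_ng_per_ul < 0 or mean_size_bp <= 0:
        raise ValueError("concentration must be >= 0 and size must be > 0")
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_size_bp)


def library_qc_issues(mean_size_bp, dimer_fraction,
                      size_range_bp=(200, 500), max_dimer_fraction=0.05):
    """Return a list of QC problems; an empty list means the library passes."""
    issues = []
    low, high = size_range_bp
    if not low <= mean_size_bp <= high:
        issues.append(f"mean fragment size {mean_size_bp} bp outside {low}-{high} bp")
    if dimer_fraction >= max_dimer_fraction:
        issues.append(f"adapter dimers at {dimer_fraction:.0%}, "
                      f"not below {max_dimer_fraction:.0%}")
    return issues


# Hypothetical measurements for two libraries.
print(library_qc_issues(mean_size_bp=250, dimer_fraction=0.02))
print(library_qc_issues(mean_size_bp=150, dimer_fraction=0.10))
print(f"{library_molarity_nm(2.0, 250):.1f} nM")
```

Returning a list of issues rather than a single pass/fail flag makes it easier to see which checkpoint a library failed.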
Adapter Geometry Recap
- Final molecule = [P5]+[Index 2 (i5)]+Insert+[Index 1 (i7)]+[P7].
- If using paired-end 100 (2×100) sequencing and adapter lengths ≈65 bp each:
- Desired library size L:
L = 2×100 (reads) + 65 + 65 (adapters) − overlap ≈ 300 bp (allowing an overlap of ≈30–40 bp).
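The library-size arithmetic can be parameterized so it is easy to redo for other read lengths. This is a sketch of the formula in the notes; the function name and default values are illustrative, and the default overlap of 35 bp is simply the midpoint of the stated 30–40 bp range.

```python
def expected_library_size_bp(read_length, adapter_length=65, overlap=35):
    """Target library size for paired-end sequencing.

    L = 2 * read_length + 2 * adapter_length - overlap
    """
    if min(read_length, adapter_length, overlap) < 0:
        raise ValueError("all lengths must be non-negative")
    return 2 * read_length + 2 * adapter_length - overlap


for overlap in (30, 35, 40):
    size = expected_library_size_bp(100, adapter_length=65, overlap=overlap)
    print(f"2x100, overlap {overlap} bp -> {size} bp")
```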
Avoiding Bias Across Workflow
- Random fragmentation + random primers → uniform transcript representation.
- Remove small/degraded fragments → consistent insert size.
- Limit PCR cycles → avoid amplification bias & chimera formation.
- Continual QC checkpoints guard against data loss and misinterpretation.
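The advice to limit PCR cycles can be made concrete with the idealized exponential model of amplification. This is a sketch under simplifying assumptions: per-cycle efficiency is constant, and the two efficiencies compared (1.00 and 0.90) are hypothetical values standing in for fragments that amplify slightly differently.

```python
def fold_amplification(cycles, efficiency=1.0):
    """Theoretical fold increase after `cycles` rounds of PCR.

    efficiency = 1.0 means perfect doubling in every cycle.
    """
    if cycles < 0 or not 0 <= efficiency <= 1:
        raise ValueError("cycles must be >= 0 and efficiency within [0, 1]")
    return (1 + efficiency) ** cycles


def representation_skew(cycles, efficiency_a, efficiency_b):
    """Factor by which fragment A outgrows fragment B after `cycles` cycles."""
    return (fold_amplification(cycles, efficiency_a)
            / fold_amplification(cycles, efficiency_b))


# Hypothetical efficiencies: 1.00 for fragment A, 0.90 for fragment B.
for n in (10, 15, 20):
    print(f"{n} cycles: {fold_amplification(n):,.0f}-fold, "
          f"skew = {representation_skew(n, 1.0, 0.9):.2f}x")
```

Because the skew compounds with every cycle, a small per-cycle difference between fragments grows into a visible coverage bias, which is the reason to use as few cycles as the input allows.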
Key Reagents & Enzymes
- Reverse transcriptase, DNA ligase, RNase-free DNase.
- PCR polymerase, primers.
- Magnetic beads for both purification and size selection.
- Indexing adapters (single or dual indices).
- Restriction enzymes (limited role; avoided for fragmentation).
Practical / Ethical / Philosophical Notes
- Ethical responsibility to design statistically sound experiments, minimizing animal use while ensuring rigorous conclusions.
- In spaceflight studies, logistics of animal handling intersect with the ethics of both humane care and scientific validity.
- Proper sample stewardship (RNase control, storage) protects irreplaceable biological data, especially when retrieval opportunities (e.g. ISS down-mass) are scarce.
Connections to Previous Lectures
- Builds on earlier discussion of gel electrophoresis principles for nucleic-acid QC.
- Sets stage for upcoming module on sequencing chemistry & flow-cell clustering.
Recap Checklist
- [ ] Hypothesis framed & variables defined.
- [ ] Adequate sample size (e.g. n≥4 for RNA-seq).
- [ ] Collection timing & preservation optimized.
- [ ] RNase-free extraction completed; RIN ≥7 confirmed.
- [ ] Chosen enrichment method (poly-A or ribo-depletion) executed.
- [ ] Fragmentation, cDNA synthesis, size selection performed without bias.
- [ ] Adapters ligated; dimers removed.
- [ ] Library amplified within ≤15 cycles.
- [ ] Final QC: fragment ≈250 bp, minimal adapter dimers, quantified for flow-cell loading.
- Ready for next lecture: sequencing principles & data generation.