RNA Molecules and RNA Processing

Overview: RNA Molecules & Their Fate

• Transcription is only the start of an RNA's life; virtually every RNA is processed post-transcriptionally.
• Processing roles:
▸ Generate functional mature RNAs.
▸ Regulate activity/lifetime (timed degradation).
▸ Remove defective/mis-folded species.

Four General Processing Categories

• Removal of stretches of the primary RNA (excision, splicing, trimming).
• Addition of extra nucleotides to the 5′ and/or 3′ ends by template-independent enzymes.
• Removal, insertion, or replacement of single or multiple nucleotides (editing).
• Covalent base modification (methylation, amination, thiolation, reduction).

Prokaryotes vs Eukaryotes – Big Picture

• Bacteria: few modifications; rRNA & tRNA transcribed as long precursors → cut; mRNA translated as-is.
• Eukaryotes: virtually every RNA modified; pre-mRNA receives especially extensive processing (capping, splicing, poly-A, editing).

tRNA Processing (shared logic in all domains)

• Endonucleases clip internal sequences; exonucleases trim termini.
• Universal CCA added to 3′ end by tRNA nucleotidyl-transferase – later amino-acylation site.
• Extensive base chemistry: methyl-, amino-, thiolation, reductions; many catalyzed by ribonucleoproteins (RNPs).
• Canonical sequence of events

  1. 5′ & 3′ ends removed.

  2. Internal intron excised (if present).

  3. Fragments ligated.

  4. CCA appended.

  5. Base modifications applied (varies by tRNA).
    • Structural outcome: cloverleaf of acceptor, D, anticodon, TΨC, variable arms containing numerous rare bases (e.g. ribothymidine, pseudouridine).
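
The canonical order of events can be sketched as a string-manipulation toy. This is a minimal illustration, not a model of the real enzymes: the function name, the length/coordinate parameters, and the example sequence are all hypothetical, and step 5 (base modification) is not modeled.

```python
def process_trna(precursor, leader_len, trailer_len, intron=None):
    """Toy tRNA maturation: trim ends, excise intron, ligate, append CCA.

    All parameters are illustrative assumptions:
    leader_len / trailer_len -- nucleotides removed from the 5' and 3' ends
    intron -- optional (start, stop) half-open indices within the trimmed body
    """
    # Step 1: remove the 5' leader and the 3' trailer.
    body = precursor[leader_len:len(precursor) - trailer_len]

    # Steps 2-3: excise the intron (if present) and ligate the two halves.
    if intron is not None:
        start, stop = intron
        body = body[:start] + body[stop:]

    # Step 4: append the CCA that later accepts the amino acid.
    return body + "CCA"


if __name__ == "__main__":
    # Made-up precursor: leader + 5' half + intron + 3' half + trailer.
    pre = "AUAU" + "GGGCUCAG" + "AAAA" + "CUGAGCCC" + "UUUU"
    print(process_trna(pre, leader_len=4, trailer_len=4, intron=(8, 12)))
```

Treating each step as a slice makes the ordering explicit: the intron coordinates only make sense once the ends have been trimmed, and CCA is added to the ligated product.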

Ribosomes & rRNA

• Ribosome = protein-RNA machine; two subunits (large, small).
• Each subunit = 1–3 rRNAs + tens of proteins (e.g. bacterial 70S = 50S (23S + 5S, 31 proteins) + 30S (16S, 21 proteins)).

rRNA Gene Organization

• Eukaryotes: hundreds of tandem rDNA repeats.
• Bacteria: typically several rRNA operons dispersed around the chromosome (e.g. E. coli carries seven).

rRNA Precursors & Processing

• Prokaryotic 30S precursor → methylation marks → RNase III & others cut → 16S, 23S, 5S + sometimes tRNAs.
• Eukaryotic 45S precursor (nucleolus) → methylation & snoRNP-guided cleavage → 18S, 5.8S, 28S; separate 5S gene transcribed elsewhere.
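• snoRNAs (small nucleolar RNAs) = guide RNAs packaged with proteins as snoRNPs; they base-pair with the pre-rRNA to direct site-specific methylation & pseudouridylation (the partner proteins do the chemistry).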
• Assembly chronology: rRNAs synthesized/processed in nucleus; ribosomal proteins translated in cytosol, re-imported; subunits assembled in nucleolus → exported.

RNA Interference (RNAi) Toolbox

• Ancient RNA-dependent gene silencing; guards against viruses/transposons & regulates host genes.
• Common denominator: small (20–30 nt) RNA guides + Argonaute-family endonucleases.
• Eukaryotic classes:
▸ siRNA (small interfering).
▸ miRNA (micro).
▸ piRNA (Piwi-interacting).
• Prokaryotic analogue: CRISPR crRNA.

siRNA

• Encoded by distinct loci; nearly perfect complementarity to target → AGO slicer degrades that mRNA (often the same locus).

miRNA

• Encoded within introns/exons of other genes; imprecise pairing → translational repression or deadenylation of unrelated mRNAs.
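
The siRNA/miRNA contrast comes down to how completely the guide pairs with its target. A minimal sketch of that idea follows, assuming strict Watson-Crick pairing only (G-U wobble is ignored) and two arbitrary thresholds chosen purely for illustration; none of the names or cutoffs come from the notes.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the perfectly complementary, antiparallel RNA strand."""
    return "".join(PAIR[base] for base in reversed(rna))


def paired_fraction(guide, target_site):
    """Fraction of guide positions Watson-Crick paired to the target site.

    Both strands are written 5'->3', so the target is read in reverse.
    """
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must be the same length")
    matches = sum(PAIR[g] == t for g, t in zip(guide, reversed(target_site)))
    return matches / len(guide)


def predicted_outcome(guide, target_site, slice_cutoff=0.95, min_cutoff=0.5):
    """Classify a guide/target pair; both cutoffs are hypothetical."""
    fraction = paired_fraction(guide, target_site)
    if fraction >= slice_cutoff:
        return "cleavage"      # siRNA-like: near-perfect pairing, AGO slices
    if fraction >= min_cutoff:
        return "repression"    # miRNA-like: imprecise pairing
    return "no targeting"


if __name__ == "__main__":
    guide = "UGAGGUAGUAGGUUGUAUAGUU"
    print(predicted_outcome(guide, reverse_complement(guide)))
```

Real target recognition depends on which positions pair, not just how many, so the single fraction here is a deliberate simplification.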

piRNA

• Germline only; long ss precursors processed by proteins (not Dicer) → 20–30 nt piRNAs bind Piwi proteins.
• Silences transposons at three layers: mRNA destruction, translation block, chromatin changes.

CRISPR crRNA (Bacteria/Archaea)

• Infection → short invader DNA fragments inserted into CRISPR array (as spacers between palindromic repeats).
• Transcription → pre-crRNA → CAS protein cleavage → mature crRNA bearing spacer.
• crRNA–CAS complex scans & cuts matching foreign DNA upon re-infection (adaptive immunity).
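
The acquire / transcribe / cleave / scan cycle above can be sketched as a small class. This is a toy under loud assumptions: the class and method names are hypothetical, everything stays in the DNA alphabet (no T→U conversion), mature crRNAs are reduced to the bare spacer, and matching is exact substring search on one strand.

```python
class CrisprArray:
    """Toy CRISPR locus: identical repeats separated by captured spacers."""

    def __init__(self, repeat):
        self.repeat = repeat
        self.spacers = []

    def acquire(self, invader_dna, start, length=20):
        """Copy a short stretch of invader DNA into the array as a spacer."""
        spacer = invader_dna[start:start + length]
        self.spacers.append(spacer)
        return spacer

    def pre_crrna(self):
        """Transcribe the array as repeat-spacer-repeat-...-repeat."""
        return self.repeat + "".join(s + self.repeat for s in self.spacers)

    def mature_crrnas(self):
        """Cut the precursor at every repeat, leaving one spacer per crRNA."""
        pieces = self.pre_crrna().split(self.repeat)
        return [piece for piece in pieces if piece]

    def recognizes(self, dna):
        """True if any crRNA spacer matches the incoming DNA exactly."""
        return any(crrna in dna for crrna in self.mature_crrnas())


if __name__ == "__main__":
    array = CrisprArray(repeat="GTTTTAGAGCTA")        # made-up repeat
    phage = "ATGCGTACCGTTAGCATCGGATCCATGCAAGTCGATCGTTAGC"  # made-up invader
    array.acquire(phage, start=5)
    print(array.recognizes(phage))      # re-infection by the same phage
    print(array.recognizes("A" * 40))   # unrelated DNA
```

The array acts as a memory: recognition succeeds only for sequences sampled during an earlier infection.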

Long Non-coding RNAs (lncRNAs)

• Length ≥ 200 nt by the usual convention; thousands known; may represent ≈ 80% of human transcription while only ≈ 1% of the genome codes for protein.
• Lack canonical ORFs; mechanistic themes: scaffold activators/repressors, chromatin remodelers, dosage compensation.

Messenger RNA (mRNA) Anatomy

• Regions:
▸ 5′ UTR (features Shine–Dalgarno in bacteria).
▸ Coding region (triplet codons).
▸ 3′ UTR (regulatory elements).
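
The three regions can be located programmatically. The sketch below assumes the coding region starts at the first AUG and ends at the first in-frame stop codon (counted as part of the coding region); that rule is a simplification, and the function name and example sequence are made up for illustration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_mrna(mrna):
    """Split an mRNA into (5' UTR, coding region, 3' UTR).

    Assumes translation starts at the first AUG and stops at the first
    in-frame stop codon, which is kept as part of the coding region.
    """
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no start codon found")

    # Walk codon by codon (triplets) from the start codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3
            return mrna[:start], mrna[start:end], mrna[end:]
    raise ValueError("no in-frame stop codon found")


if __name__ == "__main__":
    utr5, cds, utr3 = split_mrna("GGAGGCAUGGCUUGGUAAUUUAAA")
    print(utr5, cds, utr3)
```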

Eukaryotic pre-mRNA Processing Pipeline

  1. 5′ Capping.

  2. Splicing (introns removed, exons ligated).

  3. 3′ Polyadenylation.
    • All three steps occur co-transcriptionally in the nucleus; mature mRNA is exported & translated in the cytosol.

Transcription–Processing Coupling

• Phosphorylation of RNA POL II C-terminal domain (CTD) shifts it to elongation mode & recruits capping, splicing, poly-A factors which transfer onto nascent RNA.

5′ Capping Details

• Enzyme triad (phosphatase, guanyl-transferase, methyl-transferase) riding POL II CTD.
• Steps:

  1. Remove the γ-phosphate from the nascent 5′ triphosphate.

  2. Add GMP via unusual 5′→5′ triphosphate bridge.

  3. N7-methylation of cap guanosine (m⁷G).
    • Outcomes: marks authentic 5′ end, protects from exonucleases, binds export & translation factors.

Splicing Machinery

• Spliceosome = 5 snRNPs (U1, U2, U4, U5, U6) + ≥50 proteins; RNA catalytic centers.
• Reaction is two consecutive trans-esterifications creating a “lariat” intron:
▸ 2′-OH of branch point A attacks 5′ splice site → exon 1 released; intron 5′ end joined to branch A (lariat).
▸ Freed 3′ OH of exon 1 attacks 3′ splice site → exons ligated; lariat released & degraded.
• Accuracy enhanced by multiple RNA–RNA and protein–RNA recognition checkpoints (U1/5′SS, BBP/U2AF/branch point, etc.).
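
The two trans-esterification steps can be traced on a string. This sketch takes the splice-site and branch-point positions as inputs rather than predicting them, and it checks the canonical GU...AG intron boundaries, which is an assumption added here. The function name, coordinates, and sequence are hypothetical.

```python
def splice(pre_mrna, five_ss, branch, three_ss):
    """Toy two-step splicing of a single intron.

    five_ss  -- index of the first intron nucleotide
    branch   -- index of the branch-point A inside the intron
    three_ss -- index of the first nucleotide of exon 2
    Returns (mature_mrna, excised_intron).
    """
    intron = pre_mrna[five_ss:three_ss]
    if not (intron.startswith("GU") and intron.endswith("AG")):
        raise ValueError("intron does not follow the GU...AG convention")
    if not (five_ss < branch < three_ss) or pre_mrna[branch] != "A":
        raise ValueError("branch point must be an A inside the intron")

    # Step 1: branch A attacks the 5' splice site. Exon 1 is released with
    # a free 3' OH; the intron's 5' end is now looped onto the branch A.
    exon1 = pre_mrna[:five_ss]

    # Step 2: the 3' OH of exon 1 attacks the 3' splice site. The exons
    # are ligated and the lariat intron is released.
    exon2 = pre_mrna[three_ss:]
    return exon1 + exon2, intron


if __name__ == "__main__":
    pre = "AUGGCG" + "GUAAGUCUAACUUUCAG" + "GAUUAA"
    mature, lariat = splice(pre, five_ss=6, branch=15, three_ss=23)
    print(mature)
    print(lariat)
```

The lariat is returned as a linear string for simplicity; its loop would span positions `five_ss` through `branch`.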

Alternative Splicing

• Humans: typical gene contains 10–15 exons; yields 3–6 isoforms; up to 90% of transcripts are alternatively spliced.
• Same pre-mRNA interpreted differently in tissues (e.g. α-tropomyosin: striated muscle vs smooth muscle vs fibroblast vs brain).
• May generate non-functional proteins – potential evolutionary sandbox.
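
To see why a handful of exons is enough for many isoforms, the sketch below enumerates every combination in which internal exons are either kept or skipped. It assumes the first and last exons are always included and that exon order is preserved; this is a combinatorial upper bound only, far above the 3–6 isoforms typically observed, because real splice choices are regulated. The function name is hypothetical.

```python
from itertools import combinations


def cassette_isoforms(exons):
    """List every isoform obtained by skipping any subset of internal exons.

    Assumes at least two exons, with the first and last always retained.
    """
    first, *internal, last = exons
    isoforms = []
    for kept_count in range(len(internal) + 1):
        # combinations() preserves the original 5'->3' exon order.
        for kept in combinations(internal, kept_count):
            isoforms.append([first, *kept, last])
    return isoforms


if __name__ == "__main__":
    for isoform in cassette_isoforms(["E1", "E2", "E3", "E4"]):
        print("-".join(isoform))
```

With n exons this yields 2^(n-2) candidates, so even a 10-exon gene has 256 theoretical exon-skipping isoforms.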

Tandem Chimerism & Trans-Splicing

• Exons from different primary transcripts can be spliced together.
• Most occur between adjacent genes on same chromosome; rarer inter-chromosomal combinations.
• Joining exons from separate transcripts is termed trans-splicing; this includes sense–antisense (opposite-strand) combinations.

Self-Splicing Introns

• Group I (bacteria, protists) – ribozyme excises itself.
• Group II (organelle RNAs, some bacteria) – ribozyme + protein complex; thought to be ancestor of modern spliceosome.

Splicing Plasticity

• Splice decisions respond to development, environment, stress; confer proteomic diversity without extra genes.

RNA Editing

• Post-transcriptional nucleotide changes (additions, deletions, substitutions) in mRNA, tRNA, rRNA.
• Guided by gRNAs that base-pair imperfectly → endonucleolytic cuts, gap-filling by polymerases using gRNA as template, ligation.
• Alters protein coding (e.g. DNA codes Ala-Leu-Tyr-Ala-Cys-Cys-Arg; splicing might yield Ala-Cys-Arg; editing may change to Ala-Gly-His).

3′ Polyadenylation

• DNA encodes the poly-A signal (AATAAA in the DNA, read as AAUAAA in the transcript).
• POL II CTD-associated factors CstF & CPSF bind emerging signal, recruit cleavage proteins.
• Endonuclease cleaves downstream of the signal; poly-A polymerase (PAP) adds ≈ 200 AMP residues.
• Poly-A binding proteins coat tail, define final length, aid export & translation.
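
The signal → cleavage → tail sequence of events can be sketched as follows. The cleavage offset of 20 nt is an illustrative assumption (the notes say only "downstream"), the tail length follows the ≈ 200 figure above, and the function and parameter names are hypothetical.

```python
def polyadenylate(transcript, signal="AAUAAA", cleavage_offset=20,
                  tail_length=200):
    """Toy 3' end formation: find the signal, cleave downstream, add poly-A.

    cleavage_offset is the assumed number of nucleotides between the end
    of the signal and the cut site.
    """
    position = transcript.find(signal)
    if position == -1:
        raise ValueError("no poly-A signal in transcript")

    cut = position + len(signal) + cleavage_offset
    if cut > len(transcript):
        raise ValueError("transcript ends before the cleavage site")

    # Everything past the cut is discarded; PAP then adds A residues
    # without using any template.
    return transcript[:cut] + "A" * tail_length


if __name__ == "__main__":
    nascent = "GCGC" + "AAUAAA" + "C" * 20 + "GUGUGU"
    mature = polyadenylate(nascent)
    print(len(nascent), len(mature))
```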

Nuclear Export Checkpoint

• Nuclear pore complex inspects mRNP: presence of required proteins (cap-binding complex, SR proteins, poly-A BP) & absence of intron-associated factors (snRNPs, CstF).
• Only fully processed mRNAs transit to cytoplasm for translation; faulty RNAs are retained & degraded.
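
The export decision reads like a presence/absence check, which can be written with sets. The factor names are taken from the bullet above; treating the checkpoint as a strict all-or-nothing test, and the function name itself, are simplifying assumptions.

```python
REQUIRED = {"cap-binding complex", "SR proteins", "poly-A binding protein"}
FORBIDDEN = {"snRNPs", "CstF"}


def export_competent(bound_factors):
    """True if an mRNP carries every required mark and no forbidden one."""
    bound = set(bound_factors)
    has_all_required = REQUIRED <= bound
    has_no_forbidden = not (FORBIDDEN & bound)
    return has_all_required and has_no_forbidden


if __name__ == "__main__":
    processed = ["cap-binding complex", "SR proteins", "poly-A binding protein"]
    unspliced = processed + ["snRNPs"]
    print(export_competent(processed))
    print(export_competent(unspliced))
```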