Life and Modifications of Messenger RNA: A Comprehensive Study Guide
- Post-transcriptional modifications are conceptualised as occurring after transcription but largely occur cotranscriptionally in reality.
- RNA Polymerase II (Pol II) contains a Carboxy-Terminal Domain (CTD) with various phosphorylation states throughout elongation.
- Proteins responsible for processing the RNA are associated with the CTD, allowing processing to occur in almost real-time as the RNA emerges from the polymerase.
- By the time Pol II finishes transcription and disassembles, the RNA is typically fully processed.
- The term "post-transcriptional" is maintained in literature for conceptual clarity and due to historical convention.
The Molecular Definition of a Gene
- A gene is a genomic region (DNA or RNA) that codes for a functional product.
- Historical perspective: In the mid-20th century, the concept was "one gene, one enzyme/protein."
- Contemporary understanding:
- A single gene can encode multiple proteins through alternative processes.
- Many genes encode non-coding RNAs (ncRNAs) with specific functions.
- Functionality is the key criterion; if a product's function cannot be evaluated, categorising it as a gene is difficult.
- Genes are often interrupted (discontinuous) and can overlap on the genome.
- In a specific DNA region, multiple genes can exist: some coding for proteins, others for ncRNAs, overlapping or nested within one another.
RNA Structural Dynamics
- RNA is not a simple linear string; it behaves like headphone cables or Christmas tree lights.
- It tends to fold and coil spontaneously upon itself.
- Structure influences protein interaction; regions emerging early from the polymerase are more accessible before high-order folding occurs.
The 5' Capping Process
- Capping involves placing a "cap" or "lid" on the 5′ end of the messenger RNA (mRNA) to protect it and facilitate downstream processes.
- Mechanism: Three enzymatic reactions usually performed by proteins associated with the CTD of Pol II.
- 1. Triphosphatase: Removes the terminal (gamma) phosphate from the triphosphate group at the 5′ end of the nascent RNA, leaving a diphosphate.
- 2. Guanylyltransferase: Transfers a GMP residue from GTP onto that end in an inverted orientation, forming a 5′-5′ triphosphate linkage.
- 3. Methyltransferase: Adds a methyl group to the Guanine at the Nitrogen 7 (N7) position, giving m7G.
- Biological Utility of the Cap:
- Prevents degradation by exonucleases (enzymes that degrade free ends of nucleic acids) via steric hindrance.
- Protects the Guanosine from glycosylases by methylating Nitrogen 7 (N7).
- Variations:
- Cap 0: Standard cap found in most organisms (7-methylguanosine only).
- Cap 1, 2, 3, 4: Additional methylations (typically 2′-O-methylation of the ribose) on the first several nucleotides of the transcript, common in specific organisms.
- Enzyme Organisation: In metazoans and plants, the guanylyltransferase and triphosphatase activities usually reside on one polypeptide, while the methyltransferase is a separate protein.
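The three capping reactions can be traced as a sequence of states at the 5′ end. The following is a minimal sketch in Python, not a biochemical model: the function name `cap_transcript` and the string notation (`p` for a phosphate, `m7G` for the methylated cap guanosine) are labelling conventions assumed for this illustration.

```python
def cap_transcript(body: str) -> list[str]:
    """Return the 5' end states of a nascent RNA through the three capping steps.

    `body` is the RNA sequence that follows the 5' phosphates, e.g. "AUG".
    The notation is a convention chosen for this sketch only.
    """
    # A nascent Pol II transcript starts with a 5' triphosphate.
    nascent = "ppp" + body
    # Step 1 (triphosphatase): remove the terminal phosphate, leaving a diphosphate.
    after_triphosphatase = nascent[1:]
    # Step 2 (guanylyltransferase): add GMP, giving G(5')ppp(5')N.
    after_guanylyltransferase = "Gp" + after_triphosphatase
    # Step 3 (methyltransferase): methylate the cap guanine at N7.
    after_methyltransferase = "m7" + after_guanylyltransferase
    return [
        nascent,
        after_triphosphatase,
        after_guanylyltransferase,
        after_methyltransferase,
    ]


if __name__ == "__main__":
    for state in cap_transcript("AUG"):
        print(state)
```

The final state, `m7GpppAUG`, corresponds to Cap 0; the additional methylations of Cap 1 and beyond are not modelled here.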
The Cap Binding Complex (CBC)
- The CBC consists of two primary proteins: CBP80 (80 kDa) and CBP20 (20 kDa).
- The CBC binds to the 7-methylguanosine structure.
- It serves as a scaffold for factors involved in:
- Splicing.
- Transcription.
- Biogenesis of non-coding RNAs.
- Nuclear-cytoplasmic transport.
- First rounds of translation.
- Nonsense-Mediated Decay (NMD).
Splicing: The Removal of Introns
- Splicing is the process of removing non-coding regions (introns) and joining coding regions (exons).
- Analogy: Film splicing, where a machine cuts out a segment of film and joins the remaining ends with tape.
- Prevalence: Approximately 90% of eukaryotic genes undergo splicing.
- Gene Statistics:
- Average gene length: 5000 base pairs (bp), with extremes from 1000 to 10,000+ bp.
- Exon length: Typically 500 bp (ranging from 25 to 2000+ bp).
- Intron length: Typically 2000 bp, but can extend to 10,000+ bp.
- The majority of the transcribed genome consists of non-coding introns.
The Chemical and Structural Mechanism of Splicing
- Recognition Sites:
- 5′ Donor Site: Usually begins with the dinucleotide GU.
- 3′ Acceptor Site: Usually ends with the dinucleotide AG, preceded by a polypyrimidine tract.
- Branch Point: An Adenine (A) residue located approximately 20 to 50 nucleotides upstream of the 3′ site.
- Chemical Reaction:
- Splicing involves two transesterification reactions (ΔG≈0).
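- First reaction: The 2′-OH of the branch point Adenine attacks the phosphate at the 5′ splice site, breaking the 3′-5′ phosphodiester bond between the upstream exon and the intron and creating a 2′-5′ bond between the branch point and the intron's first nucleotide.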
- This forms a branched structure known as a "lariat" (cowboy lasso analogy).
- Second reaction: The freed 3′-OH of the upstream exon attacks the 3′ splice site (the junction between the intron and the downstream exon), joining the two exons and releasing the intron as a lariat.
- Irreversibility: Although chemically reversible, the process is made one-way by mechanical rearrangements of the machinery that move the reaction ends away from each other after the attack.
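The recognition sites and the two transesterification steps can be illustrated on a toy sequence. This is a hedged sketch, not a splice-site predictor: the function name `splice`, its index conventions, and the example sequence are assumptions made for this illustration, and the branch point in the toy intron sits much closer to the 3′ site than the 20 to 50 nucleotides typical of real introns.

```python
def splice(pre_mrna: str, donor: int, branch: int, acceptor_end: int) -> tuple[str, str]:
    """Toy model of splicing one intron out of an RNA string.

    donor        -- index of the first intron nucleotide (the G of GU)
    branch       -- index of the branch point adenine
    acceptor_end -- index one past the last intron nucleotide (the G of AG)
    Returns (spliced mRNA, excised intron).
    """
    intron = pre_mrna[donor:acceptor_end]
    if not intron.startswith("GU"):
        raise ValueError("5' donor site must begin with GU")
    if not intron.endswith("AG"):
        raise ValueError("3' acceptor site must end with AG")
    if not (donor + 2 <= branch < acceptor_end - 2) or pre_mrna[branch] != "A":
        raise ValueError("branch point must be an A inside the intron")

    # Reaction 1: the branch A attacks the 5' splice site. The upstream exon
    # is released with a free 3'-OH; the intron's 5' end is now bonded to the
    # branch A (the lariat), still attached to the downstream exon.
    upstream_exon = pre_mrna[:donor]

    # Reaction 2: the upstream exon's 3'-OH attacks the 3' splice site,
    # joining the exons and releasing the intron lariat.
    downstream_exon = pre_mrna[acceptor_end:]
    return upstream_exon + downstream_exon, intron


if __name__ == "__main__":
    exon1, intron, exon2 = "AAAGG", "GUAAGUCUAACUUUUUUCAG", "GCCCC"
    pre = exon1 + intron + exon2
    branch = len(exon1) + intron.index("CUAAC") + 3  # second A of CUAAC
    print(splice(pre, len(exon1), branch, len(exon1) + len(intron)))
```

The string model cannot represent the branched 2′-5′ bond itself, so the lariat is returned as a plain linear intron sequence.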
The Spliceosome Machinery
- Composed of Small Nuclear Ribonucleoproteins (snRNPs, pronounced "snurps").
- BASAL MACHINERY:
- U1: Recognizes the 5′ donor site via RNA-RNA base pairing.
- U2: Recognizes the branch point site.
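- U2AF (U2 Auxiliary Factor): Binds the polypyrimidine tract and the AG of the 3′ acceptor site, and helps recruit U2 to the branch point.
- U4, U5, U6: Form a tri-snRNP complex that joins the assembly to catalyse the reaction.
- Note: U3 is not involved in mRNA splicing despite its name; it is a small nucleolar RNA that takes part in ribosomal RNA processing.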
- snRNP Biogenesis: Specific RNAs are transcribed in the nucleus, exported to the cytoplasm for assembly with proteins (aided by chaperones), then imported back to the nucleus to Cajal bodies and finally to nuclear speckles.
Splicing Variations and Specialized Mechanisms
- Major vs. Minor Spliceosome:
- Major (U2-type): The standard machinery (U1, U2, U4, U5, U6) recognizing GU-AG sites.
- Minor (U12-type): Recognizes different consensus sequences (AT-AC in the DNA, AU-AC in the RNA) using different snRNPs (U11 in place of U1, U12 in place of U2, U4atac and U6atac in place of U4/U6). U5 is shared.
- The minor spliceosome may regulate processing rates for complex structural folding.
- Trans-splicing:
- Occurs in organisms such as trypanosomatids (the agents of Chagas disease and sleeping sickness).
- These organisms transcribe polycistronic units (like operons) but require monocistronic mRNAs for translation.
- Specialized machinery joins a "Spliced Leader" (mini-exon) to the 5′ end of each coding region in the polycistronic transcript.
- Forms a "Y" shaped intermediate instead of a lariat.
- Backsplicing and Circular RNAs:
- Occurs when a downstream 5′ splice site joins an upstream 3′ splice site.
- Results in a circular RNA molecule.
- Historically viewed as "noise," now known to have regulatory functions.
Alternative Splicing Regulation
- Alternative Splicing: Generating multiple mature mRNAs from a single primary transcript.
- Factors:
- SR Proteins: Rich in Serine (S) and Arginine (R). Usually bind to Exonic Splicing Enhancers (ESEs) to recruit basal machinery to weak sites.
- hnRNPs: Heterogeneous Nuclear Ribonucleoproteins. Often act as silencers, blocking the machinery via steric hindrance.
- Sequence Consensus:
- Splice sites are "consensus sequences," not rigid codes.
- Strong sites match the consensus well and bind machinery with high affinity.
- Weak sites differ from consensus and require auxiliary SR proteins for recognition.
- Kinetic/Processivity Model ("First-come, first-served"):
- RNA Pol II speed and chromatin state affect splicing.
- If Pol II moves fast, a strong downstream site may be transcribed before a weak upstream site is used, leading to exon skipping.
- If Pol II pauses (due to condensed chromatin), the machinery has time to assemble on a weak upstream site before the downstream competitor appears.
- Secondary Structure Model: RNA folding can hide or expose SR protein binding sites or splice sites.
- Biological Impact: The Dscam gene in Drosophila can potentially produce more than 38,000 distinct protein isoforms through combinatorial alternative splicing.
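The kinetic ("first-come, first-served") model described above reduces to a comparison of two times: how long the weak upstream site is alone on the nascent RNA, and how long the machinery needs to commit to it. The sketch below makes that comparison explicit. The function name `weak_site_used`, its parameters, and all numeric values are illustrative assumptions, not measured rates.

```python
def weak_site_used(
    distance_nt: float,
    pol2_speed_nt_per_s: float,
    assembly_time_s: float,
) -> bool:
    """Return True if a weak upstream site is used (exon included).

    distance_nt         -- nucleotides between the weak upstream site and
                           the strong downstream competitor
    pol2_speed_nt_per_s -- Pol II elongation rate over that stretch
    assembly_time_s     -- time the machinery needs to commit to the weak site
    """
    # Window of opportunity: time before the strong competitor is transcribed.
    window_s = distance_nt / pol2_speed_nt_per_s
    return window_s >= assembly_time_s


if __name__ == "__main__":
    # Hypothetical numbers: same gene, same assembly time, two Pol II speeds.
    for speed in (50.0, 20.0):
        used = weak_site_used(2000, speed, assembly_time_s=60)
        print(f"{speed} nt/s ->", "exon included" if used else "exon skipped")
```

With these assumed values, the fast polymerase leaves a 40 s window (exon skipped) and the slow one a 100 s window (exon included), which is the qualitative behaviour the model predicts.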
Cleavage and Polyadenylation
- Cleavage is distinct from the termination of transcription.
- The machinery identifies a consensus sequence (often AAUAAA) and a downstream GU-rich region.
- Proteins involved:
- CPSF (Cleavage and Polyadenylation Specificity Factor): Binds the consensus.
- CstF (Cleavage Stimulation Factor): Binds the GU-rich region.
- Cleavage Factors: Endonucleases that cut the RNA between the two sites.
- Poly-A Polymerase (PAP): Adds a tail of approximately 300 Adenine residues to the new 3′ end.
- Function of the Poly-A Tail:
- Protection from 3′ exonucleases.
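- Circularization: Poly-A Binding Protein (PABP) interacts with the initiation factor eIF4G (the scaffold subunit of eIF4F), which is tethered to the cap through eIF4E, or through the Cap Binding Complex (CBC) during the first rounds of translation.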
- This loops the mRNA, bringing the terminating ribosome close to the start site for efficient re-initiation.
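The cleavage and tailing steps can be sketched as a string operation. This is a simplified illustration under stated assumptions: the function name `cleave_and_polyadenylate` is hypothetical, the GU-rich region is approximated as any run of five or more G/U nucleotides, and the cut is placed midway between the two elements, which is an arbitrary choice rather than the real cleavage rule.

```python
import re
from typing import Optional


def cleave_and_polyadenylate(transcript: str, tail_length: int = 300) -> Optional[str]:
    """Cut a transcript between AAUAAA and a downstream GU-rich run, then add a tail.

    Returns None if either element is missing.
    """
    signal = transcript.find("AAUAAA")  # element bound by CPSF
    if signal == -1:
        return None
    signal_end = signal + len("AAUAAA")

    # Assumed stand-in for the GU-rich region bound by CstF.
    gu_rich = re.search(r"[GU]{5,}", transcript[signal_end:])
    if gu_rich is None:
        return None

    # Arbitrary rule for this sketch: cut midway between the two elements.
    cut = signal_end + gu_rich.start() // 2
    # Poly-A polymerase adds adenines to the new 3' end.
    return transcript[:cut] + "A" * tail_length


if __name__ == "__main__":
    rna = "CCC" + "AAUAAA" + "CACACACACA" + "GUGUGU" + "CCCC"
    print(cleave_and_polyadenylate(rna, tail_length=5))
```

Everything downstream of the cut is discarded by this function; in the cell that fragment is the substrate for termination.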
Transcription Termination: The Torpedo Model
- After the mRNA is cleaved for polyadenylation, the polymerase continues transcribing a waste RNA fragment with a free 5′ end.
- The lack of a cap on this fragment allows the exonuclease XRN2 to bind.
- XRN2 degrades the RNA faster than the polymerase moves, eventually reaching the polymerase and dislodging it from the DNA (the "Torpedo" effect).
- Non-polyadenylated RNAs (like snRNAs) use the Integrator complex to mediate cleavage and termination.
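The torpedo model is essentially a pursuit problem: XRN2 starts behind Pol II and closes the gap at the difference between the two speeds. The sketch below works through that arithmetic. The function name `torpedo_catch_up` and every number in the example are assumptions chosen for illustration, not measured rates.

```python
from typing import Optional


def torpedo_catch_up(
    head_start_nt: float,
    pol2_speed_nt_per_s: float,
    xrn2_speed_nt_per_s: float,
) -> Optional[tuple[float, float]]:
    """Return (time to catch Pol II in s, distance past the cleavage site in nt).

    head_start_nt -- how far Pol II is beyond the cleavage site when XRN2
                     starts degrading the uncapped fragment
    Returns None if XRN2 is not faster than Pol II and so never catches it.
    """
    closing_speed = xrn2_speed_nt_per_s - pol2_speed_nt_per_s
    if closing_speed <= 0:
        return None
    time_s = head_start_nt / closing_speed
    # Pol II keeps transcribing while it is being chased.
    termination_nt = head_start_nt + pol2_speed_nt_per_s * time_s
    return time_s, termination_nt


if __name__ == "__main__":
    # Hypothetical values: 100 nt head start, Pol II at 20 nt/s, XRN2 at 30 nt/s.
    print(torpedo_catch_up(100, 20, 30))
```

With these assumed values XRN2 reaches the polymerase after 10 s, 300 nt beyond the cleavage site. A slower polymerase (for example, one that pauses) is caught sooner and closer to the cleavage site.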
Nuclear Export and the Exon Junction Complex (EJC)
- The Exon Junction Complex (EJC) is a protein "mark" left by the spliceosome approximately 20 to 24 nucleotides upstream of every exon-exon junction.
- The TREX complex (Transcription-Export complex) recognizes both the Cap Binding Complex (CBC) and the first EJC.
- This ensures the RNA is exported from the nucleus 5′ end first.
- RNAs without a cap or EJCs (like some ncRNAs) remain in the nucleus.
- Pioneer Round of Translation: The first time a ribosome translates an mRNA, it acts like someone "shelling corn", stripping off EJCs and other nuclear proteins as it advances.
- If the ribosome encounters a STOP codon but an EJC remains downstream (further toward the 3′ end), the cell identifies this as a Premature Termination Codon (PTC) and targets the mRNA for Nonsense-Mediated Decay (NMD).
- Mechanism of NMD:
- The ribosome stalled at the PTC forms the SURF complex (SMG1, UPF1, eRF1, eRF3).
- If the SURF complex contacts a downstream EJC, SMG1 phosphorylates UPF1, which recruits further SMG proteins (SMG5, SMG6, SMG7).
- These SMG proteins drive degradation: they act as, or recruit, decapping enzymes, deadenylases, and endonucleases that destroy the aberrant mRNA.
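The rule "a STOP codon with an EJC still downstream marks a PTC" can be written as a short check. This is a minimal sketch of the rule as stated in these notes; the function names, the fixed 22-nucleotide EJC offset, and the example sequences are assumptions for illustration. Real NMD also depends on how far downstream the EJC lies, which is not modelled here.

```python
from typing import Optional

STOP_CODONS = {"UAA", "UAG", "UGA"}


def ejc_positions(exon_lengths: list[int], offset: int = 22) -> list[int]:
    """Positions of EJCs in the spliced mRNA, one upstream of each junction."""
    positions = []
    junction = 0
    for length in exon_lengths[:-1]:  # the last exon has no downstream junction
        junction += length
        positions.append(junction - offset)
    return positions


def first_stop(mrna: str, start: int = 0) -> Optional[int]:
    """Index of the first in-frame stop codon at or after `start`, or None."""
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return i
    return None


def is_nmd_target(mrna: str, exon_lengths: list[int], start: int = 0) -> bool:
    """True if an EJC remains downstream of the first in-frame stop codon."""
    stop = first_stop(mrna, start)
    if stop is None:
        return False
    return any(position > stop for position in ejc_positions(exon_lengths))


if __name__ == "__main__":
    exons = [30, 30, 30]  # junctions at 30 and 60, so EJCs at 8 and 38
    normal = "AUG" + "GCU" * 27 + "UAA" + "GCU"       # stop in the last exon
    mutant = "AUG" + "GCU" * 4 + "UAA" + "GCU" * 24   # stop in the first exon
    print(is_nmd_target(normal, exons), is_nmd_target(mutant, exons))
```

In the normal transcript the ribosome has already passed, and removed, both EJCs before it reaches the stop codon; in the mutant one EJC is still in place downstream of the stop.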
Competition and Regulation by ncRNAs (ceRNA Hypothesis)
- microRNAs (miRNAs): Small RNAs that bind to the 3′-UTR of mRNAs via the RISC complex to inhibit translation or trigger cleavage.
- Competing Endogenous RNA (ceRNA) / "Musical Chairs" Analogy:
- Various RNAs (Long non-coding RNAs, Circular RNAs, or other mRNAs) compete for a finite pool of specific microRNAs.
- If a highly expressed lncRNA acts as a "sponge" and sequesters all the miRNAs, the target mRNA is freed from regulation and its protein levels increase.
- Circular RNAs are particularly effective sponges due to their stability (lack of ends prevents exonuclease decay).
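The "musical chairs" competition can be put into numbers with a deliberately crude model. The sketch below assumes that every binding site has the same affinity and that miRNAs spread evenly over all available sites; the function name `target_bound_fraction` and the copy numbers are hypothetical and chosen only to show the direction of the effect.

```python
def target_bound_fraction(
    mirna_copies: float,
    target_sites: float,
    sponge_sites: float,
) -> float:
    """Fraction of target-mRNA sites occupied by miRNA in an equal-affinity toy model.

    miRNAs are shared evenly among all sites (target plus sponge), and a
    site cannot be occupied more than once.
    """
    total_sites = target_sites + sponge_sites
    if total_sites == 0:
        return 0.0
    return min(1.0, mirna_copies / total_sites)


if __name__ == "__main__":
    # Hypothetical pool of 100 miRNA copies and 100 target sites.
    for sponge in (0, 100, 900):
        fraction = target_bound_fraction(100, 100, sponge)
        print(f"sponge sites = {sponge}: {fraction:.0%} of target sites bound")
```

With these assumed values, adding 900 sponge sites lowers target occupancy from 100% to 10%, so the target mRNA is largely released from repression. Real systems depend on binding affinities and on miRNA and transcript turnover, which this model ignores.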