RNA Splicing - Key Concepts (Lecture 2)
The structure of eukaryotic genes
Most eukaryotic genes contain alternating coding sequences (exons) and non-coding intervening sequences (introns).
In all cases, the order of exons is maintained in the mature mRNA; introns are removed.
Relationship between introns and exons:
Gene/mRNA size examples (from figures): gene length roughly ; mature transcript length roughly
Collinearity: prokaryotic genes are generally collinear with their protein products (a continuous DNA-to-protein map). Most eukaryotic genes are not, because introns interrupt the coding sequence.
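The exon/intron relationship above can be sketched in a few lines of Python. This is only an illustration: the function name `splice`, the toy sequence, and the exon coordinates are made up for the example and do not come from the lecture figures.

```python
def splice(pre_mrna, exons):
    """Join exon segments in their original 5'-to-3' order.

    exons: list of (start, end) intervals, 0-based and half-open,
    given relative to the pre-mRNA. Everything between them is intron.
    """
    ordered = sorted(exons)  # exon order in the gene is preserved
    for (_, end1), (start2, _) in zip(ordered, ordered[1:]):
        if start2 < end1:
            raise ValueError("exon intervals overlap")
    return "".join(pre_mrna[start:end] for start, end in ordered)


# Hypothetical toy transcript: exon 1, one intron (GU...AG), exon 2.
pre_mrna = "AUGGCC" + "GUAAGUCUAACUUUCAG" + "GAAUAA"
exons = [(0, 6), (23, 29)]

mature = splice(pre_mrna, exons)
print(len(pre_mrna), len(mature))  # pre-mRNA is longer than the mature mRNA
print(mature)
```

The mature transcript is shorter than the primary transcript, and the exons appear in the same order as in the gene.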
Introns
Noncoding sequences present in most eukaryotic genes; also found in archaea and some bacteria.
Occur in mRNA, tRNA, and rRNA transcripts.
Types and sizes vary widely: intron sizes range from <200 nt to >50,000 nt.
Also found in mitochondrial and chloroplast genomes; introns are also called intervening sequences.
Types of introns and splicing mechanisms
Group I introns: self-splicing; found in bacteria, bacteriophages, and some eukaryotes.
Group II introns: self-splicing; found in bacteria, archaea, and eukaryotic organelles.
Nuclear pre-mRNA introns: spliceosomal introns; processed by the nuclear spliceosome in eukaryotes.
tRNA introns: non-spliceosomal introns removed by protein enzymes (an endonuclease cuts out the intron and a ligase joins the exons).
Spliceosome and splicing machinery
Splicing is catalysed by the spliceosome, a large complex made of five small nuclear ribonucleoproteins (snRNPs: U1, U2, U4, U5, U6) plus associated proteins.
Core components: snRNPs with small nuclear RNAs (snRNAs) and proteins.
Key roles of snRNPs:
U1 snRNP recognizes and binds the 5' splice site (canonical sequence at the intron start).
U2 snRNP recognizes and binds the branch point and helps position it for splicing.
U1 and U4 are released during activation.
U1 snRNP/U2 snRNP interaction is coupled to RNA Pol II transcription; U1 attachment occurs soon after intron transcription.
The splicing code and signals
Splicing signals (the splicing code) include:
5' splice site with the GU motif (most conserved at the 5' end of the intron)
3' splice site with the AG motif (most conserved at the 3' end of the intron)
Branch point: a conserved A nucleotide within the intron whose 2'-OH attacks the 5' splice site, forming the lariat
The combination of these signals directs precise intron removal and exon joining.
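As a rough illustration of the core signals listed above, the following Python sketch checks a candidate intron for GU at its 5' end, AG at its 3' end, and an A in a plausible branch point region. The function name, the toy sequences, and the branch point window (18 to 40 nt upstream of the 3' end, a commonly quoted range for mammalian introns) are assumptions for this example, not values from the lecture. Real splice site recognition depends on longer consensus sequences and additional factors.

```python
def has_core_splice_signals(intron, bp_window=(18, 40)):
    """Return True if an RNA intron has GU at its 5' end, AG at its 3' end,
    and at least one A inside the assumed branch point window.

    bp_window is measured in nucleotides upstream of the intron's 3' end
    (an assumption made for this sketch).
    """
    intron = intron.upper()
    if not (intron.startswith("GU") and intron.endswith("AG")):
        return False
    nearest, farthest = bp_window
    start = max(2, len(intron) - farthest)   # never reach back into the GU
    end = max(start, len(intron) - nearest)  # empty window for very short introns
    return "A" in intron[start:end]


# Hypothetical toy introns (55 nt each).
good = "GUAAGU" + "C" * 20 + "UACUAAC" + "U" * 20 + "AG"
no_branch = "GUAAGU" + "C" * 20 + "UCCUCCC" + "U" * 20 + "AG"

print(has_core_splice_signals(good))       # all three signals present
print(has_core_splice_signals(no_branch))  # no A in the branch point window
```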
The splicing process (mechanistic overview)
Initial cut at the 5' splice site; the 5' end of the intron joins the branch point to form a lariat structure.
A second cut at the 3' splice site; exons are joined, intron released as a lariat.
The lariat is debranched and degraded by nuclear enzymes.
The canonical spliceosome cycle shows these steps in order: U1 recruitment, U2 positioning, snRNP rearrangements, and exon ligation.
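The two cuts described above can be traced on a toy transcript. This Python sketch only tracks which pieces exist after each step; it does not model snRNPs or chemistry. The class and function names, the sequence, and the chosen positions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SplicingProducts:
    free_exon1: str           # released by step 1
    lariat_intermediate: str  # intron + exon 2 after step 1
    mrna: str                 # joined exons after step 2
    lariat_intron: str        # released intron after step 2
    loop_length: int          # nt in the lariat loop (5' end through branch A)
    tail_length: int          # nt in the tail downstream of the branch A


def two_step_splice(pre_mrna, five_ss, three_ss, branch):
    """five_ss: index of the first intron nucleotide.
    three_ss: index of the first nucleotide of exon 2.
    branch: index of the branch point A (all 0-based).
    """
    if not (five_ss < branch < three_ss):
        raise ValueError("branch point must lie inside the intron")
    if pre_mrna[branch] != "A":
        raise ValueError("branch point nucleotide must be A")

    # Step 1: cut at the 5' splice site; the intron's 5' end joins the branch A.
    exon1 = pre_mrna[:five_ss]
    intermediate = pre_mrna[five_ss:]

    # Step 2: cut at the 3' splice site; exons are joined, lariat released.
    intron = pre_mrna[five_ss:three_ss]
    exon2 = pre_mrna[three_ss:]

    return SplicingProducts(
        free_exon1=exon1,
        lariat_intermediate=intermediate,
        mrna=exon1 + exon2,
        lariat_intron=intron,
        loop_length=branch - five_ss + 1,
        tail_length=three_ss - branch - 1,
    )


# Hypothetical toy transcript: exon 1 + intron + exon 2.
toy = "AUGGCC" + "GUAAGUCUAACUUUCAG" + "GAAUAA"
products = two_step_splice(toy, five_ss=6, three_ss=23, branch=15)
print(products.mrna)
print(products.lariat_intron, products.loop_length, products.tail_length)
```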
The timing of RNA splicing
Splicing can be co-transcriptional or post-transcriptional:
Co-transcriptional splicing occurs while transcription is ongoing.
Post-transcriptional splicing occurs after transcription completes.
Other processing events occur during the same transcription window: 5' capping soon after transcription begins, and 3' cleavage and polyadenylation as it ends.
Overall processing flow: 5' capping → splicing → 3' cleavage and polyadenylation → nuclear export → translation; degradation pathways also exist.
Splicing signals and maturation code (summary)
Splicing signals include:
5' splice site (GU) and 3' splice site (AG)
Branch point (A)
The splicing code is defined by these core motifs; mutations can disrupt splicing and cause disease.
Mutations affecting normal splicing (consequences)
Mutations can lead to:
Exon skipping (loss of an exon)
Intron retention (intron remains in mature mRNA)
Pseudoexon activation (cryptic exons become included)
Partial exon skipping/retention leading to frameshifts or in-frame changes
Example: dystrophin gene splicing defects linked to Duchenne Muscular Dystrophy.
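Whether a skipped exon causes a frameshift or an in-frame change depends on whether its length is a multiple of three. A minimal Python sketch of that arithmetic follows, assuming every exon in the list is entirely coding. The exon lengths are hypothetical and are not real dystrophin exon sizes.

```python
def exon_skip_effect(exon_lengths, skipped_index):
    """Classify the reading-frame effect of skipping one exon.

    Assumes each exon is fully coding, so only its length modulo 3 matters.
    """
    length = exon_lengths[skipped_index]
    if length % 3 == 0:
        return "in-frame deletion"  # downstream codons are unchanged
    return "frameshift"             # downstream codons are read out of frame


# Hypothetical exon lengths in nucleotides.
exon_lengths = [90, 62, 93]

for i, n in enumerate(exon_lengths):
    print(f"skip exon {i + 1} ({n} nt): {exon_skip_effect(exon_lengths, i)}")
```

Skipping the 62 nt exon shifts the frame for everything downstream, while skipping the 90 nt or 93 nt exon removes whole codons and leaves the downstream frame intact.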
Additional notes (context)
The spliceosome is a dynamic assembly that coordinates with RNA polymerase II during transcription.
Splicing is essential to maintain reading frame and proper protein sequence; errors can have severe biological consequences.
The major intron types relevant to this module are the spliceosomal introns in nuclear pre-mRNA and the self-splicing introns (Groups I and II).
Quick references to key terms
Exon: coding sequence retained in mature mRNA
Intron: noncoding intervening sequence removed by splicing
Spliceosome: snRNP-based complex (U1, U2, U4, U5, U6) that catalyzes splicing
Lariat: looped intron structure formed during splicing
5' cap: modified nucleotide at the 5' end (m7G)
Poly(A) tail: run of adenosines added to the 3' end by polyadenylation (AAA… tail)
Collinearity: relationship between gene layout and protein sequence; less direct in eukaryotes than prokaryotes