RNA Splicing - Key Concepts (Lecture 2)

The structure of eukaryotic genes

  • Most eukaryotic genes contain alternating coding sequences (exons) and non-coding intervening sequences (introns).

  • In all cases, the order of exons is maintained in the mature mRNA; introns are removed.

  • Relationship between introns and exons: $N_{intron} = N_{exon} - 1$.

  • Gene/mRNA size examples (from figures): gene length roughly 7700 bp; mature transcript length roughly 1872 nt.

  • Collinearity: prokaryotic genes are generally collinear with their protein products (a continuous DNA-to-protein map). In eukaryotes, introns interrupt the coding sequence, so the gene is not simply collinear with the protein.
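
The exon/intron bookkeeping above can be checked with a short Python sketch. The exon coordinates are made-up illustrative values (not the gene from the lecture figures), and the helper names `count_introns`, `mature_length`, and `intron_intervals` are hypothetical.

```python
# Hypothetical exon coordinates (0-based start, end-exclusive) on a toy gene.
# Illustrative values only; they are not taken from the lecture figures.
exons = [(0, 120), (400, 550), (900, 1000)]

def count_introns(exons):
    """Introns lie between consecutive exons, so N_intron = N_exon - 1."""
    return max(len(exons) - 1, 0)

def mature_length(exons):
    """Spliced transcript length: the sum of exon lengths (exon order is kept)."""
    return sum(end - start for start, end in exons)

def intron_intervals(exons):
    """Coordinates of the gaps between consecutive exons."""
    return [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]

n_introns = count_introns(exons)
spliced_len = mature_length(exons)
introns = intron_intervals(exons)
```

In this toy gene the primary transcript spans 1000 nt while the spliced product is much shorter, the same kind of gap as between the gene length and the mature transcript length quoted above.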


Introns

  • Noncoding sequences present in most eukaryotic genes; also found in archaea and some bacteria.

  • Occur in mRNA, tRNA, and rRNA transcripts.

  • Number and size vary widely: $0 \leq N_{intron} \leq 60$ introns per gene; intron sizes range from <200 nt to >50,000 nt.

  • Introns (intervening sequences) are also found in mitochondrial and chloroplast genomes.


Types of introns and splicing mechanisms

  • Group I introns: self-splicing; found in bacteria, bacteriophages, and some eukaryotes.

  • Group II introns: self-splicing; found in bacteria, archaea, and eukaryotic organelles.

  • Nuclear pre-mRNA introns: spliceosomal introns; processed by the nuclear spliceosome in eukaryotes.

  • tRNA introns: non-spliceosomal introns removed by protein enzymes (an endonuclease and a ligase) rather than by the spliceosome.


Spliceosome and splicing machinery

  • Splicing is catalyzed by the spliceosome, a large complex of five small nuclear ribonucleoproteins (snRNPs: U1, U2, U4, U5, and U6) plus associated proteins.

  • Core components: each snRNP consists of a small nuclear RNA (snRNA) and proteins.

  • Key roles of snRNPs:

    • U1 snRNP recognizes and binds the 5' splice site (canonical sequence at the intron start).

    • U2 snRNP recognizes and binds the branch point and helps position it for splicing.

    • U1 and U4 are released during activation.

  • U1 snRNP/U2 snRNP interaction is coupled to RNA Pol II transcription; U1 attachment occurs soon after intron transcription.


The splicing code and signals

  • Splicing signals (the splicing code) include:

    • 5' splice site with the GU motif (most conserved at the 5' end of the intron)

    • 3' splice site with the AG motif (most conserved at the 3' end of the intron)

    • Branch point: a conserved A nucleotide within the intron that forms a lariat during splicing

  • The combination of these signals directs precise intron removal and exon joining.
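
A minimal Python sketch of how the three core signals could be checked on an intron sequence. The function names, the toy intron, and the 18 to 40 nt branch-point window are all assumptions made for illustration; real splice-site recognition relies on longer, degenerate consensus sequences and on the snRNPs themselves.

```python
def has_canonical_splice_sites(intron_rna):
    """True if the intron starts with GU (5' splice site) and ends with AG (3' splice site)."""
    seq = intron_rna.upper()
    return len(seq) >= 4 and seq.startswith("GU") and seq.endswith("AG")

def candidate_branch_points(intron_rna, min_dist=18, max_dist=40):
    """0-based positions of A residues lying min_dist..max_dist nt from the intron's 3' end.

    The distance window is an illustrative assumption, not a value from the lecture.
    """
    seq = intron_rna.upper()
    n = len(seq)
    return [i for i, base in enumerate(seq)
            if base == "A" and min_dist <= n - i <= max_dist]

# Toy intron (hypothetical): GU, 10 C, one branch A, 20 C, AG.
toy_intron = "GU" + "C" * 10 + "A" + "C" * 20 + "AG"

sites_ok = has_canonical_splice_sites(toy_intron)
branch_candidates = candidate_branch_points(toy_intron)
```

The A of the terminal AG is excluded because it lies too close to the 3' end to fall inside the assumed window.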


The splicing process (mechanistic overview)

  • Initial cut at the 5' splice site; the 5' end of the intron joins the branch point to form a lariat structure.

  • A second cut at the 3' splice site; exons are joined, intron released as a lariat.

  • The lariat is debranched and degraded by nuclear enzymes.

  • Spliceosome-mediated steps are shown in the canonical cycle with U1 recruiting, U2 positioning, snRNP rearrangements, and exon ligation.
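
The two cuts described above can be mimicked on a string as a rough sketch. This is a toy model only: the sequences, the coordinates, and the function name `splice_single_intron` are hypothetical, and the lariat is represented simply as a loop part plus a tail part.

```python
def splice_single_intron(pre_mrna, intron_start, intron_end, branch_offset):
    """Toy two-step splicing of one intron.

    intron_start, intron_end: 0-based, end-exclusive intron coordinates.
    branch_offset: position of the branch-point A within the intron.
    """
    intron = pre_mrna[intron_start:intron_end]
    if intron[branch_offset] != "A":
        raise ValueError("branch point must be an A")

    # Step 1: cut at the 5' splice site; the intron's 5' end joins the branch A.
    upstream_exon = pre_mrna[:intron_start]
    lariat_loop = intron[:branch_offset + 1]   # 5' end of intron through the branch A
    lariat_tail = intron[branch_offset + 1:]   # rest of the intron up to the 3' splice site

    # Step 2: cut at the 3' splice site; join the exons and release the lariat.
    downstream_exon = pre_mrna[intron_end:]
    mature = upstream_exon + downstream_exon
    return mature, {"loop": lariat_loop, "tail": lariat_tail}

# Hypothetical toy transcript: exon, intron (GU...A...AG), exon.
exon_a, toy_intron, exon_b = "AUGGCC", "GUCCACCCAG", "UUUUAA"
pre_mrna = exon_a + toy_intron + exon_b
mature, lariat = splice_single_intron(
    pre_mrna, len(exon_a), len(exon_a) + len(toy_intron), branch_offset=4
)
```

The exons come out joined in their original order, and the released intron keeps both its GU start (in the loop) and its AG end (in the tail).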


The timing of RNA splicing

  • Splicing can be co-transcriptional or post-transcriptional:

    • Co-transcriptional splicing occurs while transcription is ongoing.

    • Post-transcriptional splicing occurs after transcription completes.

  • Other processing events that accompany transcription include 5' capping and 3' polyadenylation.

  • Overall processing flow: 5' capping → splicing → 3' cleavage and polyadenylation → nuclear export → translation; degradation pathways also exist.


Splicing signals and maturation code (summary)

  • Splicing signals include:

    • 5' splice site (GU) and 3' splice site (AG)

    • Branch point (A)

  • The splicing code is defined by these core motifs; mutations can disrupt splicing and cause disease.


Mutations affecting normal splicing (consequences)

  • Mutations can lead to:

    • Exon skipping (loss of an exon)

    • Intron retention (intron remains in mature mRNA)

    • Pseudoexon activation (cryptic exons become included)

    • Partial exon skipping/retention leading to frameshifts or in-frame changes

  • Example: dystrophin gene splicing defects linked to Duchenne Muscular Dystrophy.
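
Whether a skipped exon causes a frameshift or an in-frame change depends on its length modulo 3. The sketch below illustrates this with made-up exon sequences; it assumes the skipped exon is entirely coding, and the helper names `skip_exon` and `frame_effect` are hypothetical.

```python
def skip_exon(exons, index):
    """Join all exons except the one at `index`, keeping the original order."""
    return "".join(seq for i, seq in enumerate(exons) if i != index)

def frame_effect(skipped_length):
    """Losing a multiple of 3 nt keeps the reading frame; anything else shifts it."""
    return "in-frame deletion" if skipped_length % 3 == 0 else "frameshift"

# Hypothetical toy exons (lengths 6, 5, 6 nt); assumed to be fully coding.
toy_exons = ["AUGGCU", "GAAUC", "CCGUAA"]

normal_mrna = "".join(toy_exons)
skipped_mrna = skip_exon(toy_exons, 1)
effect_exon2 = frame_effect(len(toy_exons[1]))   # 5 nt lost
effect_exon3 = frame_effect(len(toy_exons[2]))   # 6 nt lost
```

Skipping the 5 nt exon shifts the downstream reading frame, whereas losing a 6 nt exon would remove two codons and leave the frame intact.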


Additional notes (context)

  • The spliceosome is a dynamic assembly that coordinates with RNA polymerase II during transcription.

  • Splicing is essential to maintain reading frame and proper protein sequence; errors can have severe biological consequences.

  • The major intron types relevant to this module are the spliceosomal introns in nuclear pre-mRNA and the self-splicing introns (Groups I and II).


Quick references to key terms

  • Exon: coding sequence retained in mature mRNA

  • Intron: noncoding intervening sequence removed by splicing

  • Spliceosome: snRNP-based complex (U1, U2, U4, U5, U6) that catalyzes splicing

  • Lariat: looped intron structure formed during splicing

  • 5' cap: modified nucleotide at the 5' end (m7G)

  • Poly(A) tail: run of adenosines added to the 3' end by polyadenylation (AAA… tail)

  • Collinearity: relationship between gene layout and protein sequence; less direct in eukaryotes than prokaryotes