Week 9 A - RNA Processing in Eukaryotes
RNA Processing in Eukaryotes and Transcriptome Analysis
RNA Processing in Eukaryotes
Most eukaryotic genes and their proteins are not collinear.
Most eukaryotic genes have coding DNA sequences, exons, that are interrupted by noncoding elements, introns.
All introns and exons are transcribed initially into a pre-RNA, but introns are removed from mature RNA.
Introns are also present (albeit far fewer) in bacterial, archaeal and viral genomes, as well as in mitochondrial and chloroplast DNA.
RNA is processed post-transcriptionally to generate mature functional RNA.
Types of RNA Processing
RNA processing refers to any modification made to RNA between its transcription and its final function in the cell. All three major RNA types (mRNA, tRNA and rRNA) are processed in the eukaryotic nucleus. This includes:
Removing nucleotides or adding nucleotides, cutting RNA to generate mature RNA ends:
mRNA polyA tail added
tRNA CCA sequence added at the 3′ end
rRNA: endonuclease cleaves rRNA, exonuclease removes nucleotides to release different rRNA species
Covalent modification:
methylated nucleotides in mRNA, tRNA or rRNA
Splicing: removing internal sequences and joining the flanking pieces together
pre-mRNA, pre-tRNA
RNA Processing and Human Health
Many genetic variants are linked to disease in GWAS but lack demonstrated functional effects
Examples of diseases and their links to RNA processing:
Crohn disease: SNV in IRGM1 disrupts a miR-196-binding site
Type 2 diabetes: SNV in CDKAL1 alters splicing, with effects on miRNA-mediated regulation
Coronary artery disease: SNV in OLR1 disrupts a miR-24-binding site
Cystic fibrosis: SNV in CFTR alters splicing, removing a premature termination codon; SNV in CFTR creates a variably spliced cryptic exon
Heroin addiction: SNV in OPRM1 alters binding site of splicing factor HNRNPH1
Response to 5-fluorouracil and irinotecan: SNV in the pri-miR26a-1 locus alters miRNA expression
Prognosis in non-small-cell lung cancer: SNV in ICAM1 alters splicing to favour the soluble isoform
Prognosis in ependymoma: SNV in ERCC5 creates an upstream open reading frame that alters translation
Acute intermittent porphyria: SNV in HMBS causes exon skipping
Charcot-Marie-Tooth disease: SNV in MPZ causes exon skipping
Recessive bone fragility disorder: SNV in BMP1 alters the consensus polyadenylation signal
RNA Processing - Outline
tRNA processing
rRNA processing
mRNA processing and export regulation
Nonsense mediated decay
Alternative splicing
tRNA Processing
RNA polymerase III terminates transcription in a poly(U)4 (terminator) sequence.
tRNA splicing occurs by successive cleavage and ligation reactions.
tRNA processing involves:
Removal of leader sequence
Replacement of nucleotides with CCA at the 3' end
Chemical modification of bases
Excision of the intron (located in the anticodon loop)
An endonuclease cleaves the tRNA precursors at both ends of the intron.
Release of the intron generates two half-tRNAs with unusual ends: a 5′ hydroxyl and a 2′–3′ cyclic phosphate.
The 5′–OH end is phosphorylated by a polynucleotide kinase.
The cyclic phosphate group is opened by phosphodiesterase to generate a 2′–phosphate terminus and 3′–OH group.
The exon ends are joined by an RNA ligase (3’-OH with 5’ PO4).
The 2′–phosphate is removed by a phosphatase.
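The processing steps above can be sketched on a plain RNA string. This is a minimal toy model, not a description of the enzymes: the function name, the coordinates and the example sequence are all hypothetical, base modification is ignored, and the end chemistry (cyclic phosphate, kinase, ligase, phosphatase) is collapsed into a simple string join.

```python
def process_pre_trna(pre_trna, leader_len, intron_start, intron_end, trailer_len):
    """Toy model of pre-tRNA maturation on an RNA string.

    All coordinates are hypothetical inputs (0-based, end-exclusive),
    given relative to the unprocessed precursor.
    """
    # Position where the 3' trailer nucleotides begin.
    trailer_start = len(pre_trna) - trailer_len

    # 5' half: everything between the leader and the intron.
    five_half = pre_trna[leader_len:intron_start]

    # 3' half: everything between the intron and the 3' trailer.
    three_half = pre_trna[intron_end:trailer_start]

    # Joining the halves stands in for ligation; the trailer is
    # replaced by CCA at the 3' end.
    return five_half + three_half + "CCA"


# Hypothetical precursor: leader AAAA, 5' half GGGCC, intron UUUU,
# 3' half GGCCC, trailer UU.
toy_precursor = "AAAA" + "GGGCC" + "UUUU" + "GGCCC" + "UU"
mature = process_pre_trna(toy_precursor, leader_len=4,
                          intron_start=9, intron_end=13, trailer_len=2)
print(mature)
```

Because the slices are taken from the unprocessed precursor, the result does not depend on the order in which the steps are applied.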
Folded Secondary and 3D structure of tRNA
tRNA has a folded secondary and 3D structure with key regions like the Amino Acid Acceptor Stem, Anticodon Loop, and Variable loop.
rRNA Processing
RNA Pol I terminates at a major terminator located downstream from the rRNA precursor sequence and requires terminator recognition by specific protein factors (Reb1 in yeast; TTF1 in mammals).
An endonuclease cleaves the RNA 15-50 bases from the 3′ end of the 28S rRNA.
RNA polymerase I generates rRNA from a gene cluster regulated by a single promoter.
External transcribed spacer (ETS)
Internal transcribed spacer (ITS)
Non-transcribed spacer – The region between transcription units in a tandem gene cluster (intergenic sequence IGS)
Production of rRNA requires a series of cleavage (endonucleases) and trimming (exonucleases) reactions.
rRNA is cotranscriptionally processed.
Cleavage events involve small nucleolar RNAs (snoRNAs).
Large and small rRNAs are released by cleavage from a common precursor rRNA (the yeast 5S rRNA is separately transcribed by Pol III).
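As a rough illustration of releasing several rRNAs from one precursor, the sketch below cuts a toy string at spacer boundaries and keeps only the mature species. The coordinates, the toy sequence and the names `TOY_REGIONS`, `SPACERS` and `release_rrnas` are hypothetical; real processing proceeds through ordered cleavage and trimming intermediates, which this ignores.

```python
# Hypothetical layout of a toy pre-rRNA: (name, start, end),
# 0-based and end-exclusive. Real precursors are thousands of bases long.
TOY_REGIONS = [
    ("5' ETS", 0, 2),
    ("18S", 2, 6),
    ("ITS1", 6, 8),
    ("5.8S", 8, 10),
    ("ITS2", 10, 12),
    ("28S", 12, 18),
    ("3' ETS", 18, 20),
]

# Transcribed spacers are discarded during processing.
SPACERS = {"5' ETS", "ITS1", "ITS2", "3' ETS"}


def release_rrnas(precursor, regions=TOY_REGIONS, spacers=SPACERS):
    """Return a dict of mature rRNA name -> sequence, dropping spacers."""
    return {
        name: precursor[start:end]
        for name, start, end in regions
        if name not in spacers
    }


# Lowercase letters mark spacer positions in the toy precursor.
toy_pre_rrna = "nnAAAAnnCCnnGGGGGGnn"
print(release_rrnas(toy_pre_rrna))
```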
The pre-rRNA is extensively modified during processing by:
methylation of the 2′-hydroxyl group of specific riboses
conversion of specific uridine residues to pseudouridine
The snoRNA base pairs with a sequence of rRNA that contains the target base to be methylated.
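A minimal sketch of how a guide sequence could locate its target by base pairing, assuming perfect Watson-Crick complementarity only (no G-U wobble pairs or mismatches). The function names and the example sequences are hypothetical and do not correspond to any real snoRNA.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the antiparallel Watson-Crick partner of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_guide_sites(rrna, guide):
    """Return 0-based start positions in rrna that pair perfectly with guide."""
    target = reverse_complement(guide)
    sites = []
    start = rrna.find(target)
    while start != -1:
        sites.append(start)
        # Continue the search one position further to catch overlapping sites.
        start = rrna.find(target, start + 1)
    return sites


# Hypothetical guide and rRNA fragment.
print(find_guide_sites("AAGAUCCAA", "GGAUC"))
```

A real guide would tolerate some imperfect pairing, so an exact string search is only a first approximation.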
mRNA Processing
All pre-mRNAs of protein-coding genes undergo a basal level of RNA processing that requires cis-acting sequences that are recognized by the appropriate processing machinery.
mRNA processing includes:
5' capping
3' polyA tail
Splicing
Alternative splicing
Alternative polyadenylation
mRNA Splicing
The removal of non-coding sequences, introns, from the mRNA precursors is an essential step in eukaryotic gene expression.
Nuclear splice sites are short sequences immediately surrounding the exon–intron boundaries. They are named for their positions relative to the intron.
The 5′ splice site at the 5′ (left) end of the intron includes the consensus sequence GU for most introns.
The 3′ splice site at the 3′ (right) end of the intron includes the consensus sequence AG for most introns.
U2-type/Major introns
The GU-AG rule describes the requirement for these constant dinucleotides at the first two and last two positions of introns in pre-mRNAs.
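The rule can be expressed as a simple check on the first two and last two nucleotides of an intron. This is a minimal sketch: the function name is hypothetical, and DNA-style input (T instead of U) is converted as a convenience.

```python
def follows_gu_ag_rule(intron):
    """Return True if the intron starts with GU and ends with AG."""
    # Accept lowercase and DNA-style sequences by normalising to RNA.
    rna = intron.upper().replace("T", "U")
    # An intron needs at least the two dinucleotides themselves.
    if len(rna) < 4:
        return False
    return rna.startswith("GU") and rna.endswith("AG")


print(follows_gu_ag_rule("GUAAGUCUAACUUUCAG"))
print(follows_gu_ag_rule("AUAUCCUUAAC"))
```

Passing this check is necessary but not sufficient for a functional major intron, since GU and AG dinucleotides occur frequently by chance.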
Comparing the two intron types
Different sequence conservation at different regions.
Splicing signals for minor (U12-type or AU-AC) introns.
Splicing signals for major (U2-type or GU-AG) introns.
Splicing requires the 5′ and 3′ splice sites and a branch site just upstream of the 3′ splice site.
Splicing depends on recognition of pairs of splice sites.
All 5′ splice sites are functionally equivalent, and all 3′ splice sites are functionally equivalent.
Additional conserved sequences at both 5′ and 3′ splice sites define functional splice sites among numerous other potential sites in the pre-mRNA (e.g. branching site).
Pre-mRNA splicing proceeds through a lariat
The branch sequence is conserved in yeast but less well conserved in multicellular eukaryotes.
A lariat is formed when the intron is cleaved at the 5′ splice site, and the 5′ end is joined to a 2′ position at an “A” at the branch site in the intron.
Splicing steps:
The 5’ exon is cleaved off
Lariat is formed by joining 5’ intron end to branch site
3’ splice site is cut and the 5’ exon is joined to the 3’ exon
Intron is released, debranched and degraded
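The splicing steps above can be traced on a toy sequence. This is a sketch under stated assumptions: the function name, the coordinates and the example pre-mRNA are hypothetical, and the lariat is represented only as its looped and linear parts rather than as a chemical structure.

```python
def splice(pre_mrna, five_ss, branch_a, three_ss):
    """Toy model of the two splicing steps on an RNA string.

    five_ss:  index of the first intron nucleotide
    branch_a: index of the branch-site adenosine
    three_ss: index of the first nucleotide of the 3' exon
    (all hypothetical, 0-based coordinates)
    """
    if pre_mrna[branch_a] != "A":
        raise ValueError("branch point must be an adenosine")

    # Step 1: the 5' exon is cleaved off at the 5' splice site.
    exon1 = pre_mrna[:five_ss]

    # Step 2: the 5' end of the intron is joined to the branch A,
    # so the intron splits into a closed loop and a linear tail.
    lariat = {
        "loop": pre_mrna[five_ss:branch_a + 1],
        "tail": pre_mrna[branch_a + 1:three_ss],
    }

    # Step 3: the 3' splice site is cut and the exons are joined.
    exon2 = pre_mrna[three_ss:]
    mature = exon1 + exon2

    # Step 4 (release, debranching, degradation) is not modelled;
    # the lariat is returned for inspection instead.
    return mature, lariat


toy_pre_mrna = "AUGGCC" + "GUAAGUCUAACUUUCAG" + "GGAUAA"
mature, lariat = splice(toy_pre_mrna, five_ss=6, branch_a=15, three_ss=23)
print(mature)
print(lariat)
```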
The splicing machinery
Splicing requires assembly of the spliceosome:
small nuclear RNAs (snRNA): Small RNA species in the nucleus, several are involved in splicing or other RNA processing reactions (U1, U2, U4, U5, U6)
Proteins that form complexes with snRNAs; the resulting RNA-protein complexes are called small nuclear ribonucleoproteins (snRNPs)
Splicing factors: A protein component of the spliceosome that is not part of one of the snRNPs
snRNAs in splicing
The snRNPs involved in splicing are U1, U2, U5, U4, and U6. They are named according to the snRNAs that are present in them.
Each snRNP contains one snRNA and < 20 proteins
Splicing phases
Phase I: Commitment of pre-mRNA to splicing
Phase II: Spliceosome assembly
Phase III: Transesterification
Recognition of splice sites
Intron definition
Exon definition
Switch from interactions across the exon to those across the intron
Spliceosome assembly
U1 base-pairs with the 5’ splice site in E1 complex.
Binding of U2 to the branch site, with U2AF bound at the pyrimidine tract and 3′ splice site, forms the pre-spliceosome (A) complex.
Recruitment of U5 and U4/U6 snRNPs converts the A complex to the mature spliceosome (B1 complex).
The B1 complex is converted to a B2 complex by the release of U1 snRNP which allows U6 snRNP to interact with the 5′ splice site.
Transesterification
Transesterification – A reaction that breaks and makes chemical bonds in a coordinated transfer so that no energy is required
Phosphodiester bond: the bond between the phosphate group and the sugar in a polynucleotide chain
The 2' hydroxyl group of the conserved adenosine within the branch site attacks the phosphodiester bond at the conserved guanine of the 5' splice site, forming a 2'-5' phosphodiester bond
The 3'-OH end of exon 1 then attacks the phosphodiester bond at the conserved guanine of the 3' splice site, joining the two exons
Pre-mRNA splicing summary
Schematic representation of the splicing process.
Splicing is coupled to transcription
Splicing can occur during or after transcription.
The transcription and splicing machineries are physically and functionally integrated.
Splicing is connected to mRNA export and stability control.
Exon junction complex
Exon junction complex (EJC) assembles at exon–exon junctions during splicing and assists in RNA transport, localization, and degradation.
EJC recruits RNA binding proteins involved in mRNA transport.
REF protein binds to a splicing factor and remains with the spliced RNA product.
mRNA export regulation
The transcription-export complex (TREX) is recruited to the nascent mRNA by the splicing machinery.
The conserved nuclear RNA export factor 1 (NXF1/Tap/Mex) is recruited to the mature mRNA through direct interactions with several TREX components and SR splicing factors.
Cargo mRNAs are transferred from TREX to NXF1 and its cofactor p15, which carry them through the nuclear pore by interacting directly with the nucleoporins that line the pore.
Nonsense-mediated decay (NMD)
A pathway that degrades an mRNA that has a nonsense mutation prior to the last exon.
The EJC is removed from spliced mRNA in the cytoplasm by the ribosome during translation.
The ribosome falls off the mRNA when it encounters a premature stop codon.
Any EJC downstream of that stop codon therefore stays on the spliced mRNA.
Recruitment of Upf, which recruits decapping enzyme (DCP).
DCP removes 5’ 7MeGTP cap leading to rapid degradation of mRNA.
NMD - surveillance for premature stop codons
There is no NMD if:
the resulting premature termination codon is in the last exon, or
the resulting premature termination codon is within the last 50 nucleotides of the second-to-last exon
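The two exceptions above can be written as a small predicate. This is a minimal sketch of the rule as stated in these notes, not a validated predictor: the function name and the coordinate convention (position of the first nucleotide of the stop codon, 0-based, in the spliced mRNA) are assumptions.

```python
def escapes_nmd(exon_lengths, ptc_pos, boundary_nt=50):
    """Return True if a premature termination codon (PTC) is expected
    to escape nonsense-mediated decay under the rule above.

    exon_lengths: exon lengths in spliced-mRNA order
    ptc_pos:      0-based position of the PTC's first nucleotide
                  in the spliced mRNA (assumed convention)
    """
    # Position of the last exon-exon junction in the spliced mRNA.
    last_junction = sum(exon_lengths[:-1])

    # Exception 1: the PTC lies in the last exon.
    if ptc_pos >= last_junction:
        return True

    # Exception 2: the PTC lies in the second-to-last exon,
    # within boundary_nt nucleotides of its 3' end.
    penultimate_start = sum(exon_lengths[:-2])
    in_penultimate = penultimate_start <= ptc_pos < last_junction
    return in_penultimate and (last_junction - ptc_pos) <= boundary_nt


exons = [100, 200, 150]          # hypothetical exon lengths
print(escapes_nmd(exons, 320))   # PTC in the last exon
print(escapes_nmd(exons, 270))   # 30 nt upstream of the last junction
print(escapes_nmd(exons, 150))   # 150 nt upstream of the last junction
```

With a single-exon transcript the last junction is at position 0, so every PTC counts as being in the last exon, consistent with the absence of an exon junction complex.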
Mutations affecting RNA processing
Genetic variations affecting RNA processing can alter gene expression
Alternative splicing
Production of a different end RNA product from a single (pre-mRNA) product by changes in the usage of splicing junctions.
Alternative splicing is the rule, rather than the exception, in multicellular eukaryotes (~90% of human genes)
Alternative splicing contributes to structural and functional diversity of gene products.
Different modes of alternative splicing:
Intron retention
Mutually exclusive exons
Alternative 5' splice sites
Combinatorial exon selection
Alternative 3' splice sites
Alternative promoter/splicing
Exon inclusion/skipping
Alternative polyadenylation/splicing
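Two of the modes listed above, exon inclusion/skipping (which extends to combinatorial exon selection) and mutually exclusive exons, can be illustrated by enumerating the isoforms they produce from one set of exons. The function names and exon labels are hypothetical, and the sketch ignores whether a given isoform keeps the reading frame.

```python
from itertools import combinations


def exon_skipping_isoforms(exons, cassette_indices):
    """Enumerate isoforms in which each cassette exon is either
    included or skipped; all other exons are constitutive."""
    cassette = list(cassette_indices)
    isoforms = []
    for k in range(len(cassette) + 1):
        for skipped in combinations(cassette, k):
            isoform = tuple(e for i, e in enumerate(exons) if i not in skipped)
            isoforms.append(isoform)
    return isoforms


def mutually_exclusive_isoforms(exons, i, j):
    """Return the two isoforms that contain exon i or exon j, never both."""
    with_i = tuple(e for k, e in enumerate(exons) if k != j)
    with_j = tuple(e for k, e in enumerate(exons) if k != i)
    return [with_i, with_j]


gene = ["E1", "E2", "E3", "E4"]  # hypothetical exon labels
for isoform in exon_skipping_isoforms(gene, cassette_indices=[1, 2]):
    print("-".join(isoform))
print(mutually_exclusive_isoforms(gene, 1, 2))
```

With n independent cassette exons the first function returns 2^n isoforms, which is one way to see how alternative splicing expands the diversity of gene products.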
Key regulators of alternative splicing
RNA-binding proteins (RBPs) regulate alternative splicing through their expression level, intracellular localization, activity and, in some cases, their own alternative splicing.
RBPs promote or inhibit the recognition of alternative regions by the spliceosome machinery.
RBPs can act in a cooperative or competitive manner.
Splicing enhancers/silencers
Alternative splicing is often associated with weak splice sites (splicing signals at both ends of the intron diverge from consensus).
Specific exonic and intronic sequences can enhance or suppress splice site selection (splicing enhancers or splicing silencers).
The effect of splicing enhancers and silencers is mediated by sequence-specific RBPs, many of which are developmentally regulated and/or expressed in a tissue-specific manner
SR proteins bind to exonic enhancers (ESEs)
hnRNPs A/B bind to exonic silencers (ESSs)
RBPs recognize intronic silencers and enhancers (ISSs and ISEs)
Alternative splicing of α-tropomyosin
Tropomyosins differ in their recruitment of myosin motors and their interaction with actin filament regulators
RNA mis-splicing in disease
Examples include Limb girdle muscular dystrophy 1B (LGMD1B), Familial partial lipodystrophy type 2 (FPLD2), Hutchinson-Gilford progeria syndrome (HGPS), and Dilated cardiomyopathy (DCM), each caused by specific mutations leading to mis-splicing and altered protein products.
Summary - RNA processing
Types
tRNA, rRNA and mRNA processing
Nonsense mediated decay for degradation of RNA with nonsense mutation
Mechanisms of alternative splicing and its impact on protein function and disease