Transcription and mRNA processing
Transcription
Stages of Transcription
- Transcription consists of three main stages:
- Initiation
- Elongation
- Termination
- Initiation is the most important stage for the regulation of gene expression.
I. Initiation
- RNA polymerase binds to the promoter, which is the region of the gene that:
- Contains the start site of transcription.
- Regulates the efficiency of transcription.
- Promoter DNA sequences lie on the same DNA molecule as the genes they regulate, and adjacent to them. This makes them cis-acting elements (from the Latin word cis, meaning “on the same side as”).
- Cis-acting DNA promoter elements serve as attachment sites for DNA binding proteins that regulate initiation of transcription.
- In eukaryotes:
- Transcription factors recognize and bind specific DNA sequences within the promoter region (e.g., a TATA box).
- Bound transcription factors recruit RNA polymerase to create a transcription initiation complex.
- Once the initiation complex is assembled, RNA polymerase starts transcription.
- In prokaryotes, RNA polymerase recognizes and binds directly to the promoter.
- In eukaryotes, RNA polymerase requires additional transcription factors for its recruitment to a promoter.
General Pol II Transcription Factors
- Many proteins are required for transcription at the RNA Polymerase II promoters of eukaryotes, including:
- RNA polymerase II (12 subunits, M ranging from 10,000-220,000): Catalyzes RNA synthesis.
- TBP (TATA-binding protein) (1 subunit, M = 38,000): Specifically recognizes the TATA box.
- TFIIA (3 subunits, M = 12,000, 19,000, 35,000): Stabilizes binding of TFIIB and TBP to the promoter.
- TFIIB (1 subunit, M = 35,000): Binds to TBP and recruits RNA polymerase-TFIIF complex.
- TFIID (12 subunits, M ranging from 15,000-250,000): Interacts with positive and negative regulatory proteins.
- TFIIE (2 subunits, M = 34,000, 57,000): Recruits TFIIH; has ATPase and helicase activities.
- TFIIF (2 subunits, M = 30,000, 74,000): Binds tightly to RNA polymerase II; binds to TFIIB and prevents binding of RNA polymerase to nonspecific DNA sequences.
- TFIIH (12 subunits, M ranging from 35,000-89,000): Unwinds DNA at promoter; phosphorylates RNA polymerase; recruits nucleotide-excision repair proteins.
- The promoter determines the rate of transcription initiation.
- Promoter “strength” can be weak versus strong.
- A strong promoter supports a high rate of transcription initiation (i.e., RNA polymerase has a higher affinity for the promoter).
- Strength is mainly determined by how well RNA polymerase/transcription factors bind to the promoter nucleotide sequence.
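The link between promoter sequence and binding can be illustrated with a toy score that counts how many positions of a candidate site match a consensus element. This is only a sketch under stated assumptions: the consensus used is the bacterial -10 element (TATAAT), chosen as a familiar example, and the function name `consensus_matches` is hypothetical. Real promoter strength also depends on spacing, flanking sequence, and regulatory proteins, none of which are modeled here.

```python
# Assumption: the bacterial -10 consensus is used purely as an illustration.
CONSENSUS = "TATAAT"

def consensus_matches(site: str, consensus: str = CONSENSUS) -> int:
    """Count positions where a candidate site matches the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(a == b for a, b in zip(site.upper(), consensus))

# More matches suggest (but do not prove) tighter polymerase binding.
for site in ("TATAAT", "TATGAT", "GACGAT"):
    print(site, consensus_matches(site))
```

In this simplified picture, a site with more matching positions would be expected to behave as a stronger promoter than one with fewer.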
II. Transcription Elongation
- RNA synthesis does not require a primer and proceeds in the 5’ to 3’ direction.
- As RNA polymerase moves along the DNA, it untwists the double helix to produce single-stranded template DNA.
- RNA polymerase adds nucleotides to the 3’ end of the growing strand, proceeding 5’ to 3’.
- Behind the point of RNA synthesis, the double helix re-forms, and the RNA molecule peels away.
- Multiple polymerase molecules can simultaneously transcribe a single gene, which increases the amount of mRNA transcribed and can amplify gene expression.
- This allows the cell to make the encoded protein in large amounts.
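The elongation steps above can be sketched in a few lines of Python. The function name `transcribe` and the convention of writing the template 5’ to 3’ are assumptions made for this illustration. The sketch models only base pairing and directionality, not the enzyme itself.

```python
# Base added to the RNA opposite each DNA template base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_5to3: str) -> str:
    """Return the RNA (written 5'->3') made from a template written 5'->3'."""
    rna = []
    # The polymerase reads the template 3'->5', so walk the string backwards.
    for base in reversed(template_5to3.upper()):
        rna.append(DNA_TO_RNA[base])  # each new nucleotide joins the 3' end
    return "".join(rna)

print(transcribe("AGT"))  # ACU
```

Because the template is read 3’ to 5’ while the RNA grows 5’ to 3’, the transcript is both complementary and antiparallel to the template.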
III. Transcription Termination
- Transcription proceeds until RNA polymerase encounters a terminator sequence in the DNA.
- In prokaryotes, RNA polymerase stops transcription right at the end of the terminator. Both the RNA and DNA are released.
- In eukaryotes, the polymerase continues for hundreds of nucleotides past the polyadenylation signal sequence (AAUAAA in the pre-mRNA).
- 10 to 35 nucleotides past this sequence, pre-mRNA is cut and released from the transcription complex.
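The eukaryotic cleavage step can be sketched as a string operation. This assumes the canonical polyadenylation signal AAUAAA and a hypothetical helper `cleave_pre_mrna`. The fixed `offset` is a simplification, since the actual cut position varies within the 10 to 35 nucleotide window.

```python
POLY_A_SIGNAL = "AAUAAA"  # canonical polyadenylation signal in the pre-mRNA

def cleave_pre_mrna(transcript: str, offset: int = 20):
    """Cut the transcript `offset` nucleotides past the first signal.

    Returns the released pre-mRNA, or None if there is no signal or the
    transcript is too short to reach the cut site.
    """
    if not 10 <= offset <= 35:
        raise ValueError("offset must be between 10 and 35 nucleotides")
    start = transcript.find(POLY_A_SIGNAL)
    if start == -1:
        return None
    cut = start + len(POLY_A_SIGNAL) + offset
    if cut > len(transcript):
        return None
    return transcript[:cut]

transcript = "GGC" + POLY_A_SIGNAL + "C" * 50
released = cleave_pre_mrna(transcript, offset=20)
print(len(released))  # 29
```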
Eukaryotic mRNA Processing (Post-transcriptional Modifications)
- Enzymes in the eukaryotic nucleus modify pre-mRNA before it is transported from the nucleus to the cytoplasm.
- 5’ mG cap
- 3’ poly A tail
- Splicing
5’ cap
- A modified guanine nucleotide, the 7-methylguanosine cap (5’mG cap), is added to the 5’ end of the pre-mRNA molecule.
- The 5’mG cap helps protect mRNA from exonucleases and functions as an “attach here” signal for ribosomes.
3’ poly A tail
- At the 3’ end, an enzyme adds 50 to 250 adenine nucleotides, forming the poly(A) tail.
- In addition to protecting from nucleases, the poly(A) tail also facilitates the export of mRNA from the nucleus.
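As a small illustration of tail addition, the sketch below appends a run of adenines to the 3’ end of a sequence. The function name `add_poly_a_tail` and the default length are assumptions. The 50 to 250 range is taken from the notes above.

```python
def add_poly_a_tail(mrna: str, tail_length: int = 200) -> str:
    """Append a poly(A) tail of 50 to 250 adenines to the 3' end."""
    if not 50 <= tail_length <= 250:
        raise ValueError("tail_length must be between 50 and 250")
    return mrna + "A" * tail_length

tailed = add_poly_a_tail("AUGGCCUAA", tail_length=50)
print(len(tailed))  # 59
```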
RNA splicing
- Splicing removes the noncoding segments (introns) that lie between coding segments (exons) and then joins the exons together.
- The final mRNA transcript contains exons that are spliced together and translated into amino acid sequences.
- Removal of intervening sequences (introns) from RNA occurs in the nucleus before transport to the cytoplasm.
- RNA splicing removes introns and joins exons (expressed sequences). Thus, spliced mRNAs are not co-linear with their template DNA.
- Introns are segments of RNA that are not translated.
- The discovery of this process and the corresponding realization that many genes are split by non-coding intervening sequences is one of the most important discoveries in molecular genetics in the last 50 years.
- Splicing is carried out by the spliceosome, a ribonucleoprotein complex built from proteins and several small nuclear RNAs (snRNAs). Each snRNA with its associated proteins forms a small nuclear ribonucleoprotein (snRNP).
- Within the spliceosome, snRNA base-pairs with nucleotides at the ends of the intron.
- The RNA transcript is cut to release the intron, and the exons are spliced together. The spliceosome then comes apart, releasing mRNA, which now contains only exons.
- The splicing reaction is catalyzed by RNA.
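Cutting out introns and joining exons can be sketched as slicing a string by exon coordinates. The coordinates and the function name `splice` are hypothetical. In a cell the boundaries are recognized by the spliceosome from sequence signals rather than supplied as numbers. The example intron begins with GU and ends with AG, as most introns do.

```python
def splice(pre_mrna: str, exons: list) -> str:
    """Join exon segments given as (start, end) half-open intervals in 5'->3' order."""
    pieces = []
    previous_end = 0
    for start, end in exons:
        if start < previous_end or end <= start or end > len(pre_mrna):
            raise ValueError("exons must be ordered, non-overlapping, and in range")
        pieces.append(pre_mrna[start:end])  # keep the exon, skip the intron before it
        previous_end = end
    return "".join(pieces)

# exon 1 + intron (GU...AG) + exon 2
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "GAUUAA"
print(splice(pre_mrna, [(0, 6), (18, 24)]))  # AUGGCCGAUUAA
```

The spliced product is shorter than the pre-mRNA and no longer lines up position by position with the DNA that encoded it.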
Functions of RNA splicing
- Splicing may regulate the passage of mRNA from the nucleus to the cytoplasm.
- Splicing enables one gene to encode more than one polypeptide.
- Alternative RNA splicing gives rise to two or more different polypeptides, depending on which segments are treated as exons.
- Alternative splicing of pre-mRNA to yield different mRNAs generates protein diversity.
- Split genes may also facilitate the evolution of new proteins.
- Proteins often have a modular architecture with discrete structural and functional domains.
- In some cases, different exons code for different domains of a protein that can be “shuffled.”
- Protein diversity can be generated by alternative splicing.
- A pre-mRNA with multiple exons is sometimes spliced in different ways to expand the forms of proteins to meet functional requirements of specialized cells.
- There are roughly 20,000 to 25,000 protein-coding genes in the human genome, but at least 100,000 different proteins. This difference is derived in part from alternative splicing.
- Alternative splicing allows for a greater variety of proteins than could be predicted by just the number of genes in the genome.
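To see how quickly alternative splicing can multiply the products of one gene, the sketch below enumerates isoforms under a deliberately simple assumption: the first and last exons are always kept, and every internal exon can be independently included or skipped. The function name `isoforms` and the exon labels are hypothetical. Real splicing is regulated and produces only a subset of these combinations.

```python
from itertools import combinations

def isoforms(exons: list) -> list:
    """List exon combinations when each internal exon may be kept or skipped."""
    first, *middle, last = exons
    result = []
    for k in range(len(middle) + 1):
        # combinations() preserves the original 5'->3' exon order
        for kept in combinations(middle, k):
            result.append([first, *kept, last])
    return result

for isoform in isoforms(["E1", "E2", "E3", "E4"]):
    print("-".join(isoform))
```

Under this assumption a gene with n internal exons could yield up to 2^n mRNAs, so four exons with two internal ones give four possible isoforms.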
EUKARYOTIC mRNA PROCESSING
- UTR = untranslated region
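- The final mRNA is usually shorter than the pre-mRNA and is often NOT colinear with the DNA that encodes it.
- Prokaryotic and eukaryotic gene expression differ spatially: in eukaryotes the nuclear envelope separates transcription (nucleus) from translation (cytoplasm), whereas in prokaryotes, which lack a nucleus, both occur in the same compartment.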
Polycistronic mRNA in prokaryotes
- An mRNA that codes for more than one protein (a polycistronic mRNA) is common in bacteria but rare in eukaryotes.
- This set of proteins might be enzymes of a specific metabolic pathway.
Functions of DNA
- DNA has four important functions:
- Genetic material stores genetic information—millions of nucleotides; base sequence encodes huge amounts of information.
- Genetic material is susceptible to mutation—a change in information—possibly a simple alteration to a sequence.
- Genetic material is precisely replicated in cell division—by complementary base pairing.
- Genetic material is expressed as the phenotype—nucleotide sequence determines sequence of RNA and amino acids in polypeptides.
Homework Questions
- To determine the template strand for transcription, identify the strand complementary to the RNA transcript.
- The minimum length of a gene that codes for a polypeptide that is 100 amino acids long is 300 nucleotides (100 amino acids x 3 nucleotides per codon), not counting the stop codon.
- Given a template strand of DNA with the sequence 5’ AGT 3’, the corresponding codon on the mRNA is 5’ ACU 3’.
- Given double-stranded DNA with the promoter upstream, the encoded RNA is complementary and antiparallel to the template strand. Equivalently, it has the same sequence as the coding (non-template) strand, with T replaced by U (uracil).
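The answers above can be checked with a short script. All function names here are hypothetical, and every strand is assumed to be written 5’ to 3’. The RNA is derived by taking the strand complementary to the template (the coding strand) and replacing T with U.

```python
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(dna_5to3: str) -> str:
    """Return the complementary DNA strand, also written 5'->3'."""
    return "".join(DNA_COMPLEMENT[b] for b in reversed(dna_5to3))

def rna_from_template(template_5to3: str) -> str:
    """The RNA matches the coding strand with T replaced by U."""
    return reverse_complement(template_5to3).replace("T", "U")

def is_template_for(dna_5to3: str, rna_5to3: str) -> bool:
    """True if transcribing this strand would give the RNA."""
    return rna_from_template(dna_5to3) == rna_5to3

def min_coding_nucleotides(n_amino_acids: int, include_stop: bool = False) -> int:
    """Three nucleotides per codon, optionally counting one stop codon."""
    return 3 * (n_amino_acids + (1 if include_stop else 0))

print(rna_from_template("AGT"))                        # ACU
print(is_template_for("AGT", "ACU"))                   # True
print(is_template_for("ACT", "ACU"))                   # False
print(min_coding_nucleotides(100))                     # 300
print(min_coding_nucleotides(100, include_stop=True))  # 303
```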