Transcription and mRNA processing

Transcription

Stages of Transcription

  • Transcription consists of three main stages:
    1. Initiation
    2. Elongation
    3. Termination
  • Initiation is the most important stage for the regulation of gene expression.

I. Initiation

  • RNA polymerase binds to the promoter, which is the region of the gene that:
    • Contains the start site of transcription.
    • Regulates the efficiency of transcription.
  • Promoter DNA sequences are adjacent to the genes they regulate, which makes them cis-acting elements (from the Latin word cis, meaning “on the same side as”).
  • Cis-acting DNA promoter elements serve as attachment sites for DNA binding proteins that regulate initiation of transcription.
  • In eukaryotes:
    • Transcription factors recognize and bind specific DNA sequences within the promoter region (e.g., a TATA box).
    • Bound transcription factors recruit RNA polymerase to create a transcription initiation complex.
    • Once the initiation complex is assembled, RNA polymerase starts transcription.
  • In prokaryotes, by contrast, RNA polymerase recognizes and binds the promoter directly, without the additional transcription factors that eukaryotic RNA polymerase needs for its recruitment.

General Pol II Transcription Factors

  • Many proteins are required for transcription at the RNA Polymerase II promoters of eukaryotes, including:
    • RNA polymerase II (12 subunits, M ranging from 10,000-220,000): Catalyzes RNA synthesis.
    • TBP (TATA-binding protein) (1 subunit, M = 38,000): Specifically recognizes the TATA box.
    • TFIIA (3 subunits, M = 12,000, 19,000, 35,000): Stabilizes binding of TFIIB and TBP to the promoter.
    • TFIIB (1 subunit, M = 35,000): Binds to TBP and recruits RNA polymerase-TFIIF complex.
    • TFIID (12 subunits, M ranging from 15,000-250,000): Interacts with positive and negative regulatory proteins.
    • TFIIE (2 subunits, M = 34,000, 57,000): Recruits TFIIH; has ATPase and helicase activities.
    • TFIIF (2 subunits, M = 30,000, 74,000): Binds tightly to RNA polymerase II; binds to TFIIB and prevents binding of RNA polymerase to nonspecific DNA sequences.
    • TFIIH (12 subunits, M ranging from 35,000-89,000): Unwinds DNA at the promoter; phosphorylates RNA polymerase; recruits nucleotide-excision repair proteins.

Promoter Strength

  • The promoter determines the rate of transcription initiation.
  • Promoters range from weak to strong.
  • A strong promoter supports a high rate of transcription initiation because RNA polymerase binds it with higher affinity.
  • Strength is mainly determined by how well RNA polymerase and its transcription factors bind the nucleotide sequence of the promoter.

II. Transcription Elongation

  • RNA synthesis does not require a primer and proceeds in the 5’ to 3’ direction.
  • As RNA polymerase moves along the DNA, it untwists the double helix to produce single-stranded template DNA.
  • RNA polymerase adds nucleotides to the 3’ end of the growing strand, proceeding 5’ to 3’.
  • Behind the point of RNA synthesis, the double helix re-forms, and the RNA molecule peels away.
  • Multiple polymerase molecules can simultaneously transcribe a single gene, which increases the amount of mRNA transcribed and can amplify gene expression.
  • This allows the cell to make the encoded protein in large amounts.
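
The elongation steps above can be sketched in a few lines of Python. This is a minimal illustration, not a model of the enzyme: the function name `transcribe` and the `PAIRING` table are made up for this example, and the template strand is assumed to be given as a plain string written 5’ to 3’.

```python
# Base-pairing rules: each DNA template base and the RNA base paired with it.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_5to3: str) -> str:
    """Return the RNA, written 5'->3', copied from a template strand written 5'->3'."""
    rna = ""
    # The polymerase reads the template 3'->5', so walk the string from its end.
    for base in reversed(template_5to3.upper()):
        rna += PAIRING[base]  # each new nucleotide joins the 3' end of the RNA
    return rna

print(transcribe("TTTGCA"))  # UGCAAA
```

Because the RNA grows only at its 3’ end, the first nucleotide added stays at the 5’ end of the finished transcript.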

III. Transcription Termination

  • Transcription proceeds until RNA polymerase encounters a terminator sequence in the DNA.
  • In prokaryotes, RNA polymerase stops transcription right at the end of the terminator. Both the RNA and DNA are released.
  • In eukaryotes, the polymerase transcribes a “polyadenylation” signal sequence and continues for hundreds of nucleotides past it.
  • The pre-mRNA is cut 10 to 35 nucleotides downstream of this signal and released from the transcription complex, while the polymerase is still transcribing.
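
The eukaryotic cleavage step can be sketched as a string operation. The notes do not name the signal; the sketch below assumes the canonical AAUAAA hexamer, and the function name `cleave_pre_mrna` and the default offset of 20 nucleotides are arbitrary choices for illustration.

```python
POLY_A_SIGNAL = "AAUAAA"  # assumption: canonical signal, not named in the notes

def cleave_pre_mrna(transcript: str, offset: int = 20) -> tuple:
    """Split an RNA into (released pre-mRNA, RNA still attached to the polymerase)."""
    if not 10 <= offset <= 35:
        raise ValueError("offset should fall in the 10 to 35 nucleotide window")
    signal_start = transcript.find(POLY_A_SIGNAL)
    if signal_start == -1:
        return "", transcript  # no signal found: nothing is released in this toy model
    # Cut `offset` nucleotides past the end of the signal.
    cut = signal_start + len(POLY_A_SIGNAL) + offset
    return transcript[:cut], transcript[cut:]

transcript = "GGGCC" + "AAUAAA" + "C" * 20 + "GGGG"
released, downstream = cleave_pre_mrna(transcript)
print(len(released), downstream)  # 31 GGGG
```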

Eukaryotic mRNA Processing (Post-transcriptional Modifications)

  • Enzymes in the eukaryotic nucleus modify pre-mRNA before it is transported from the nucleus to the cytoplasm.
    1. 5’ mG cap
    2. 3’ poly A tail
    3. Splicing

5’ cap

  • A modified guanine nucleotide, the 5’ methylguanosine cap (5’mG cap), is added to the 5’ end of the pre-mRNA molecule.
  • The 5’mG cap helps protect mRNA from exonucleases and functions as an “attach here” signal for ribosomes.

3’ poly A tail

  • At the 3’ end, an enzyme adds 50 to 250 adenine nucleotides, forming the poly(A) tail.
  • In addition to protecting from nucleases, the poly(A) tail also facilitates the export of mRNA from the nucleus.

RNA splicing

  • Splicing is the removal of noncoding segments (introns), which lie between coding segments (exons), followed by the joining of the exons.
  • The final mRNA transcript contains exons that are spliced together and translated into amino acid sequences.
  • Removal of intervening sequences (introns) from RNA occurs in the nucleus before transport to the cytoplasm.
  • RNA splicing removes introns and joins exons (expressed sequences). Thus, spliced mRNAs are not co-linear with their template DNA.
  • Introns are segments of RNA that are not translated.
  • The discovery of this process and the corresponding realization that many genes are split by non-coding intervening sequences is one of the most important discoveries in molecular genetics in the last 50 years.
  • Splicing is accomplished by a spliceosome, a ribonucleoprotein complex built from proteins and several small nuclear RNAs (snRNAs). Each snRNA associates with proteins to form a small nuclear ribonucleoprotein (snRNP).
  • Within the spliceosome, snRNA base-pairs with nucleotides at the ends of the intron.
  • The RNA transcript is cut to release the intron, and the exons are spliced together. The spliceosome then comes apart, releasing mRNA, which now contains only exons.
  • The splicing reaction is catalyzed by RNA.
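
The outcome of splicing (introns out, exons joined in their original order) can be sketched as follows. This is only a bookkeeping illustration: the function name `splice` is made up, the intron positions are assumed to be known in advance as (start, end) index pairs, and the short sequences are invented. A real spliceosome finds the intron ends by snRNA base pairing, which the sketch does not model.

```python
def splice(pre_mrna: str, introns: list) -> tuple:
    """Remove introns given as (start, end) index pairs; return (mRNA, removed introns)."""
    exons, removed, cursor = [], [], 0
    for start, end in sorted(introns):
        if start < cursor or start >= end or end > len(pre_mrna):
            raise ValueError("introns must be non-overlapping and inside the transcript")
        exons.append(pre_mrna[cursor:start])  # exon lying before this intron
        removed.append(pre_mrna[start:end])   # the intron that is cut out
        cursor = end
    exons.append(pre_mrna[cursor:])           # final exon after the last intron
    return "".join(exons), removed

# Invented example: three exons separated by two introns.
exon1, intron1, exon2, intron2, exon3 = "AUGGCC", "GUAAGUCCAG", "UUUGAA", "GUCCCCAG", "UAA"
pre_mrna = exon1 + intron1 + exon2 + intron2 + exon3
first = (len(exon1), len(exon1) + len(intron1))
second_start = first[1] + len(exon2)
second = (second_start, second_start + len(intron2))

mrna, removed = splice(pre_mrna, [first, second])
print(mrna)  # AUGGCCUUUGAAUAA
```

The spliced mRNA is shorter than the pre-mRNA and no longer matches the DNA position for position.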

Functions of RNA splicing

  • Splicing may regulate the passage of mRNA from the nucleus to the cytoplasm.
  • Splicing enables one gene to encode more than one polypeptide.
  • Alternative RNA splicing gives rise to two or more different polypeptides, depending on which segments are treated as exons.
  • Alternative splicing of pre-mRNA to yield different mRNAs generates protein diversity.
  • Split genes may also facilitate the evolution of new proteins.
  • Proteins often have a modular architecture with discrete structural and functional domains.
  • In some cases, different exons code for different domains of a protein that can be “shuffled.”
  • Protein diversity can be generated by alternative splicing.
  • A pre-mRNA with multiple exons is sometimes spliced in different ways to expand the forms of proteins to meet functional requirements of specialized cells.
  • There are about 25,000 genes in the human genome, but at least 100,000 different proteins. This difference is derived in part from alternative splicing.
  • Alternative splicing allows for a greater variety of proteins than could be predicted by just the number of genes in the genome.
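
A small sketch shows how quickly alternative splicing multiplies the number of possible products from one gene. It makes a deliberately simple assumption that is not in the notes: the first and last exons are always kept, and every internal exon can be included or skipped independently. The function name `isoforms` and the exon labels are hypothetical.

```python
from itertools import combinations

def isoforms(exons: list) -> list:
    """List every exon combination that keeps the first and last exon, in gene order."""
    first, *middle, last = exons
    results = []
    for kept_count in range(len(middle) + 1):
        # combinations() preserves the original exon order within each choice
        for kept in combinations(middle, kept_count):
            results.append([first, *kept, last])
    return results

for variant in isoforms(["E1", "E2", "E3", "E4"]):
    print("-".join(variant))
# E1-E4
# E1-E2-E4
# E1-E3-E4
# E1-E2-E3-E4
```

Under this assumption a gene with n optional internal exons could give up to 2^n mRNAs, although real cells use only a subset of the possible combinations.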

EUKARYOTIC mRNA PROCESSING

  • UTR = untranslated region
  • The final mRNA is usually shorter than the pre-mRNA and is often NOT co-linear with the DNA that encodes it.

Flow of information

  • In eukaryotes, the nuclear envelope separates transcription (in the nucleus) from translation (in the cytoplasm). Prokaryotes lack a nucleus, so both processes occur in the same compartment.

Polycistronic mRNA in prokaryotes

  • A polycistronic mRNA, one that codes for more than one protein, is common in bacteria but rare in eukaryotes.
  • The proteins encoded by one such mRNA are often enzymes of the same metabolic pathway.

Functions of DNA

  • DNA has four important functions:
    1. Genetic material stores genetic information—millions of nucleotides; base sequence encodes huge amounts of information.
    2. Genetic material is susceptible to mutation—a change in information—possibly a simple alteration to a sequence.
    3. Genetic material is precisely replicated in cell division—by complementary base pairing.
    4. Genetic material is expressed as the phenotype—nucleotide sequence determines sequence of RNA and amino acids in polypeptides.

Homework Questions

  • To determine the template strand for transcription, identify the strand complementary to the RNA transcript.
  • The minimum length of a gene that codes for a polypeptide 100 amino acids long is 300 nucleotides (100 amino acids × 3 nucleotides per codon), not counting the stop codon.
  • Given a template strand of DNA with the sequence 5’ AGT 3’, the corresponding codon on the mRNA is 5’ ACU 3’. The mRNA is complementary and antiparallel to the template, so the pairing gives 3’ UCA 5’, which reads 5’ ACU 3’.
  • Assuming the promoter lies upstream of the gene on double-stranded DNA, find the sequence of the encoded RNA by writing the sequence complementary to the template strand, with U (uracil) in place of T.
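
The homework answers can be checked with a short script. This is a sketch under simple assumptions: both DNA strands are written 5’ to 3’, the RNA matches one strand exactly, and all function names are invented for this example.

```python
# Base-pairing rules: each DNA template base and the RNA base paired with it.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}

def rna_from_template(template_5to3: str) -> str:
    """RNA (5'->3') that would be copied from this strand if it were the template."""
    return "".join(PAIRING[base] for base in reversed(template_5to3))

def pick_template(strand_a: str, strand_b: str, rna: str) -> str:
    """Return whichever DNA strand (both written 5'->3') is complementary to the RNA."""
    for strand in (strand_a, strand_b):
        if rna_from_template(strand) == rna:
            return strand
    raise ValueError("neither strand is complementary to the RNA")

def min_coding_nucleotides(amino_acids: int) -> int:
    """Nucleotides needed for the codons alone (no stop codon, no untranslated regions)."""
    return amino_acids * 3

print(rna_from_template("AGT"))            # ACU
print(pick_template("ACT", "AGT", "ACU"))  # AGT
print(min_coding_nucleotides(100))         # 300
```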