Transcription and RNA Processing

The Central Dogma and Information Flow

  • The fundamental pathway of biological information flow is described by the Central Dogma: DNA → RNA → Protein.
  • DNA Replication: The process by which DNA makes a copy of itself (DNA → DNA), catalyzed by DNA Polymerase.
  • Transcription: The synthesis of RNA from a DNA template (DNA → RNA), catalyzed by RNA Polymerase.
  • Translation: The decoding of RNA into a sequence of amino acids to form a protein (RNA → Protein), occurring at the Ribosome.
  • Heritability: Watson and Crick noted that the specific pairing of nitrogenous bases suggests a possible copying mechanism for genetic material. DNA is the heritable material of life.
  • Protein Function: For proteins, structure equals function. The 3D shape of a protein dictates its activity. This shape is determined by the specific sequence of amino acids, which is ultimately dictated by the DNA sequence.
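
The flow DNA → RNA → Protein can be sketched in a few lines of Python. This is a minimal illustration, not a model of the enzymes involved: the function names and the toy gene are assumptions, the input is taken to be the coding strand, and translation simply starts at the first base and stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, with codons ordered U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna: str) -> str:
    """Return the RNA matching the coding strand (T replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first base until a stop codon ('*') or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


toy_gene = "ATGGCCTGGTAA"  # hypothetical coding-strand sequence
rna = transcribe(toy_gene)
print(rna, translate(rna))  # AUGGCCUGGUAA MAW
```

Changing a single base in toy_gene and rerunning shows how a DNA change propagates to the amino acid sequence.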

Gene Expression and Amplification

  • Definition of Expression: A gene is considered "expressed" when the information within the DNA affects the cell's properties and activities (e.g., when a protein-coding gene results in a functional protein).
  • Gene to Protein Relationship: In eukaryotic cells, one gene typically contains instructions for building one protein (with some complexities such as alternative splicing).
  • Amplification: Transcription allows for the creation of many identical copies of RNA from a single DNA gene. Subsequently, translation allows for many identical proteins to be made from a single RNA molecule.
      - This process facilitates a rapid and dramatic response to cellular changes or needs.
      - Amplification is a luxury, not a necessity; the level of protein production must be matched by the level of regulated gene expression.
  • Regulation: Both the timing and the amount of RNA produced are strictly regulated during transcription.
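
The two-stage amplification described above is a simple multiplication, which the sketch below makes explicit. The function name and all of the numbers are hypothetical and were chosen only to show how regulating the RNA copy number scales the protein output.

```python
def proteins_per_gene(rna_copies: int, proteins_per_rna: int) -> int:
    """Two-stage amplification: each RNA copy is translated many times."""
    return rna_copies * proteins_per_rna


# Hypothetical numbers, chosen only to show the multiplication.
low = proteins_per_gene(rna_copies=10, proteins_per_rna=100)
high = proteins_per_gene(rna_copies=1_000, proteins_per_rna=100)
print(low, high)  # 1000 100000
```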

Chemical and Structural Differences: DNA vs. RNA

  • Nucleotide Structure: Both are linear polymers of four nucleotides held together by 5' to 3' phosphodiester bonds.
  • Sugar Component:
      - RNA uses Ribose, which has a hydroxyl (-OH) group at the 2' position.
      - DNA uses Deoxyribose, which has a hydrogen (-H) at the 2' position.
  • Bases:
      - RNA contains Adenine (A), Guanine (G), Cytosine (C), and Uracil (U).
      - DNA contains Adenine (A), Guanine (G), Cytosine (C), and Thymine (T).
      - Uracil (U) in RNA base-pairs with Adenine (A) just as T does in DNA.
  • Physical Structure:
      - DNA is typically a double-stranded, stable, and static information store.
      - RNA is usually single-stranded. This allows for intra-molecular base pairing, resulting in complex 3D folding and catalytic properties.
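
The base differences and the idea of intra-molecular pairing can be illustrated with a short sketch. The helper names are assumptions, and the stem check is deliberately crude: it only asks whether the two ends of a single RNA strand are reverse-complementary, which is the simplest way a strand can fold back on itself.

```python
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def classify(seq: str) -> str:
    """Guess DNA vs RNA from the bases present (T only in DNA, U only in RNA)."""
    bases = set(seq.upper())
    if not bases or not bases <= set("ACGTU"):
        raise ValueError("sequence must contain only A, C, G, T, U")
    if "T" in bases and "U" in bases:
        return "mixed"
    if "U" in bases:
        return "RNA"
    if "T" in bases:
        return "DNA"
    return "either"


def ends_can_pair(rna: str, stem_length: int) -> bool:
    """True if the first and last stem_length bases could pair as a hairpin stem."""
    if 2 * stem_length > len(rna):
        raise ValueError("stem is longer than half the strand")
    five_prime = rna[:stem_length]
    three_prime = rna[-stem_length:]
    # Pairing is antiparallel, so compare against the reverse complement.
    partner = "".join(RNA_PAIRS[b] for b in reversed(three_prime))
    return five_prime == partner


print(classify("ATGC"), classify("AUGC"))  # DNA RNA
print(ends_can_pair("GGGAAAUCCC", 3))      # True
print(ends_can_pair("GGGAAAUAAA", 3))      # False
```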

The RNA World Hypothesis

  • This hypothesis suggests that RNA was the basis of all life before the evolution of DNA and proteins.
  • Comparative properties:
      - DNA: Information storing, stable, static.
      - RNA: Somewhere in-between; stores information but is unstable; possesses catalytic activity but is inefficient; can be static or dynamic.
      - Protein: Information-empty, catalytic, dynamic.

Diversity and Function of RNA Molecules

  • Messenger RNA (mRNA): Comprises approximately 5% of total cellular RNA. It serves as a "middle man" carrying genetic information from the nucleus to the cytosol for translation.
  • Ribosomal RNA (rRNA): Accounts for approximately 80% of total cellular RNA. These molecules associate with proteins to form ribosomes.
      - Eukaryotic rRNAs include: 18S, 28S, 5S, and 5.8S.
      - Ribosomes possess "peptidyl transferase" activity, which is catalyzed by the rRNA itself; the ribosome is therefore a ribozyme.
      - Genes for rRNA are clustered in the Nucleolus.
  • Transfer RNA (tRNA): The smallest of the three major RNAs. They act as adaptors between mRNA codons and amino acids during protein synthesis.
  • Micro RNA (miRNA): Single-stranded molecules approximately 21-23 nucleotides in length. They are transcribed but not translated; they regulate gene expression by binding to and down-regulating mRNA.
  • Other Small RNAs: Used in processes such as RNA splicing and telomere maintenance.

The Mechanism of Transcription

  • Unwinding: The process begins with the opening and unwinding of the DNA double helix immediately "upstream" (5') of a gene.
  • Strand Usage:
      - Template Strand: The specific DNA strand used as a guide for RNA synthesis. RNA Polymerase reads the template in the 3' → 5' direction, so the RNA grows in the 5' → 3' direction.
      - Non-template (Coding) Strand: The DNA strand whose sequence matches the newly synthesized RNA (except with T instead of U).
  • RNA Polymerase (RNAP):
      - Catalyzes the formation of phosphodiester bonds between ribonucleoside triphosphates (NTPs).
      - Unlike DNA Polymerase, RNA Polymerase can initiate synthesis without a primer.
      - The high-energy bonds of the NTPs power the reaction.
      - Speed: Approximately 30 bases/second.
      - Error Rate: About 1 in 10^4 nucleotides. This higher error rate is tolerated because RNAs are short-lived and mutant proteins are transient.
  • Simultaneous Transcription: Multiple RNA Polymerases can transcribe a single gene simultaneously because the RNA strand does not remain base-paired with the DNA template; the DNA closes back up immediately behind the enzyme.
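
The strand relationships and the quoted speed and error rate can be checked with a small sketch. It assumes the template is written in the 3' → 5' direction (the order in which the polymerase reads it), so the RNA comes out 5' → 3' base by base; the function names, the toy template, and the 3,000-nucleotide gene length are illustrative assumptions.

```python
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
TEMPLATE_TO_CODING = {"A": "T", "T": "A", "G": "C", "C": "G"}

SPEED_NT_PER_SECOND = 30     # approximate rate quoted in these notes
NT_PER_ERROR = 10_000        # about one error per 10^4 nucleotides


def transcribe_template(template_3_to_5: str) -> str:
    """Build the RNA (5' -> 3') complementary to a template read 3' -> 5'."""
    return "".join(TEMPLATE_TO_RNA[b] for b in template_3_to_5.upper())


def coding_strand(template_3_to_5: str) -> str:
    """Return the non-template strand (5' -> 3') paired with the template."""
    return "".join(TEMPLATE_TO_CODING[b] for b in template_3_to_5.upper())


def transcription_seconds(gene_length_nt: int) -> float:
    """Time for one polymerase to traverse the gene at the quoted speed."""
    return gene_length_nt / SPEED_NT_PER_SECOND


def expected_errors(gene_length_nt: int) -> float:
    """Average number of misincorporated bases per transcript."""
    return gene_length_nt / NT_PER_ERROR


template = "TACCGGACC"  # hypothetical template, written 3' -> 5'
rna = transcribe_template(template)
print(rna)                                              # AUGGCCUGG
print(coding_strand(template).replace("T", "U") == rna)  # True
print(transcription_seconds(3_000), expected_errors(3_000))  # 100.0 0.3
```

At these rates a 3,000-nucleotide transcript takes about 100 seconds and carries an error in roughly one of every three copies, which is why the short lifetime of RNA matters for tolerating mistakes.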

Transcription in Prokaryotes (Bacteria)

  • Promoter Recognition: Bacterial RNA Polymerase has a general weak affinity for DNA. It "scans" the helix until it encounters a Promoter sequence.
  • Sigma (σ) Factor: A subunit of the bacterial RNA Polymerase complex responsible for recognizing and binding to the promoter.
  • Initiation and Termination:
      - Once initiated, the σ factor dissociates, allowing the enzyme to continue elongation.
      - Transcription continues until the polymerase hits a Terminator sequence (stop signal).
  • Simplicity: In bacteria, newly transcribed mRNAs are immediately bound by ribosomes for translation because there is no nuclear compartment.

Transcription in Eukaryotes

  • Eukaryotic Complexity: Eukaryotic cells have three types of RNA Polymerase:
      - RNAP I: Transcribes most rRNA genes.
      - RNAP II: Transcribes protein-coding genes, miRNA genes, and genes for some small RNAs (e.g., those in spliceosomes).
      - RNAP III: Transcribes tRNA genes, the 5S rRNA gene, and other small RNAs.
  • General Transcription Factors (GTFs):
      - Eukaryotic RNAPs cannot initiate transcription alone; they require GTFs to recruit the enzyme, position it, separate the DNA strands, and launch the polymerase.
      - TFIID: The first GTF to bind. It contains the TBP (TATA-binding protein) subunit.
      - TATA Box: A specific promoter sequence rich in T and A (e.g., TATAAA) found approximately 25 bases upstream (position -25) of the start site.
      - DNA Kinking: Binding of TBP to the TATA box kinks the DNA approximately 90°, facilitating the assembly of other factors.
      - Basal Transcription Complex: Includes TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH.
  • RNAP II Tail Phosphorylation:
      - RNAP II has a carboxyl-terminal domain (CTD) or "tail."
      - TFIIH contains a kinase subunit that phosphorylates the tail to trigger the release of the enzyme from the promoter and signal the start of elongation.
      - Only unphosphorylated RNAP II can be recruited to a promoter; phosphates must be stripped for recycling.
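
The tail phosphorylation cycle behaves like a small state machine, sketched below. The class and method names are hypothetical, and the model tracks only the rules stated above: recruitment requires an unphosphorylated tail, phosphorylation launches elongation, and the phosphates must be removed before the enzyme can be reused.

```python
class RNAPolymeraseII:
    """Toy model of the RNAP II recruitment and recycling rules."""

    def __init__(self) -> None:
        self.state = "free"
        self.ctd_phosphorylated = False

    def recruit_to_promoter(self) -> None:
        # Only a free enzyme with an unphosphorylated tail can be recruited.
        if self.state != "free" or self.ctd_phosphorylated:
            raise RuntimeError("RNAP II must be free and unphosphorylated")
        self.state = "at_promoter"

    def phosphorylate_ctd(self) -> None:
        # Models the TFIIH kinase step that releases the enzyme.
        if self.state != "at_promoter":
            raise RuntimeError("the tail is phosphorylated at the promoter")
        self.ctd_phosphorylated = True
        self.state = "elongating"

    def terminate(self) -> None:
        # The enzyme leaves the gene but keeps its phosphates for now.
        if self.state != "elongating":
            raise RuntimeError("only an elongating enzyme can terminate")
        self.state = "free"

    def dephosphorylate_ctd(self) -> None:
        # Stripping the phosphates makes the enzyme recruitable again.
        self.ctd_phosphorylated = False


pol = RNAPolymeraseII()
pol.recruit_to_promoter()
pol.phosphorylate_ctd()
pol.terminate()
pol.dephosphorylate_ctd()
pol.recruit_to_promoter()  # allowed again after recycling
print(pol.state)           # at_promoter
```

Skipping dephosphorylate_ctd() before the second recruitment raises an error, which mirrors the recycling requirement.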

Gene Anatomy and Regulatory Regions

  • Exons: The protein-coding regions of a gene.
  • Introns: Noncoding intervening sequences that interrupt exons. They can range from a single nucleotide to 10,000 nucleotides.
  • Consensus Sequences: Conserved markers (like the TATA box) found at recognition sites.
  • Regulatory Regions:
      - Basal Promoter: Includes the proximal component (TATA box) for positioning and a distal component (e.g., CAAT or GC boxes) that specifies the frequency of initiation.
      - Enhancers/Repressors: Sequences that regulate expression levels. They can function at great distances (hundreds/thousands of bases) from the start site and are orientation-independent.
  • Upstream and Downstream: Noncoding regions at the 5' end are "upstream"; regions at the 3' end are "downstream."

Eukaryotic mRNA Processing

  • Compartmentalization: Since eukaryotes have a nucleus, mRNAs must be processed and inspected before export to the cytoplasm. This provides Quality Control.
  • 5' Cap: A methylated guanosine residue (7-methylguanosine) added to the 5' end through an unusual 5'-to-5' triphosphate linkage. It protects the transcript from 5' exonuclease degradation and helps in ribosome binding.
  • Poly-A Tail: A sequence of approximately 200-250 adenine nucleotides added to the 3' end by Poly(A) Polymerase.
      - The Polyadenylation Signal (AAUAAA) is recognized by an endonuclease that cleaves the transcript before the tail is added.
  • Circularization: Interaction between proteins at the 5' cap and the Poly-A tail causes the mRNA to circularize, signaling that it is genuine and intact.
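
The capping and tailing steps can be sketched as string operations on a transcript. Everything beyond the AAUAAA signal and the tail length is an assumption made for illustration: the "m7G-" prefix is just a label for the cap, the cleavage point is taken to be a fixed 20 nucleotides past the signal, and the last AAUAAA in the transcript is treated as the real signal.

```python
POLY_A_SIGNAL = "AAUAAA"
CAP_LABEL = "m7G-"  # text stand-in for the 5' cap, not a real base


def cap_and_polyadenylate(pre_mrna: str,
                          cleave_offset: int = 20,
                          tail_length: int = 200) -> str:
    """Add a cap label, cleave downstream of the signal, append the tail."""
    signal_start = pre_mrna.rfind(POLY_A_SIGNAL)
    if signal_start == -1:
        raise ValueError("no polyadenylation signal found")
    # Cleave a fixed distance after the signal (assumed offset).
    cut = min(len(pre_mrna), signal_start + len(POLY_A_SIGNAL) + cleave_offset)
    return CAP_LABEL + pre_mrna[:cut] + "A" * tail_length


transcript = "GGCAUGGCC" + POLY_A_SIGNAL + "C" * 30  # hypothetical pre-mRNA
processed = cap_and_polyadenylate(transcript)
print(processed[:13])      # m7G-GGCAUGGCC
print(len(processed))      # 239
```

The ten C residues past the cut are discarded, matching the idea that the transcript is cleaved before the tail is added.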

RNA Splicing and the Spliceosome

  • Splicing: The process of removing introns and joining exons together to form a mature, all-coding mRNA.
  • Spliceosome: A large complex made of five small nuclear RNAs (U1, U2, U4, U5, and U6) and over 50 proteins. Each snRNA together with its associated proteins forms a small nuclear ribonucleoprotein, or snRNP ("snurp").
  • Mechanism:
      - Specific sequences at the intron ends (GU at the 5' end and AG at the 3' end) mark the splice sites.
      - A "branch point" Adenine (A) within the intron attacks the 5' splice site.
      - The intron is removed in the shape of a Lariat (a loop structure) and is discarded.
  • Function of Introns: While often viewed as "junk" or evolutionary relics, introns allow for Alternative Splicing.
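
Intron removal can be sketched as cutting known intervals out of a pre-mRNA string. The sketch assumes the intron coordinates are already known and only verifies the GU and AG boundary rule before joining the exons; it does not try to predict splice sites or model the branch point, and the sequences are made up.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as (start, end) slices; ends must be GU...AG."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} lacks GU...AG ends")
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)


# Hypothetical pieces: two exons separated by one intron.
exon_1 = "AUGGCC"
intron = "GUAAGUACUAACUUUCAG"
exon_2 = "UGGUAA"
pre_mrna = exon_1 + intron + exon_2

start = len(exon_1)
end = start + len(intron)
print(splice(pre_mrna, [(start, end)]))  # AUGGCCUGGUAA
```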

Alternative Splicing and Modularization

  • Alternative Splicing: Splicing non-adjacent exons together to create different protein variants from a single gene.
      - Exon order is always preserved; introns are always discarded.
      - Approximately 60% of human genes undergo alternative splicing, greatly increasing "protein potential."
  • Modular Protein Synthesis: Exons often encode individual functional modules called Domains. Splicing allows these domains to be "traded" or combined like attachments on a kitchen tool (e.g., a mixer with multiple possible attachments like a juicer or pasta roller).
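
How exon skipping multiplies the number of possible products can be sketched by enumerating exon combinations. The sketch assumes the first and last exons are always kept and any internal exon may be skipped, which is a simplification; real genes use only a subset of these combinations. The exon names are placeholders.

```python
from itertools import combinations


def possible_isoforms(exons: list[str]) -> list[list[str]]:
    """List exon combinations that keep the ends and preserve exon order."""
    first, *internal, last = exons
    isoforms = []
    # combinations() keeps items in their original order, so order is preserved.
    for kept_count in range(len(internal), -1, -1):
        for kept in combinations(internal, kept_count):
            isoforms.append([first, *kept, last])
    return isoforms


for isoform in possible_isoforms(["E1", "E2", "E3", "E4"]):
    print("-".join(isoform))
# E1-E2-E3-E4
# E1-E2-E4
# E1-E3-E4
# E1-E4
```

Under this assumption a gene with n internal exons has 2^n possible isoforms, which gives a sense of the "protein potential" a single gene can carry.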

Nuclear Export and mRNA Lifespan

  • Nuclear Pore Complex (NPC): Highly selective gates. mRNAs are only allowed out if they are bound by specific proteins at the cap, splice junctions, and tail (e.g., Cap-binding protein, Exon Junction Complex (EJC), and Poly-A-binding protein).
  • Degradation: All RNAs are eventually degraded by RNases.
      - Lifespan varies from minutes to days.
      - Lifespan is often controlled by sequences in the 3' UTR (3' Untranslated Region), which lies past the protein-coding sequence.
      - Inhibition: Regulatory factors (like miRNAs) can bind to the 3' UTR to modulate translation or stability.
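
Binding of a miRNA to a 3' UTR can be sketched as a search for complementary sequence. The sketch assumes that pairing is driven by a short "seed" near the 5' end of the miRNA (taken here as nucleotides 2-8) and that a perfect reverse-complement match counts as a site; both the seed rule and the sequences are simplifying assumptions, and real target recognition is more permissive.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the antiparallel pairing partner of an RNA sequence."""
    return "".join(RNA_COMPLEMENT[b] for b in reversed(rna))


def find_seed_sites(utr: str, mirna: str) -> list[int]:
    """Return 0-based UTR positions that pair with miRNA nucleotides 2-8."""
    seed = mirna[1:8]                  # nucleotides 2 through 8
    target = reverse_complement(seed)  # what the UTR must contain
    sites = []
    index = utr.find(target)
    while index != -1:
        sites.append(index)
        index = utr.find(target, index + 1)
    return sites


mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # toy 22-nucleotide miRNA
utr = "AAAC" + "CUACCUC" + "GGAUU" + "CUACCUC" + "A"  # hypothetical 3' UTR
print(find_seed_sites(utr, mirna))  # [4, 16]
```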

Clinical Correlation and Pharmacology

  • Retroviruses (e.g., HIV): Possess an RNA genome and use Reverse Transcriptase to copy RNA into DNA. This is the reverse of the usual information flow. The viral DNA then integrates into the host genome.
  • Antiviral Drugs:
      - AZT (Zidovudine) and ddI (Dideoxyinosine): Nucleoside analogues that inhibit reverse transcriptase. Host kinases phosphorylate them, and they are incorporated into the growing viral DNA chain, where the missing 3' hydroxyl group causes chain termination.
  • Antibiotics:
      - Rifampin: Specifically inhibits bacterial RNA Polymerase by interfering with the enzyme-promoter complex. It is a primary treatment for Tuberculosis.
      - Isoniazid: Often used alongside rifampin as an antimetabolite for tuberculosis treatment.