Transcription and the Mechanisms of Gene Expression

Overview of Gene Expression and Central Dogma

  • Gene expression is the biological process of converting genetic information stored in DNA into functional proteins.
  • This process is divided into two primary stages:
    • Transcription: The conversion of DNA into messenger RNA (mRNA). This is the focus of Chapter 3.
    • Translation: The conversion of mRNA into a chain of amino acids, which ultimately forms a protein. This is the focus of Chapter 4.
  • The cell is the basic unit of living tissue. Within human cells, the nucleus contains the genome, split among 23 pairs of chromosomes.
  • Each chromosome consists of a long strand of DNA tightly packaged around proteins known as histones. Specific sections of this DNA are called genes, which contain instructions for making proteins.

The Process of Transcription

  • Transcription is the first stage of gene expression.
  • When a gene is "switched on," an enzyme called RNA polymerase attaches to the start of the gene.
  • RNA polymerase moves along the DNA, unzipping the double helix and reading one of the two strands (the template strand).
  • It uses free bases in the nucleus to assemble a strand of mRNA.
  • The DNA code determines the order of the bases added to the mRNA. In RNA, the base Thymine (T) is replaced by Uracil (U), which is a close chemical cousin.
  • Transcription Unit: A region of DNA extending from the promoter to the terminator.
  • Coding Strand vs. Template Strand:
    • Template Strand: The strand used by RNA polymerase to make the complementary mRNA.
    • Coding Strand: The DNA strand whose sequence matches the resulting mRNA (with U replacing T).
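
The template/coding relationship above can be sketched in a few lines of Python. This is a minimal illustration, not a bioinformatics tool: the function names and the example sequence are made up for this sketch, and the template is assumed to be written 3'->5' so that the output reads 5'->3'.

```python
# Base added to the RNA opposite each DNA template base (U pairs with A in place of T).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') made from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def coding_strand(template_3_to_5: str) -> str:
    """Return the coding strand (5'->3'): the mRNA sequence with T in place of U."""
    return transcribe(template_3_to_5).replace("U", "T")


template = "TACGGCATT"  # hypothetical template strand, written 3'->5'
print(transcribe(template))     # AUGCCGUAA
print(coding_strand(template))  # ATGCCGTAA
```

The coding strand and the mRNA differ only in T versus U, which is why gene sequences are conventionally reported as the coding strand.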

Experimental Confirmation of messenger RNA

  • Scientists sought to determine how information moved from the nucleus (DNA site) to the cytoplasm (protein synthesis site).
  • An experiment was performed using E. coli bacteria:
    • Bacteria were grown for several generations on media containing heavy isotopes: Nitrogen-15 (¹⁵N) and Carbon-13 (¹³C).
    • All cellular components, including ribosomes, became "heavy labeled."
    • The bacteria were then infected with the T2 bacteriophage virus.
    • The phage destroyed the bacterial DNA and substituted its own genetic information to direct viral protein synthesis.
  • Hypothesis: If ribosomes carried specific gene information, new virus-specific ribosomes would need to be made. If ribosomes were passive sites, old bacterial ribosomes could be used to make viral proteins.
  • Results: Radio-labeled phage RNA was found associated with all the old bacterial ribosomes. This confirmed that ribosomes do not carry the genetic information themselves; rather, an unstable class of RNA (messenger RNA) carries the code from DNA to the ribosome.

Molecular Components of Transcription

  • Transcription requires an enzyme (RNA polymerase) to catalyze the formation of phosphodiester bonds.
  • Chemistry of Synthesis:
    • The 3'-hydroxyl (OH) group of the growing RNA chain attacks the alpha phosphate of the incoming nucleotide, releasing pyrophosphate.
    • Requirements: A DNA template, and the nucleotides ATP, CTP, GTP, and UTP.
    • Direction: The RNA chain grows in the 5' to 3' direction.
    • Unlike DNA polymerase, RNA polymerase does not require a primer to initiate synthesis.
  • Prokaryotic RNA Polymerase Subunits:
    • Core Enzyme: Composed of two α subunits, one β subunit, one β' subunit, and one ω subunit.
    • Holoenzyme: The core enzyme plus the σ (sigma) factor. Only the holoenzyme can initiate transcription.
    • α Subunits: Encoded by rpoA; help recognize the UP element in strong promoters.
    • β and β' Subunits: Encoded by rpoB and rpoC; involved in DNA binding and phosphodiester bond formation.
    • σ Factor: Encoded by rpoD; directs the enzyme to the promoter and provides specificity. It dissociates from the core enzyme after initiation and can be recycled.

Promoter Sequences and Numbering

  • Nucleotide Numbering:
    • The first nucleotide at the transcription start site is numbered +1.
    • Downstream sequences are numbered positively (+2, +3, etc.).
    • Upstream sequences (before the start site) are numbered negatively (-1, -2, etc.). There is no position 0.
  • Prokaryotic Promoters:
    • Consist of two critical consensus sequences: the -10 sequence (TATAAT, also called the Pribnow box) and the -35 sequence.
    • Strong promoters may include an "UP element" located between -40 and -60.
  • Eukaryotic Promoters:
    • Primarily involve RNA Polymerase II for mRNA.
    • A key consensus sequence is the TATA box located at approximately -25.
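
Because promoter numbering skips zero, converting between a sequence index and a promoter position is a common source of off-by-one errors. The sketch below shows one way to do the conversion; the function names and the start-site index of 60 are hypothetical choices for illustration, with indices assumed to be 0-based.

```python
def position_label(index: int, tss_index: int) -> int:
    """Map a 0-based sequence index to promoter numbering (+1 at the start site, no 0)."""
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


def index_from_label(label: int, tss_index: int) -> int:
    """Map a promoter position such as -35 or +1 back to a 0-based sequence index."""
    if label == 0:
        raise ValueError("promoter numbering has no position 0")
    return tss_index + label - 1 if label > 0 else tss_index + label


tss = 60  # hypothetical 0-based index of the transcription start site
for label in (-35, -10, -1, +1, +2):
    print(label, index_from_label(label, tss))
```

With these helpers, the base immediately upstream of +1 is -1, matching the convention described above.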

Stages of Transcription: Initiation, Elongation, and Termination

  • Initiation:
    • The σ factor directs the holoenzyme to the promoter to form a "closed promoter complex."
    • The holoenzyme melts a short region of DNA (10 to 17 base pairs) to create an "open promoter complex," which contains the transcription bubble.
    • The first base in the RNA is typically a purine (A or G).
  • Elongation:
    • Once the first few phosphodiester bonds are formed, the σ factor dissociates.
    • The core enzyme continues adding nucleotides sequentially.
    • The transcription bubble consists of approximately 14 unpaired bases; the first 9 are used to transcribe new RNA.
    • Physical Topology: Opening the helix introduces positive supercoils ahead of the polymerase and negative supercoils behind it. Topoisomerases are required to relax these supercoils.
    • RNA polymerase may pause or backtrack to proofread the newly synthesized RNA.
    • Transcription Bubble Detection: Dimethyl sulfate (DMS) transfers a methyl group to Adenine in open regions. S1 nuclease then cleaves the un-base-paired DNA, which can be visualized on a gel to identify the bubble's location (e.g., between -9 and +3).
  • Termination:
    • Intrinsic (Rho-independent) Terminator: Involves a GC-rich inverted repeat that forms a hairpin loop in the RNA, followed by a string of 7 to 9 Uracil (U) bases. The hairpin stalls the polymerase, and the weak A-U pairing of the RNA-DNA hybrid lets the transcript and polymerase fall off.
    • Rho-dependent Terminator: Lacks the poly-U string. A protein factor called Rho binds the nascent RNA, follows the RNA polymerase, and catches up with it when it pauses at a hairpin, unwinding the RNA-DNA hybrid to release the transcript.
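
The intrinsic terminator described above (a GC-rich inverted repeat followed by a run of U) can be searched for with a simple scan. The sketch below is a toy under stated assumptions: the stem length, loop sizes, GC threshold, and the requirement that the stem sit directly next to the U run are all illustrative choices, not values from these notes (only the minimum of 7 U is), and the example transcript is invented. Real terminator prediction uses thermodynamic folding rather than exact matching.

```python
import re

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_intrinsic_terminators(rna, stem_len=6, loop_sizes=range(3, 9),
                               min_u=7, min_gc=0.6):
    """Return (hairpin_start, u_run_end) index pairs for candidate terminators.

    A candidate is a perfect inverted repeat (two stems of stem_len bases
    separated by a short loop) that is GC-rich and directly followed by
    at least min_u consecutive U bases.
    """
    hits = []
    for match in re.finditer("U{%d,}" % min_u, rna):
        u_start = match.start()
        if u_start < stem_len:
            continue  # no room for a stem upstream of the U run
        right_stem = rna[u_start - stem_len:u_start]
        gc_fraction = sum(base in "GC" for base in right_stem) / stem_len
        if gc_fraction < min_gc:
            continue
        for loop_len in loop_sizes:
            left_start = u_start - 2 * stem_len - loop_len
            if left_start < 0:
                break
            left_stem = rna[left_start:left_start + stem_len]
            if left_stem == reverse_complement(right_stem):
                hits.append((left_start, match.end()))
                break
    return hits


# Hypothetical transcript: prefix + stem + loop + stem + U run + suffix.
transcript = "AUGAAA" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUUUU" + "GA"
print(find_intrinsic_terminators(transcript))  # [(6, 30)]
```

A Rho-dependent terminator would not be found by this scan, since it lacks the U run that the search is keyed on.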

Regulation of Bacterial Transcription

  • Gene expression is energetically expensive, so cells regulate which genes are active. E. coli has more than 3,000 genes, but not all are transcribed simultaneously.
  • Strategies for Regulation:
    1. Alternative Sigma Factors: Bacteriophages like SPO1 use specific proteins (gp28, gp33, gp34) to redirect host RNA polymerase to viral genes. Similarly, E. coli uses σ32 for the heat shock response.
    2. Anti-sigma Factors: The Rsd protein binds to σ70 during environmental stress, blocking its activity.
    3. RNA Polymerase Switching: Bacteriophage T7 encodes its own RNA polymerase to transcribe its late-phase genes.
    4. Anti-termination: Proteins like the N and Q proteins in lambda phage prevent premature termination, allowing read-through into subsequent genes.
    5. Operons: Groups of contiguous, coordinate-controlled genes transcribed as a single "polycistronic" message.
    6. Transcription Attenuation: Premature termination based on leader sequences (e.g., the trp operon).
    7. Riboswitches: RNA regions in the 5' untranslated region (UTR) that change conformation upon binding a ligand (e.g., FMN binding in the ribD operon).

The Operon Model and Positive/Negative Control

  • Negative Control (Induction): A repressor binds the operator to inhibit transcription until an inducer is present (e.g., Lac operon).
  • Negative Control (Repression): An inactive repressor becomes active only in the presence of a co-repressor (e.g., Tryptophan operon).
  • Positive Control: An activator (like the CAP-cAMP complex) must bind near the promoter for full transcription activity.
  • The Lac Operon:
    • In the absence of lactose, the repressor produced by the lacI gene binds the operator, blocking transcription.
    • In the presence of lactose, the inducer (allolactose, derived from lactose) binds the repressor, causing it to release the operator.
    • Catabolite Repression: If glucose is present, cAMP levels are low. If glucose is absent, cAMP levels rise, forming the CAP-cAMP complex, which acts as an activator of the lac operon.
    • The lac operon is fully active only when glucose is absent and lactose is present.
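
The combined negative and positive control of the lac operon amounts to a small truth table, sketched below. The function name and the three output labels are assumptions made for this illustration; in a real cell "off" corresponds to a very low basal level rather than zero transcription.

```python
def lac_operon_state(lactose: bool, glucose: bool) -> str:
    """Return a qualitative transcription level for the lac operon."""
    repressor_on_operator = not lactose  # inducer absent, so the repressor stays bound
    cap_camp_bound = not glucose         # low glucose raises cAMP, so CAP-cAMP forms
    if repressor_on_operator:
        return "off"
    return "high" if cap_camp_bound else "low"


for lactose in (False, True):
    for glucose in (False, True):
        state = lac_operon_state(lactose, glucose)
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} -> {state}")
```

Only the lactose-present, glucose-absent combination returns "high", in line with the last bullet above.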

Eukaryotic Transcription Machinery

  • Eukaryotes use three distinct RNA polymerases:
    • RNA Polymerase I: Located in the nucleolus; synthesizes the large ribosomal RNA (rRNA) precursor that is processed into 28S, 18S, and 5.8S rRNA.
    • RNA Polymerase II: Located in the nucleoplasm; synthesizes mRNA precursors.
    • RNA Polymerase III: Located in the nucleoplasm; synthesizes tRNA precursors, 5S rRNA, and some small nuclear RNAs (such as U6).
  • Transcription Factors (TFs):
    • General Transcription Factors (GTFs): Required for basal level transcription. For Pol II, the assembly sequence is: TFIID (containing TATA-binding protein, or TBP) + TFIIA → TFIIB → TFIIF/Polymerase II → TFIIE → TFIIH.
    • Specific Transcription Factors (Activators): Bind to enhancers and contain DNA-binding motifs.
  • DNA-Binding Motifs:
    • Zinc-containing modules: e.g., Zinc finger (zinc coordinated by two Cys and two His) or the GAL4 bimetal cluster (six Cys coordinating two zinc ions).
    • Homeodomain: Contains 60 amino acids and a helix-turn-helix motif.
    • bZIP and bHLH: bZIP proteins dimerize via a leucine zipper; bHLH proteins dimerize via their helix-loop-helix domain. In both, a basic region contacts the DNA.

Epigenetic Regulation in Eukaryotes

  • DNA is packaged into nucleosomes: 147 base pairs of DNA wrapped in approximately 1.75 turns around an octamer of core histones (two each of H2A, H2B, H3, and H4).
  • Histone Modification: Covalent changes (acetylation, methylation, phosphorylation) to positively charged histone tails alter their interaction with negatively charged DNA. Acetylation, for example, neutralizes positive charge and makes DNA more accessible to RNA polymerase.
  • Chromatin Remodeling: Complexes use energy from ATP to move nucleosomes so RNA polymerase can access genes.
  • Overcoming the Nucleosomal Barrier: During elongation, polymerase moves through nucleosomes via nucleosome mobilization (octamer transfer) or H2A/H2B dimer depletion.
  • Heterochromatin vs. Euchromatin:
    • Euchromatin: Open, accessible, and transcriptionally active.
    • Heterochromatin: Tightly bundled, repressive, and methylated (often at CpG islands).

Post-Transcriptional mRNA Processing

  • Eukaryotic genes contain exons (coding) and introns (intervening sequences).
  • Three primary modifications occur after transcription:
    1. 5' Cap:
    • Step 1: RNA triphosphatase removes the terminal phosphate from the 5' end of the transcript.
    • Step 2: RNA guanylyltransferase adds a GMP molecule.
    • Step 3: Guanine N7 methyltransferase adds a methyl group to the added guanine.
    • Function: Protects the RNA from degradation and serves as a binding site for ribosomes.
    2. Polyadenylation (3' tail):
    • Cleavage occurs after a CA dinucleotide between the AAUAAA hexamer and a downstream GU-rich region.
    • Poly A polymerase adds approximately 200 Adenosine (A) residues to the new 3' end.
    3. Splicing:
    • Carried out by the spliceosome (comprising snRNPs U1, U2, U4, U5, and U6).
    • Introns typically start with GU (at the 5' splice site) and end with AG (at the 3' splice site).
    • Mechanism: Two-step transesterification forms a lariat structure from the intron, which is then released and degraded.
    • Alternative splicing allows a single gene to produce multiple protein variants.
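
The GU...AG rule and the effect of alternative splicing can be illustrated with a toy splicing function. This is a sketch under simplifying assumptions: intron coordinates are supplied by hand as half-open index pairs, the sequences are invented, and the branch point, lariat, and spliceosome are not modeled.

```python
def splice(pre_mrna: str, introns: list) -> str:
    """Remove the given (start, end) half-open intron intervals from a pre-mRNA."""
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} does not follow the GU...AG rule")
        exons.append(pre_mrna[cursor:start])
        cursor = end
    exons.append(pre_mrna[cursor:])
    return "".join(exons)


# Hypothetical three-exon gene.
exon1, intron1 = "AUGGCC", "GUAAGUCAG"
exon2, intron2 = "GAUACC", "GUGAGUUAG"
exon3 = "UUCUAA"
pre_mrna = exon1 + intron1 + exon2 + intron2 + exon3

i1_start = len(exon1)
i1_end = i1_start + len(intron1)
i2_start = i1_end + len(exon2)
i2_end = i2_start + len(intron2)

# Constitutive splicing keeps all three exons.
print(splice(pre_mrna, [(i1_start, i1_end), (i2_start, i2_end)]))  # AUGGCCGAUACCUUCUAA
# Exon skipping: splicing from the first 5' site to the second 3' site drops exon 2.
print(splice(pre_mrna, [(i1_start, i2_end)]))  # AUGGCCUUCUAA
```

The second call shows how one gene can yield two different mature mRNAs simply by pairing different splice sites.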

Non-Coding RNAs and RNA Interference (RNAi)

  • Approximately 98% of the transcriptional output of the human genome is non-coding RNA.
  • Small Non-coding RNAs:
    • microRNA (miRNA): Endogenous; cut from larger hairpin precursors by the enzyme Dicer into fragments of approximately 22 nucleotides. They often bind imperfectly to targets to inhibit translation.
    • small interfering RNA (siRNA): From exogenous sources (viruses or synthetic). Cut by Dicer into fragments of 21 to 25 nucleotides. They match targets perfectly, leading to mRNA cleavage.
  • Mechanism of Action:
    • Dicer cuts the RNA, which is then passed to the RNA-induced silencing complex (RISC).
    • The Argonaute protein in the RISC complex uses the antisense (guide) strand to find the target mRNA.
    • RITS (RNA-induced initiation of transcriptional silencing) complexes can also recruit chromatin remodeling enzymes to methylate DNA and histones, resulting in transcriptional silencing.
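
The distinction between perfect pairing (siRNA-style cleavage) and imperfect pairing (miRNA-style translational inhibition) can be sketched as a simple complementarity check. Everything specific here is an assumption for illustration: the guide sequence is invented, the 0.5 cutoff for "enough" partial pairing is arbitrary, G-U wobble pairs are ignored, and real target recognition depends on more than a pairing fraction.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def paired_fraction(guide: str, target_site: str) -> float:
    """Fraction of guide bases (5'->3') that pair with an antiparallel target site (5'->3')."""
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must be the same length")
    target_3_to_5 = target_site[::-1]
    pairs = sum(RNA_COMPLEMENT[g] == t for g, t in zip(guide, target_3_to_5))
    return pairs / len(guide)


def predicted_outcome(guide: str, target_site: str, min_partial: float = 0.5) -> str:
    """Classify a guide/target pair; min_partial is an arbitrary illustrative cutoff."""
    fraction = paired_fraction(guide, target_site)
    if fraction == 1.0:
        return "mRNA cleavage"
    if fraction >= min_partial:
        return "translational inhibition"
    return "no silencing"


guide = "AUCGGAUCCUAGGCAUUACGGA"         # hypothetical 22-nt guide strand
perfect_site = reverse_complement(guide)  # fully complementary target site

# Introduce four central mismatches to mimic an imperfect miRNA-style site.
swap = {"A": "C", "C": "A", "G": "U", "U": "G"}
imperfect_site = "".join(
    swap[base] if 9 <= i <= 12 else base for i, base in enumerate(perfect_site)
)

print(predicted_outcome(guide, perfect_site))    # mRNA cleavage
print(predicted_outcome(guide, imperfect_site))  # translational inhibition
```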