Gene Transcription and RNA Modification

Fundamental Definitions and the Central Dogma

  • Gene: At the molecular level, a gene is defined as a segment of DNA used to produce a functional product, which can be either an RNA molecule or a polypeptide.
  • Transcription: This represents the first step in gene expression. Literally meaning "the act or process of making a copy," in genetics it refers to copying a DNA sequence into an RNA sequence.
    - A crucial feature of transcription is that the structure of the DNA is not altered; it remains intact to continue storing information.
  • The Central Dogma of Genetics:
    - DNA Replication: Processes that make DNA copies to be transmitted from cell to cell and parent to offspring.
    - Chromosomal DNA: Functions as the storage unit for information in units called genes.
    - Transcription: Produces an RNA copy of a specific gene.
    - Messenger RNA (mRNA): A temporary copy of a gene that provides the information required to synthesize a polypeptide.
    - Translation: The process of producing a polypeptide using the information encoded in mRNA.
    - Polypeptide: Becomes part of a functional protein which contributes to an organism's physical traits.
  • Gene Expression: The overall process by which information within a gene is used to produce a functional product. This product, in conjunction with environmental factors, determines an organism's traits.

Organization of Bacterial Gene Sequences

  • Regulatory Sequences: These are sites where regulatory proteins bind. Their primary role is to influence the rate of transcription. Regulatory sequences can be located in various positions relative to the gene.
  • Promoter: The specific DNA site for RNA polymerase binding; it signals the start of transcription.
  • Terminator: A DNA sequence that signals the end of transcription.
  • Ribosome-Binding Site: In bacteria, this is the site where the ribosome binds to the mRNA to initiate translation. Translation begins near this site.
    - Eukaryotic comparison: In eukaryotes, ribosomes bind to a 7-methylguanosine cap and scan for a start codon.
  • Codons: These are 3-nucleotide sequences within mRNA that specify particular amino acids.
    - Start Codon: Specifies the first amino acid in a polypeptide. Usually, this is formylmethionine in bacteria and methionine in eukaryotes.
    - Stop Codon: Specifies the end of polypeptide synthesis.
  • Polycistronic mRNA: Bacterial mRNA can be polycistronic, meaning a single mRNA molecule may encode two or more distinct polypeptides.
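
The codon bullets above can be illustrated with a minimal Python sketch. It reads an mRNA three nucleotides at a time from a start codon to a stop codon. The function name `read_codons` and the example sequence are hypothetical, and the sketch simply starts at the first AUG it finds; a real bacterial ribosome instead selects the start codon near the ribosome-binding site.

```python
# The three standard stop codons and the usual start codon.
STOP_CODONS = {"UAA", "UAG", "UGA"}
START_CODON = "AUG"

def read_codons(mrna):
    """Return the codons of the first reading frame found in an mRNA string.

    Simplifying assumption: reading starts at the first AUG in the string
    and proceeds in steps of three until a stop codon (included in the
    result) or until fewer than three nucleotides remain.
    """
    start = mrna.find(START_CODON)
    if start == -1:
        return []  # no start codon, nothing to read
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        codons.append(codon)
        if codon in STOP_CODONS:
            break
    return codons

# Hypothetical mRNA: leader, start codon, two sense codons, stop codon, trailer.
example_mrna = "GGAGGUAACAUGGCUAAAUAAGC"
print(read_codons(example_mrna))  # ['AUG', 'GCU', 'AAA', 'UAA']
```

A polycistronic mRNA would contain several such start-to-stop stretches, each preceded by its own ribosome-binding site.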

Strands and Base Sequences in Transcription

  • Template Strand: The DNA strand that is actually transcribed. It is also known as the antisense strand. The resulting RNA transcript is complementary to this strand.
  • Coding Strand: The DNA strand opposite the template strand. It is also called the sense strand or nontemplate strand.
    - The base sequence of the RNA is identical to the coding strand, with the exception that uracil (U) in RNA substitutes for thymine (T) in DNA.
  • Transcription Factors: Proteins that recognize the promoter and regulatory sequences to control/regulate the transcription process.
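
The relationship between the two DNA strands and the RNA transcript can be checked with a short Python sketch. The function names and the 9-base example gene are hypothetical; the template strand is written 3' to 5' so that the RNA built from it reads 5' to 3'.

```python
# DNA base on the template strand -> complementary RNA base.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe_from_template(template_3_to_5):
    """Build the RNA (5'->3') complementary to a template strand given 3'->5'."""
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in template_3_to_5)

def transcribe_from_coding(coding_5_to_3):
    """The RNA matches the coding strand except that U replaces T."""
    return coding_5_to_3.replace("T", "U")

coding = "ATGGCTTAA"    # coding (sense) strand, 5'->3'
template = "TACCGAATT"  # template (antisense) strand, 3'->5'

print(transcribe_from_template(template))  # AUGGCUUAA
print(transcribe_from_coding(coding))      # AUGGCUUAA
```

Both routes give the same transcript, which is why the coding strand is the one usually written out when a gene sequence is reported.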

The Three Stages of Transcription

Transcription involves protein-DNA interactions, specifically between RNA polymerase and DNA sequences, occurring in three distinct phases:

  1. Initiation:
     - The promoter serves as a recognition site for transcription factors.
     - Transcription factors enable RNA polymerase to bind to the promoter.
     - DNA is denatured into a bubble called the open complex.
  2. Elongation (Synthesis of RNA transcript):
     - RNA polymerase slides along the DNA within the open complex.
     - RNA is synthesized in the 5' to 3' direction.
  3. Termination:
     - A terminator sequence is reached.
     - RNA polymerase and the new RNA transcript dissociate from the DNA.

Bacterial Promoters and Initiation

  • Promoter Characteristics: Promoters are upstream of the transcriptional start site. The start site is labeled as the +1 position.
  • Numbering System:
    - Bases are numbered relative to the start site (+1).
    - There is no "zero" base; the base preceding +1 is -1.
  • Key Promoter Sequences in Bacteria:
    - -35 Sequence: Typically TTGACA.
    - -10 Sequence: Typically TATAAT (also known as the Pribnow box).
    - Consensus Sequence: The most common sequence found at these positions. Sequences that match the consensus result in high transcription levels; deviations lead to lower levels.
    - Examples (Promoter: -35 sequence / -10 sequence / spacing between them):
      - lac operon: TTTACA / TATGTT / N17
      - trp operon: TTGACA / TTAACT / N17
      - recA: TTGATA / TATAAT / N16
  • RNA Polymerase Holoenzyme (E. coli):
    - Core Enzyme: Composed of five subunits (α₂ββ'ω).
    - Sigma Factor (σ): A single subunit required for initiation. It recognizes the -35 and -10 sequences using a helix-turn-helix structure for tight binding.
  • Initiation Process Details:
    - Holoenzyme binds loosely, then scans for the promoter.
    - Closed Complex: Formed when RNA polymerase binds the promoter.
    - Open Complex: Formed when the TATAAT box in the -10 sequence is unwound (facilitated by easier separation of A-T bonds).
    - Synthesis of a short RNA strand occurs, after which the sigma factor is released, signaling the start of elongation.
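
As a rough illustration of the consensus idea, the Python sketch below counts how many positions of each example promoter match the -35 and -10 consensus sequences, and shows the numbering rule that skips zero. The names `consensus_matches`, `promoter_score`, and `position_label` are hypothetical, and a simple match count is only a crude stand-in for promoter strength, which also depends on spacing and other factors.

```python
CONSENSUS_35 = "TTGACA"  # -35 consensus from the notes
CONSENSUS_10 = "TATAAT"  # -10 consensus (Pribnow box)

# Example promoters from the notes: (-35 sequence, -10 sequence).
PROMOTERS = {
    "lac operon": ("TTTACA", "TATGTT"),
    "trp operon": ("TTGACA", "TTAACT"),
    "recA": ("TTGATA", "TATAAT"),
}

def consensus_matches(sequence, consensus):
    """Count positions where a promoter element agrees with the consensus."""
    if len(sequence) != len(consensus):
        raise ValueError("sequence and consensus must have the same length")
    return sum(a == b for a, b in zip(sequence, consensus))

def promoter_score(name):
    """Total matches (out of 12) across the -35 and -10 elements."""
    minus35, minus10 = PROMOTERS[name]
    return (consensus_matches(minus35, CONSENSUS_35)
            + consensus_matches(minus10, CONSENSUS_10))

def position_label(offset_from_start):
    """Convert a 0-based offset from the start site to the +1/-1 numbering.

    Offset 0 is the start site (+1); offset -1 is the base before it (-1).
    There is no position 0.
    """
    return offset_from_start + 1 if offset_from_start >= 0 else offset_from_start

for name in PROMOTERS:
    print(name, promoter_score(name))
# lac operon 9
# trp operon 9
# recA 11
```

Of the three examples, recA matches the consensus most closely, since its -10 element is identical to TATAAT.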

Elongation and Termination in Bacteria

  • Elongation Phase:
    - Synthesis rate is approximately 43 nucleotides per second.
    - The open complex is roughly 17 bases long.
    - RNA polymerase moves 3' to 5' along the template strand.
    - DNA rewinds into a double helix behind the moving open complex.
  • Termination Mechanisms:
    1. Rho-dependent (ρ-dependent) Termination: Requires the ρ (rho) protein, which binds to a sequence in the RNA called the rut site (rho utilization site). A stem-loop structure forms, causing RNA polymerase to pause, allowing the rho protein to catch up and separate the RNA-DNA hybrid.
    2. Rho-independent Termination: Does not require rho. It is facilitated by a uracil-rich sequence at the 3' end of the RNA and an upstream stem-loop structure that causes termination.
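
The two features of a rho-independent terminator, a stem-loop followed by a uracil-rich stretch, can be searched for with a simple Python sketch. Everything here is an illustrative assumption: the function name, the stem length of 5, the loop of 3 to 8 bases, and the rule of at least 4 U's in the next 6 bases. Real terminators vary and allow imperfect pairing, which this sketch does not handle.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Sequence that would base-pair with `rna`, written 5'->3'."""
    return "".join(RNA_PAIR[base] for base in reversed(rna))

def find_intrinsic_terminator(rna, stem=5, min_loop=3, max_loop=8,
                              window=6, min_u=4):
    """Return the index where a stem-loop + U-rich tail begins, or -1.

    Looks for a perfect inverted repeat (two arms of length `stem`
    separated by a loop) that is immediately followed by a window
    containing at least `min_u` uracils.
    """
    for i in range(len(rna) - 2 * stem - min_loop + 1):
        left_arm = rna[i:i + stem]
        wanted_right_arm = reverse_complement(left_arm)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # start of the candidate right arm
            if rna[j:j + stem] != wanted_right_arm:
                continue
            tail = rna[j + stem:j + stem + window]
            if tail.count("U") >= min_u:
                return i
    return -1

# Hypothetical transcript: leader + GCCGC arm + AUUA loop + GCGGC arm + U-run.
with_terminator = "CACCACGCCGCAUUAGCGGCUUUUUUA"
without_u_run = "CACCACGCCGCAUUAGCGGCAAAAAAA"

print(find_intrinsic_terminator(with_terminator))  # 6
print(find_intrinsic_terminator(without_u_run))    # -1
```

The second sequence has the same stem-loop but no uracil-rich tail, so it is not reported, matching the notes' point that both features are needed.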

Eukaryotic Transcription and RNA Polymerases

  • Complexity: Eukaryotic transcription is more complex due to multicellularity, specialized organelles, and the need for cell-specific gene regulation.
  • RNA Polymerases (Nuclear DNA):
    - RNA Pol I: Transcribes all rRNA genes (except 5S rRNA).
    - RNA Pol II: Transcribes all protein-encoding (structural) genes into mRNA and some snRNA genes used for splicing.
    - RNA Pol III: Transcribes all tRNA genes, the 5S rRNA gene, and some microRNA genes (most microRNA genes are transcribed by RNA Pol II).
  • Eukaryotic Promoters:
    - Core Promoter: Short, contains the TATA box and transcriptional start site. It produces basal transcription (low level).
    - Regulatory Elements: Enhancers (stimulate transcription) and Silencers (inhibit transcription). Often found in the -50 to -100 region.
    - Cis-acting elements: DNA sequences like the TATA box and enhancers that affect only the proximate gene.
    - Trans-acting factors: Regulatory proteins that bind to cis-acting elements.
  • Basal Transcription Proteins:
    - RNA Polymerase II.
    - Five General Transcription Factors (GTFs).
    - Mediator: A protein complex that interacts with GTFs and RNA Pol II.

Eukaryotic Termination and RNA Modification

  • RNA Pol II Termination Models:
    - Allosteric Model: After passing the polyadenylation signal, RNA Pol II becomes destabilized and dissociates.
    - Torpedo Model: An exonuclease binds the 5' end of the trailing RNA and degrades it until it reaches the polymerase, forcing termination.
  • Colinearity: In bacteria, DNA, mRNA, and polypeptide sequences are colinear. In eukaryotes, they are not always colinear due to introns.
    - Exons: Coding sequences.
    - Introns: Intervening sequences (non-coding).
  • RNA Splicing Mechanisms:
    1. Group I: Self-splicing; involves a free guanosine binding within the intron. Found in simple eukaryotes (Tetrahymena) and some organellar DNA.
    2. Group II: Self-splicing; uses the 2'-OH group of an internal adenosine to initiate catalysis. Found in organellar DNA and rarely in bacteria.
    3. Spliceosome (Pre-mRNA): Requires snRNPs (small nuclear ribonucleoproteins). It is common in eukaryotic nuclear protein-encoding genes.
  • Specific RNA Modifications:
    - Processing: Cleavage of large precursor RNAs (rRNA/tRNA) into smaller functional pieces.
    - 5' Capping: Attachment of a 7-methylguanosine (m7G) cap. Necessary for nuclear exit, stability, and ribosome binding.
    - 3' PolyA Tailing: Enzymatic addition of a string of adenine nucleotides to the 3' end. Important for stability and translation.
    - RNA Editing: Changing the base sequence post-transcription (e.g., deamination, additions/deletions). First seen in trypanosomes.
    - Base Modification: Covalent changes (e.g., methylation) common in tRNAs.
  • Alternative Splicing: A mechanism where a single pre-mRNA can be spliced in different ways to produce multiple distinct polypeptides (e.g., α-tropomyosin in smooth vs. striated muscle). Regulated by splicing factors (repressors and enhancers).
  • Spliceosome Action: Subunits (snRNPs such as U1, U2, U4, U5, and U6) recognize boundaries, hold RNA in configuration, and use a metalloribozyme (Mg2+) to catalyze intron removal and exon linkage via a lariat formation.
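
The net result of splicing and alternative splicing can be sketched in Python by treating a pre-mRNA as a string and exons as coordinate ranges. This shows only the outcome (introns removed, exons joined), not the spliceosome chemistry. The function name `splice`, the sequences, and the exon coordinates are all hypothetical.

```python
def splice(pre_mrna, exons):
    """Join the exon segments of a pre-mRNA, dropping everything else.

    `exons` is a list of (start, end) pairs using 0-based, end-exclusive
    coordinates, listed in 5'->3' order.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)

# Hypothetical pre-mRNA built from three exons and two introns.
exon1, intron1 = "AUGGCU", "GUAAGUUUUCAG"
exon2, intron2 = "GAUCCA", "GUACGUUUACAG"
exon3 = "UUUUAA"
pre_mrna = exon1 + intron1 + exon2 + intron2 + exon3

# Coordinates of the three exons within pre_mrna.
all_exons = [(0, 6), (18, 24), (36, 42)]

# Constitutive splicing: every exon is kept.
print(splice(pre_mrna, all_exons))                     # AUGGCUGAUCCAUUUUAA

# Alternative splicing: exon 2 is skipped, giving a different mRNA.
print(splice(pre_mrna, [all_exons[0], all_exons[2]]))  # AUGGCUUUUUAA
```

The same pre-mRNA yields two different mature sequences depending on which exons are retained, which is the basis for one gene producing multiple polypeptides.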