22 Transcriptional Regulation in Eukaryotes

Introduction to Transcriptional Regulation
  • Context and Complexity: This lecture, given on Friday, 09/12/2025, is a rerecording for improved sound quality and complements a previous in-person lecture. It focuses on transcriptional regulation in eukaryotes, which is significantly more complex than in bacteria (as previously learned).

  • Connection to Chromatin Structure: Transcriptional regulation is strongly linked to chromatin structure, a topic covered in the preceding lecture on DNA packing and unpacking for efficient storage and control of gene expression.

  • Learning Objectives: The lecture provides detailed learning outcomes to guide study for examinations, quizzes, and deeper understanding.

Levels of Gene Expression
  • Variability Across Genes and Cell Types: Genes are expressed at vastly different levels, not only within a single cell type but also when comparing different cell types (e.g., brain vs. liver cells).

    • In Cell Type 1:

      • Gene A: Highly transcribed, producing many mRNA and protein molecules.

      • Gene B: Transcribed at a low level, resulting in a small amount of mRNA and low protein levels.

      • Gene C: Not transcribed at all.

    • In Cell Type 2:

      • Gene A: Completely downregulated and not transcribed, indicating its protein product is not required by this cell type at this time.

      • Gene B: Expressed at the same low level as in Cell Type 1.

      • Gene C: Strongly expressed, contrasting its non-expression in Cell Type 1.

  • Regulatory Elements: Gene expression requires specific, short DNA sequences called cis-acting elements (making up promoters) to which trans-acting factors (transcription factors, RNA polymerase) bind. Different genes require different subsets of transcription factors.

Types of Genes
  • Housekeeping Genes (Constitutive Genes):

    • Encode proteins essential for all cells (e.g., DNA polymerase, RNA polymerase, metabolic enzymes for ATP/metabolites).

    • Continuously transcribed (constitutively expressed).

    • Example: Gene B, expressed in both Cell Type 1 and Cell Type 2.

    • Expression levels can vary (low or high), depending on protein requirements.

  • Facultative Genes:

    • Can be induced (turned on) or repressed (turned off) in response to developmental, environmental, or other changes.

    • Example: Gene C (not transcribed in Cell Type 1, strongly expressed in Cell Type 2).

    • Example: Gene A (highly expressed in Cell Type 1, completely repressed and not transcribed in Cell Type 2).
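
As a minimal sketch of the distinction above, the Gene A/B/C example can be written as a small lookup table. The expression labels ("high", "low", "off") and the function name classify_gene are illustrative assumptions, not part of the lecture. The rule used here is a simplification: a gene that is on in every listed cell type is treated as housekeeping, and anything else as facultative.

```python
# Expression states from the Gene A/B/C example; the labels are illustrative.
EXPRESSION = {
    "Gene A": {"Cell Type 1": "high", "Cell Type 2": "off"},
    "Gene B": {"Cell Type 1": "low", "Cell Type 2": "low"},
    "Gene C": {"Cell Type 1": "off", "Cell Type 2": "high"},
}


def classify_gene(levels):
    """Return 'housekeeping' if the gene is on in every cell type, else 'facultative'."""
    if all(level != "off" for level in levels.values()):
        return "housekeeping"
    return "facultative"


for gene, levels in EXPRESSION.items():
    print(gene, classify_gene(levels))
```

With only two cell types the classification is necessarily rough; a real survey would compare many tissues and conditions.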

Prokaryotic vs. Eukaryotic Gene Structure and Regulation
  • Prokaryotic Gene Features:

    • Simplicity: Relatively simple gene structure without introns.

    • Regulatory Sequences: Simple cis-acting promoter sequences (e.g., around the -35 and -10 positions relative to the transcriptional start site at +1), operator sequences, and terminator sequences.

    • RNA Polymerase: Only one type of RNA polymerase transcribes all gene types (protein-coding, rRNA, tRNA).

    • Operons: Genes are often organized in operons, clusters of protein-encoding sequences transcribed as a single polycistronic mRNA molecule. This allows coordinated expression of proteins involved in the same metabolic pathway (e.g., the lac operon, the tryptophan synthesis operon). Ribosomes bind and translate each protein sequentially from the single mRNA.

  • Eukaryotic RNA Polymerases: Eukaryotes have three distinct RNA polymerases, each of which transcribes different gene types and requires different promoter structures and transcription factors.

    • RNA Polymerase I: Transcribes ribosomal RNA genes, namely the 5.8S, 18S, and 28S rRNA genes (the speaker mistakenly said 20S rRNA at one point, but confirmed that RNA polymerase I transcribes these three rRNA genes).

    • RNA Polymerase II (Pol II): The focus of this lecture; transcribes all protein-coding genes (messenger RNAs) and some smaller RNA genes.

    • RNA Polymerase III: Transcribes the 5S rRNA gene, tRNA genes, and other small RNA genes.

  • Eukaryotic Gene Structure Features:

    • Monocistronic Genes: Each gene encodes a single protein. Proteins in the same pathway therefore require more sophisticated coordinated regulation, achieved by common transcription factors binding to the promoters of their respective genes.

    • Exons and Introns: Eukaryotic genes contain exons (coding information, shown in red) interrupted by introns (non-coding regions). Introns generally must be removed through RNA processing (splicing), a key difference from prokaryotic genes which lack introns.
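
To make the exon/intron point concrete, here is a toy sketch of splicing as string slicing. The sequence, the exon coordinates, and the function name splice are all made up for illustration. In the cell, the splicing machinery recognizes sequence signals at the intron boundaries rather than being handed coordinates.

```python
def splice(pre_mrna, exons):
    """Join exon segments, given as (start, end) half-open index pairs, in order."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy transcript: uppercase = exon, lowercase = intron (made-up sequence).
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUUGGA" + "guaagcccacag" + "UAA"
exons = [(0, 6), (18, 24), (36, 39)]

mature = splice(pre_mrna, exons)
print(mature)  # AUGGCCUUUGGAUAA
```

A prokaryotic gene would correspond to a single exon covering the whole coding sequence, so the function would return the input unchanged.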

Eukaryotic Gene Promoter Structure
  • Complexity: Eukaryotic protein-encoding genes have complex promoter structures with multiple regulatory elements (shown as rectangles, each representing a DNA sequence bound by transcription factors).

  • Composite Core Promoter:

    • Includes motifs like the TATA sequence and an initiator sequence.

    • Located directly in front of or partially overlapping with the transcriptional start site (+1).

    • Forms the basic "core" where transcription initiation begins.

  • Proximal Promoter Elements:

    • Located further upstream (in the negative direction) of the core promoter.

    • DNA sequence motifs bound by transcription factors that modulate gene expression, for example, dictating cell-type specificity, or responsiveness to environmental/developmental signals.

  • Enhancer and Silencer Elements:

    • Crucial for the strength of expression.

    • Enhancers increase expression when bound by activating transcription factors.

    • Silencers decrease expression when bound by repressing proteins.

    • Location Variability: Can be located 10,000 to 50,000 base pairs upstream or downstream of the transcription start site, or even within coding sequences, making them challenging to identify without specific experiments.

  • Downstream Promoter Elements: Found downstream of the transcription start site, within the transcribed region of a protein-coding gene.

  • Insulator Sequences:

    • Prevent regulatory elements of one gene from influencing the expression of adjacent genes.

    • In the context of chromatin, they prevent the spread of chromatin structure changes (e.g., histone acetylation/modification) from one gene region across the chromosome.
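
The core promoter described above places the TATA sequence just upstream of +1. A minimal sketch of how such a motif could be located in a sequence follows. The pattern TATA(A/T)A(A/T) is a commonly quoted consensus, the promoter fragment is made up, and the function name find_tata_boxes is hypothetical. Real promoters vary, and many have no TATA box at all.

```python
import re

# Loose TATA-box pattern: TATA, then A or T, then A, then A or T.
TATA_PATTERN = re.compile(r"TATA[AT]A[AT]")


def find_tata_boxes(sequence, tss_index):
    """Return (position, motif) pairs, with position relative to the TSS (+1).

    sequence  -- DNA string read 5'->3' on the coding strand
    tss_index -- 0-based index in `sequence` of the +1 nucleotide
    """
    hits = []
    for match in TATA_PATTERN.finditer(sequence.upper()):
        offset = match.start() - tss_index
        # There is no position 0: the base just before +1 is -1.
        position = offset + 1 if offset >= 0 else offset
        hits.append((position, match.group()))
    return hits


# Made-up fragment: 10 filler bases, TATAAAA, 23 filler bases, then +1.
promoter = "GC" * 5 + "TATAAAA" + "GC" * 11 + "G" + "ATGGCC"
print(find_tata_boxes(promoter, tss_index=40))  # [(-30, 'TATAAAA')]
```

Enhancers and silencers cannot be found this way: as noted above, they can sit tens of thousands of base pairs away and usually need experiments such as reporter assays to identify.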

Demonstrating Regulatory Element Function: The Eve Gene Example
  • Experimental Approach: To show the functionality of short DNA sequences as regulatory elements, scientists use reporter gene assays.

  • Eve Gene in Drosophila melanogaster:

    • The Eve gene (Even-skipped) in fruit flies is expressed in distinct "segments" or stripes within the embryo.

    • A specific regulatory element (e.g., segment number two) from the Eve promoter is isolated.

    • Genetic Construct: This regulatory element is inserted upstream of a reporter gene (e.g., the bacterial LacZ gene, encoding beta-galactosidase) under the control of a minimal promoter.

    • Reporter Gene Activity: The beta-galactosidase enzyme converts a colorless substrate into a blue product. The presence of blue color indicates LacZ expression.

    • Results: The segment number two regulatory element specifically activates LacZ transcription, causing blue color development only in segment number two of the fruit fly embryo, demonstrating its specific role in spatially restricted gene expression. This method helps map specific regulatory elements to specific expression patterns.

Eukaryotic Transcription Factors
  • General Transcription Factors (GTFs):

    • A set of proteins crucial for positioning RNA polymerase II at the transcription start site.

    • They assemble to form the pre-initiation complex (PIC) along with RNA polymerase II.

    • Initiation Steps:

      1. TFIID Binding: The TFIID complex, which contains the TATA-binding protein (TBP), recognizes and binds the TATA box (a conserved TATA sequence motif) in the core promoter. TFIID also interacts with downstream promoter elements.

      2. Sequential Assembly: Other GTFs, such as TFIIB, TFIIE, TFIIF, and TFIIH, then bind sequentially to the promoter.

      3. RNA Polymerase II Recruitment: TFIIF, in conjunction with some other factors, recruits RNA polymerase II to the core promoter.

      4. PIC Formation: Once all GTFs and RNA polymerase II are assembled, the PIC is formed.

      5. TFIIH Functions:

        • Uses ATP to unwind the DNA double helix at the transcription start site, creating a transcription bubble (single-stranded regions) for the RNA polymerase to read.

        • Phosphorylates the C-terminal tail of RNA polymerase II, activating the polymerase and allowing it to begin transcribing the gene.

  • Cell/Tissue-Specific or Regulatory Transcription Factors:

    • Bind to proximal promoter elements and distal enhancer/silencer elements.

    • Usually function as dimers (two protein units).

    • Possess a DNA binding domain to recognize specific DNA sequences.

    • Possess an activating or repressing domain to influence gene expression levels.

    • DNA Interaction: The DNA binding domain often features alpha helices (e.g., helix 3, shown with amino acids such as serine, arginine, and asparagine). These amino acids form non-covalent interactions (e.g., hydrogen bonds) with the nucleotide bases within the major groove of the DNA double helix, where the bases are more accessible. These interactions are reversible.

Regulation of Transcription Strength
  • Integration of Signals: The strength and frequency of transcription initiation by RNA polymerase are determined by integrating information from:

    • Transcription factors bound to proximal promoter elements.

    • Transcription factors bound to distal enhancer/silencer elements (having positive or negative impacts).

  • The Mediator Complex:

    • A large multi-protein complex that mediates the influence of various transcription factors on transcription.

    • Acts as a bridge, interacting with both promoter-bound transcription factors and the RNA polymerase II/GTF complex.

    • Transcription factor binding causes conformational changes in the mediator, which are transferred to the RNA polymerase, influencing the stability of the PIC and the frequency of transcription initiation.

    • DNA Looping: Distal regulatory elements, located far from the transcription start site (10,000 to 50,000 bp away), can physically interact with the mediator complex and the PIC when the intervening DNA loops or bends.

  • Chromatin Remodeling: The mediator complex and some transcription factors also recruit chromatin remodeling complexes and histone modifying enzymes.

    • These complexes and enzymes alter chromatin structure:

      • Histone acetyltransferases activate transcription by loosening histone-DNA interactions.

      • Chromatin remodeling complexes push histones apart, creating accessible DNA stretches for transcription factors (including TBP) and RNA polymerase. This links transcriptional regulation back to chromatin structure.

Combinatorial Control of Gene Expression
  • Diversity through Combination: Eukaryotic cells use a combinatorial control mechanism where different combinations of transcription factors yield diverse gene expression outcomes.

    • Homodimers vs. Heterodimers: Transcription factors often work as dimers.

      • Homodimers consist of two identical protein units binding to identical DNA elements (e.g., Gene 1: low expression; Gene 2: medium expression).

      • Heterodimers consist of two different transcription factor proteins interacting (e.g., TF-C with TF-D).

    • Complex Interactions: A heterodimer involving an activating domain (from TF-C) and an inhibitory domain (from TF-D) can result in no expression due to counteracting effects. Other heterodimer combinations can lead to high or other levels of expression.

  • Cell Differentiation Example:

    • A precursor cell divides, and daughter cells express different combinations of regulatory proteins (transcription factors).

    • Example: Three different regulatory proteins (TF1, TF2, TF3), each either present or absent, can generate 2^3 = 8 different combinations and therefore up to 8 different cell types (e.g., liver, brain, heart, blood cells).

    • Efficiency: The human genome encodes an estimated 1,000 transcription factors that regulate about 20,000 genes. Combinatorial control greatly reduces the number of transcription factors needed, allowing specific control over gene expression and diverse cell functions without an unmanageably large set of genes for regulatory proteins.
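
The 2^3 = 8 count in the differentiation example can be checked by enumerating every present/absent combination of the three factors. This is only a counting sketch. The names TF1 to TF3 come from the example above, while the variable names are illustrative.

```python
from itertools import product

FACTORS = ("TF1", "TF2", "TF3")

# Each factor is either present (True) or absent (False) in a given cell.
combinations = [
    frozenset(tf for tf, present in zip(FACTORS, state) if present)
    for state in product((False, True), repeat=len(FACTORS))
]

for combo in combinations:
    print(sorted(combo) or ["none"])

print(len(combinations))  # 8
```

Adding a fourth factor doubles the count to 16, which is why a modest number of transcription factors can specify many distinct expression states.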

Regulation of Transcription Factor Activity
  • Activation/Inhibition: The activity of transcription factors themselves can be activated or inhibited, often by small molecules.

  • Example: Cortisol Receptor:

    • Transcription factors for genes 1, 2, and 3 might lead to low-level expression on their own.

    • An inactive cortisol receptor normally cannot bind to DNA.

    • Upon binding of the small molecule cortisol, the receptor undergoes a structural change, becoming active and able to bind to specific DNA sequences near the transcription factors.

    • This activated cortisol receptor then interacts with the existing transcription factors, increasing their activity and leading to high levels of gene expression.

    • Conversely, small molecule binding can also decrease gene expression.

DNA Methylation: An Epigenetic Modification
  • Definition of Epigenetic Modification: Reversible, inheritable changes that do not alter the underlying DNA sequence (unlike genetic mutations). Histone modifications are another example.

  • Mechanism:

    • The addition of a methyl group (CH3) to the carbon at position 5 of a cytosine base, forming 5-methylcytosine.

    • This is not a change in the DNA sequence; a 5-methylcytosine is still read as a cytosine during replication.

  • CpG Motifs: Methylation typically occurs on cytosines within CpG motifs (a C followed by a G, linked by a phosphate, written as 5'-CpG-3').

    • These motifs are symmetrical: because 5'-CpG-3' on one strand pairs with 5'-CpG-3' on the complementary strand, a methylated C on one strand is matched by a methylated C on the other strand within the same motif.

  • DNA Methyltransferases (DNMTs):

    • De novo methylation: Enzymes (DNMTs) that add methyl groups to previously unmethylated cytosines.

    • Maintenance methylation: After DNA replication, parent strands are methylated, but newly synthesized strands are not. Maintenance DNA methyltransferases recognize these hemimethylated sites and methylate the unmethylated cytosine on the new strand, preserving the methylation pattern across cell divisions. This mechanism makes DNA methylation inheritable.

  • Impact on Transcription: Methylation of cytosines affects transcription in two main ways:

    1. Inhibition of Transcription Factor Binding: The methyl group can physically interfere with a transcription factor's ability to bind to its target DNA sequence, thereby preventing transcription activation.

    2. Chromatin Condensation: Methylated cytosines create new binding surfaces for proteins that recruit chromatin modification proteins. These lead to the formation of condensed chromatin, which is transcriptionally inactive, thus silencing genes in methylated regions.

      • Conversely, less methylation often correlates with open chromatin, increased DNA accessibility, and active gene transcription.
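
The maintenance mechanism described above can be sketched as a toy model: find the CpG sites, replicate the duplex so that each daughter is hemimethylated, then restore the marks. The sequence, the data layout (sets of CpG site indices per strand), and the function names are assumptions made for illustration, not a description of how DNMTs are implemented biochemically.

```python
def cpg_sites(top_strand):
    """Return 0-based indices of each C that is followed by G (5'-CpG-3')."""
    return [i for i in range(len(top_strand) - 1) if top_strand[i:i + 2] == "CG"]


def replicate(duplex):
    """Semi-conservative replication of a methylated duplex.

    Marks are stored per CpG site (index of the top-strand C). Each daughter
    keeps one old, methylated strand and gains a new, unmethylated strand.
    """
    daughter_1 = {"top": set(duplex["top"]), "bottom": set()}
    daughter_2 = {"top": set(), "bottom": set(duplex["bottom"])}
    return daughter_1, daughter_2


def maintain(duplex):
    """Maintenance methylation: copy marks across hemimethylated CpG sites."""
    marked = duplex["top"] | duplex["bottom"]
    return {"top": set(marked), "bottom": set(marked)}


sequence = "ATCGGACGTTCGA"  # made-up sequence with CpG sites at 2, 6, 10
sites = cpg_sites(sequence)

# Sites 2 and 10 are methylated on both strands; site 6 is unmethylated.
parent = {"top": {2, 10}, "bottom": {2, 10}}

d1, d2 = replicate(parent)
print("after replication:", d1)            # marks on the old strand only
print("after maintenance:", maintain(d1))  # parental pattern restored
```

Note that the unmethylated site stays unmethylated: maintenance only copies existing marks, whereas de novo methylation would be needed to add a mark at site 6.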

Practical Implications of Epigenetics
  • Identical Twins Study:

    • Young Twins (3 years old): Show very similar methylation patterns across chromosomes (e.g., chromosome 11) due to identical genomes and shared early environmental exposures.

    • Older Twins (50 years old): Exhibit significantly more hypo- and hypermethylation across their chromosomes compared to younger twins, indicating that methylation patterns change throughout life.

    • Differences: The methylation patterns of older identical twins differ distinctly from each other. This is attributed to divergent lifestyles and environmental exposures (e.g., city vs. countryside, smoking, diet, stress, chemical exposure), highlighting the profound impact of the environment on the epigenetic landscape. These differences may help explain why one twin develops cancer while the other does not.

  • Developmental Origins of Health and Disease (DOHaD):

    • A major epidemiological and experimental research field demonstrating that in utero (embryonic/fetal) and early childhood exposures significantly impact an individual's epigenetic methylation patterns.

    • These early-life exposures (e.g., maternal alcohol consumption, environmental toxins, maternal/paternal nutrition, diet) can lead to an increased risk of adult-onset diseases.

    • These disease risks are often attributed to epigenetic mechanisms rather than simple genetic mutations.

    • While a large percentage (approximately 95%) of epigenetic changes are erased in germ cells (sperm and egg) to reset the epigenome for the next generation, a small but significant percentage of epigenetic marks (approximately 5% or more) can be carried over, influencing fetal development, child health, and long-term disease susceptibility.

  • DNA Methylation in Cancer:

    • Tumor Suppressor Genes: In normal tissue, promoter regions of genes (especially tumor suppressor genes) typically have low levels of methylation, allowing for their expression.

    • Cancer Cells: In cancer cells, promoter regions of tumor suppressor genes often become hypermethylated.

    • Silencing: This hypermethylation leads to the silencing or decreased expression of these tumor suppressor genes.

    • Consequence: Loss of tumor suppressor function removes a natural defense mechanism against tumor development, increasing the likelihood of cancer progression.

    • Biomarker Potential: The distinct methylation patterns (e.g., hypermethylation of specific tumor suppressor gene promoters) in cancer cells occur very early in tumor formation. These patterns can be detected using relatively simple experiments and serve as potential biomarkers for early cancer detection, allowing for preventative medicine or targeted therapies.