Transcription Regulation Notes

Why is Transcriptional Regulation Needed?

  • Allows for the development of different tissues.
  • Facilitates the transition from childhood to adulthood.
  • Enables reaction to environmental cues.
  • Deregulation can lead to uncontrolled growth, such as in cancers.
  • Example: transition from fetal to adult hemoglobins, involving changes in expressed protein subunits and coordinated functions post-birth.

What Determines When/How Genes Are Transcribed?

  • Chromatin structure.
  • RNA polymerase and general transcription factor binding specificity.
  • Additional binding and activation factors.

Heterochromatin

  • Expressed genes are found in an ‘open’ conformation (euchromatin).
  • Genes within highly packed heterochromatin are usually not expressed.
  • Histone code involves silencing and activation.
  • Specific constitutive heterochromatin structures: origins of replication, telomeres, and centromeres.
  • Facultative heterochromatin - cell-type-specific, can switch into euchromatin following developmental cues.

Heterochromatin and the Histone Code

  • H3K27me3: Inactivation of HOX genes and X chromosome inactivation.
  • H3K9me: Associated with organogenesis.
  • H3K27me3 binds ‘polycomb’ proteins, which remodel chromatin.
  • H3K4me2: Centromeric.
  • Constitutive heterochromatin: regions consistently silenced in all cell types, such as centromeres, telomeres, some transposons, and gene-poor regions; governed by histone methyltransferases.

Coding Back to Euchromatin

  • H3K4me and H3K9ac: Gene expression.
  • H3S10p and H3K14ac: Gene expression.
  • Actively expressed regions (green) are central in the eukaryotic nucleus.
  • Heterochromatic regions (red) are close to the nuclear membrane and associated with lamins.
  • DNA can be moved to the nuclear lamina/membrane for transcriptional inactivation.
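
The marks listed in the two sections above can be collected into a small lookup table. The following is a minimal Python sketch. The names `HISTONE_MARKS` and `summarize_marks`, the state labels, and the simple majority rule are illustrative assumptions, not a standard annotation scheme; real chromatin states depend on many more marks and on their genomic context.

```python
# Marks copied from the notes; the state labels are illustrative simplifications.
HISTONE_MARKS = {
    "H3K27me3": "repressive",  # HOX silencing, X inactivation, polycomb binding
    "H3K9me": "repressive",    # listed under heterochromatin in the notes
    "H3K4me2": "centromeric",  # open structure that permits kinetochore attachment
    "H3K4me": "active",
    "H3K9ac": "active",
    "H3S10p": "active",
    "H3K14ac": "active",
}


def summarize_marks(marks):
    """Return 'active', 'repressive' or 'mixed' for a list of histone marks.

    Unknown marks and marks labelled 'centromeric' are ignored.
    Ties (including no informative marks at all) return 'mixed'.
    """
    states = [HISTONE_MARKS.get(mark) for mark in marks]
    n_active = states.count("active")
    n_repressive = states.count("repressive")
    if n_active > n_repressive:
        return "active"
    if n_repressive > n_active:
        return "repressive"
    return "mixed"


print(summarize_marks(["H3K4me", "H3K9ac"]))
print(summarize_marks(["H3K27me3"]))
```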

Three Functional Heterochromatic Elements

  • Telomeres
  • Replication origins
  • Centromeres

Human Centromere Organization

  • Normal H3 carrying H3K4me2
  • Centromere-specific H3 (CENP-A)
  • Pericentric and centric regions
  • Microtubules and cohesin linking sister chromatids
  • Kinetochore
  • Centric heterochromatin: long, highly repetitive chromatin structures.
  • H3K4me2 allows an open structure, permitting kinetochore attachment.

Telomeres

  • Telomeres are repetitive structures that vary in length and DNA sequence.
  • Examples of telomere repeats:
    • GGGTTG in the ciliate Tetrahymena
    • GGGTTA in vertebrates
    • G(1-3)A in Saccharomyces cerevisiae.
  • Vertebrate sequences are repeated over several kb; yeast telomeres are several hundred bp.
  • Telomeres shorten after each cell division due to the 5’ to 3’ synthesis of DNA and the erasure of RNA primers.
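
The arithmetic of progressive shortening can be sketched in a few lines of Python. The starting length of 10,000 bp and the loss of 100 bp per division are assumed, illustrative values (the notes only say that vertebrate repeats span several kb), and the function names are hypothetical.

```python
def telomere_length_after(divisions, start_bp=10_000, loss_per_division_bp=100):
    """Telomere length in bp after a number of divisions, never below 0."""
    return max(0, start_bp - divisions * loss_per_division_bp)


def divisions_until(threshold_bp, start_bp=10_000, loss_per_division_bp=100):
    """Number of divisions needed for the length to fall to the threshold or below."""
    if loss_per_division_bp <= 0:
        raise ValueError("loss_per_division_bp must be positive")
    divisions = 0
    length = start_bp
    while length > threshold_bp:
        length -= loss_per_division_bp
        divisions += 1
    return divisions


print(telomere_length_after(10))
print(divisions_until(5_000))
```

With these assumed numbers, a cell without telomerase would reach a 5 kb telomere after 50 divisions; changing either parameter changes that count proportionally.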

Telomerase

  • Telomerase binds the single-stranded G-overhang, and is a ribonucleoprotein (RNP) enzyme.
  • Composed of telomerase RNA (TER) and telomerase reverse transcriptase protein (TERT).
  • Telomerase extends the 3’ end of the parental strand using its own RNA subunit as a template.
  • Tom Cech discovered TERT.
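
RNA-templated extension of the G-overhang can be illustrated with a short Python sketch. The six-base template `UAACCC` is a simplified, hypothetical template chosen so that it encodes the vertebrate repeat GGGTTA quoted above; real telomerase RNA templates are longer and include an alignment region. The function names are illustrative.

```python
# Base-pairing rules for copying an RNA template into DNA.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna_template):
    """DNA (written 5' to 3') copied from an RNA template written 5' to 3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_template))


def extend_overhang(g_strand, rna_template="UAACCC", cycles=2):
    """Append one templated repeat to the 3' end of the G-strand per cycle."""
    repeat = reverse_transcribe(rna_template)
    return g_strand + repeat * cycles


print(reverse_transcribe("UAACCC"))
print(extend_overhang("GGGTTA", cycles=2))
```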

Compensatory Mechanism for Telomere Shortening

  • Telomerase extends the G-overhang by RNA-templated DNA synthesis.
  • DNA primase lays down an RNA primer on the extended G-overhang.
  • DNA polymerase extends this primer 5’ to 3’ by DNA-templated DNA synthesis.
  • DNA ligase joins the new Okazaki fragment to the 5’ end of the old lagging strand.
  • A free, unpaired 3’ end still remains, which would normally trigger repair mechanisms.

Shelterin Complex / Telosome

  • The 3’ end is sheltered from repair mechanisms by base pairing.
  • The shelterin complex includes TRF1 (telomeric repeat-binding factor 1), TRF2 (telomeric repeat-binding factor 2), RAP1 (repressor/activator protein 1), and others.
  • Shelterin stimulates t-loop formation.
  • The 3’ overhang invades the duplex telomeric DNA, displacing one strand as a D-loop, and becomes base paired.
  • Within this telosome, the 3’ end is sheltered from repair mechanisms.

Werner Syndrome

  • Werner syndrome arises when telomeres cannot be replicated efficiently.
  • Inheritance: autosomal recessive
  • Incidence: 1 in 1,000,000, but in Japan and Sicily, it is 1 in 30,000.
  • The WRN helicase protein is important for telomeric DNA replication.
  • Telomeres replicated by lagging strand synthesis are not efficiently replicated in WRN-deficient cells.
  • Overexpression of telomerase maintains telomere length in WRN-deficient cells.
  • The WRN helicase protein is also important for DNA repair, resolving recombination structures (Holliday junctions), and governing Okazaki fragment joining in eukaryotes.

How is Transcription Controlled?

  • What determines how/when/why genes are transcribed?
  • Chromatin structure
  • RNA polymerase (and general TF) binding specificity
  • Additional binding and activation factors

Eukaryotic RNA Polymerases

  • Relatively scarce, about 0.001% of total cell protein.
  • Highly complex, typically 12 subunits (compare with bacterial: five subunits).
  • Three different enzymes, functionally and biochemically distinct:
    • RNA polymerase I (Pol I): 5.8S, 18S, 28S rRNA genes
    • RNA polymerase II (Pol II, RNAP II): all protein-coding genes, snoRNA genes (small nucleolar), miRNA genes (micro), siRNA genes (small interfering), most snRNA genes (small nuclear)
    • RNA polymerase III (Pol III): tRNA genes, 5S rRNA genes, some snRNA genes, other small RNA genes
  • RNA polymerases in bacteria, archaea, and eukaryotes are closely related: the basic features of the enzyme were in place before the divergence of the three major branches of life.
  • Structural similarity: bacterial RNAP and eukaryotic RNA Pol II share a common core; the extra eukaryotic subunits are shown in grey in the figure.

Pol II Promoter

  • RNA Pol II promoter regions contain multiple cis-acting elements that bind proteins. Pol II promoters are highly variable, and any individual element may be absent.
  • ‘Core’ promoter regions (approximately -50 to +50 bp) are depleted of nucleosomes and are therefore accessible.
  • Elements include:
    • BREd (downstream B recognition element): binds TFIIB
    • TATA box: binds TATA-binding protein (TBP)
    • BREu (upstream B recognition element): binds TFIIB
    • Inr (initiator element): binds TFIID

TBP (part of TFIID)

  • TBP binds the TATA box, and the strength of binding regulates transcription.

TATA Box

  • The TATA box is a consensus sequence: individual TATA boxes have different affinities for TBP – and so some are more efficient at stimulating transcription than others.

G-less Cassette Transcription Assay

  • A radioactive RNA transcript of a defined size (typically ~400 nt) is produced.
  • It can be electrophoresed through polyacrylamide gels and quantified following autoradiography.
  • Principle: Because no GTP is supplied, the RNA is truncated at the point at which a ‘G’ should be inserted.
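
The principle of the assay can be sketched in Python: transcription proceeds along the coding strand until the first position that would require GTP. The function name and the toy sequences are hypothetical, and the model ignores read-through and any 3’ trimming used in real protocols.

```python
def transcribe_without_gtp(coding_strand):
    """RNA produced when ATP, CTP and UTP are supplied but GTP is not.

    coding_strand: DNA coding (non-template) strand, 5' to 3', starting at +1.
    Transcription stops at the first position that would require a G.
    """
    rna = []
    for base in coding_strand.upper():
        if base == "G":
            break  # no GTP available, so the transcript ends here
        rna.append("U" if base == "T" else base)
    return "".join(rna)


# A toy 400-nt G-less cassette followed by G-containing downstream sequence.
cassette = "CTA" * 133 + "C"
template = cassette + "GATTACAGG"

print(transcribe_without_gtp("ACTTCAGTT"))
print(len(transcribe_without_gtp(template)))
```

Because every transcript ends at the same position, products from different promoters run as a band of the same size, and band intensity reflects promoter strength.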

Different TATA Sequences Support Different Levels of Transcription

  • TATAAAA (AdML): Human TFIID binds strongly and supports high expression of the G-less cassette.
  • TATAAAG (Yeast His): Human TFIID binds less strongly to the yeast His TATA box - reduced expression.
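
One crude way to compare TATA boxes is to count mismatches against a strong reference sequence. The Python sketch below uses the AdML box from the list above as the reference. Treating the mismatch count as a proxy for TFIID affinity is an assumption made for illustration only; real affinities depend on which position is changed and on flanking sequence. The function name is hypothetical.

```python
ADML_TATA = "TATAAAA"  # strong reference box taken from the notes


def mismatches(tata_box, reference=ADML_TATA):
    """Count positions where a TATA box differs from the reference."""
    if len(tata_box) != len(reference):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(tata_box.upper(), reference))


print(mismatches("TATAAAA"))  # AdML
print(mismatches("TATAAAG"))  # yeast His
```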

TATA Boxes Regulate Expression

  • Even in viruses.
  • EBV productive cycle:
    • Immediate Early (IE) and Early (E) genes (n = 35)
    • Late (L) genes (n = 33)
  • L genes have a distinct TATT motif, but IE and E genes have a TATA motif, providing temporal control.
  • IE and E genes are differentially expressed, indicating other control sequences.

EBV and cis-acting elements

  • Typical L gene: TATTAA, minimal complexity
  • Typical E gene: TATAAA, intermediate complexity, TATA box with both proximal and distal positive cis-acting elements (green) that enhance transcription
  • Typical IE gene: TATAAA, high complexity, TATA box with proximal positive cis-acting elements (green) that enhance transcription and both proximal and distal negative cis-acting elements (red) that inhibit transcription

Additional Binding and Activation Factors

  • Activator proteins:
    • NF-κB
    • Interferon regulatory factor (IRF)
    • ATF-2/c-Jun

How is Transcription Controlled? (recap)

  • What determines how/when/why genes are transcribed?
  • Chromatin structure
  • RNA polymerase binding specificity
  • Additional binding factors
    • Activator/repressor proteins
    • Mediator proteins
    • Chromatin-modifying proteins

Additional Binding Factors – Activators (and Repressors)

  • Eukaryotic RNA polymerases cannot recognize and bind promoter DNA on their own: they require additional factors.
  • General transcription factors assemble at the promoter and form a complex with RNA Pol II.
  • Specific transcription factors regulate the rate of transcription: activators increase transcription, repressors decrease transcription.
  • These transcription factors bind the proximal promoter elements and the distal (enhancer) elements.
  • They have a modular design: a DNA-binding domain that binds specific DNA sequences, and an activating/repressing domain (protein interaction domain) that stimulates or inhibits transcription by interacting with mediator proteins, general transcription factors, or RNA Pol II.
  • DNA-binding domains:
    • Homeodomains
    • Zinc finger motifs
    • Leucine zippers

DNA-binding domains: homeodomains

  • Helix 3 binds in the major groove of DNA making specific interactions between amino acids and nucleotides.

DNA-binding domains: Zinc Finger Motifs

  • Zinc finger: a small structural motif with key Cys and His residues that coordinates a zinc ion (Zn2+), stabilizing the fold.
  • Often there is a cluster, arranged one after the other so that the α-helix of each binds the major groove of the DNA.
  • A strong and specific DNA-protein interaction is built up through a repeating basic structural unit.

DNA-binding domains: leucine zippers

  • Leucine zipper proteins bind DNA as dimers; as well as homodimers, they can form heterodimers, expanding the potential regulatory repertoire.
  • Three distinct DNA-binding specificities can be generated from two types of leucine zipper monomer, while six can be created from three types of monomer, and so on.
  • Heterodimeric binding is common (e.g., Jun/Fos heterodimers).
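
The counts quoted above (three dimers from two monomer types, six from three) follow from choosing unordered pairs with repetition, which gives n(n+1)/2 dimers for n monomer types. A minimal Python sketch, using placeholder monomer names A, B and C:

```python
from itertools import combinations_with_replacement


def possible_dimers(monomers):
    """All unordered pairs of monomers, including homodimers."""
    return list(combinations_with_replacement(monomers, 2))


for monomers in (["A", "B"], ["A", "B", "C"]):
    dimers = possible_dimers(monomers)
    n = len(monomers)
    # The enumeration should match the closed form n(n+1)/2.
    assert len(dimers) == n * (n + 1) // 2
    print(n, "monomer types:", dimers)
```

This is an upper bound: it assumes every pair can dimerize, whereas some real leucine zipper proteins do not form stable homodimers.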

Multimerisation and Combinatorial Control

  • In T-cells (example):
    • AP-1 complex (c-Jun/c-Fos)
    • Nuclear factor of activated T-cells (NFAT)
  • Combinatorial regulation is a powerful mechanism that enables tight control of gene expression, via integration of multiple signaling pathways that induce different transcription factors required for enhanceosome assembly.

What is the Enhanceosome?

  • For genes that require tight control, activators bind cooperatively along an enhancer sequence, forming an enhanceosome.
  • Each enhanceosome is unique to its specific enhancer.
  • It recruits coactivators and general transcription factors to the promoter region of the target gene to begin transcription.
  • It also recruits non-histone architectural transcription factors called high-mobility group (HMG) proteins, which regulate chromatin structure – they ensure that the target gene can be accessed by transcription factors.
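
Cooperative assembly can be modelled, very roughly, as an AND gate: the enhanceosome forms only when every required activator is present. In the Python sketch below, the required set reuses the three activators listed earlier in these notes; treating them as the components of a single enhanceosome, and treating assembly as strictly all-or-none, are simplifying assumptions. The names are hypothetical.

```python
# Activators named earlier in these notes, used here as one required set.
REQUIRED_ACTIVATORS = frozenset({"NF-kB", "IRF", "ATF-2/c-Jun"})


def enhanceosome_assembled(active_factors, required=REQUIRED_ACTIVATORS):
    """True only if every required activator is present (all-or-none model)."""
    return required <= set(active_factors)


print(enhanceosome_assembled(["NF-kB", "IRF", "ATF-2/c-Jun"]))
print(enhanceosome_assembled(["NF-kB", "IRF"]))
```

Because each activator is induced by a different signaling pathway, this kind of logic means the target gene responds only when all of the relevant signals coincide.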

Repressors

  • In this example, there are four cis-acting elements, each with its own specific binding protein; the orientation of the bound proteins and their ability to recruit co-activators and co-repressors increase the repertoire of responses.

Putting it all together

  • Expression of an RNA Pol II-transcribed gene depends on the integrated output of:
    • DNA methylation status
    • Chromatin structure and histone modification status
    • General transcription factors and RNA Pol II
    • Regulatory complexes bound both upstream and downstream of the gene
    • Mediator proteins

Philadelphia (Ph) Chromosome

  • Nearly all cases of chronic myeloid leukemia (CML) carry a Philadelphia rearrangement.
  • The rearrangement causes inappropriate expression of a fusion protein (BCR-ABL) that is permanently locked ON.

Burkitt’s Lymphoma

  • Inappropriate regulation of an oncogene (MYC) by a very strong immunoglobulin enhancer.