Genetics of Microorganisms - Module 2: Transposition and Chromosome Structure
Transposition
- Involves the integration of small DNA segments into new genomic locations.
- Can occur at many different locations within the genome.
- Transposable elements (TEs) are small, mobile DNA segments, sometimes called "jumping genes."
- First identified by Barbara McClintock in corn during the early 1950s.
- Found in diverse species: bacteria, fungi, plants, and animals.
Barbara McClintock
- Awarded the Nobel Prize in Physiology or Medicine in 1983 for her discovery of mobile genetic elements.
Different Transposition Pathways
- Transposable elements (TEs) move via different transposition pathways.
- Two general types:
  - Simple transposition
  - Retrotransposition
Simple Transposition
- Widely used by transposons in bacterial and eukaryotic species.
- TE is removed from its original site and transferred to a new target site.
- Mechanism: cut-and-paste.
Retrotransposition
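- Transposable elements (retrotransposons) move via an RNA intermediate.
- The element is transcribed into RNA, the RNA is reverse-transcribed into DNA, and the DNA copy is inserted at a new site.
- Mechanism: copy-and-paste; the original element stays in place.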
- Found only in eukaryotic species.
Insertion Elements and Simple Transposons
- All TEs are flanked by direct repeats (DRs): identical base sequences oriented in the same direction.
- The simplest TE is an insertion element; inverted repeats form its two ends, inside the flanking direct repeats.
- Inverted repeats: identical (or very similar) DNA sequences running in opposite directions, typically 9 to 40 bp long.
- May contain a gene for transposase, which catalyzes transposition.
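The difference between direct and inverted repeats is easy to mix up, so a minimal Python sketch may help. All sequences here are made up for illustration (they are not real IS-element sequences), and the helper names are hypothetical. An inverted repeat is read as the reverse complement of its partner; a direct repeat is the same sequence in the same orientation.

```python
# Toy insertion element; every sequence below is invented for illustration.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def has_inverted_repeats(element: str, repeat_len: int) -> bool:
    """True if the last repeat_len bases are the reverse complement of the first."""
    return element[-repeat_len:] == reverse_complement(element[:repeat_len])


def has_direct_repeats(region: str, repeat_len: int) -> bool:
    """True if the same sequence, in the same orientation, sits at both ends."""
    return region[:repeat_len] == region[-repeat_len:]


left_ir = "GGTCATGCA"                    # hypothetical 9-bp inverted repeat
transposase_gene = "ATGAAACCCGGGTTTTAA"  # placeholder, not a real gene
element = left_ir + transposase_gene + reverse_complement(left_ir)

direct_repeat = "TACGT"                  # hypothetical flanking direct repeat
region = direct_repeat + element + direct_repeat

print(has_inverted_repeats(element, len(left_ir)))      # True
print(has_direct_repeats(region, len(direct_repeat)))   # True
```

Note that the inverted repeats belong to the element itself, while the direct repeats lie just outside it in the host DNA.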
LTR Retrotransposons
- Evolutionarily related to retroviruses.
- Retain the ability to move around the genome but generally do not produce mature viral particles.
- Contain long terminal repeats (LTRs) at both ends (typically a few hundred base pairs in length).
- Encode virally related proteins, such as reverse transcriptase and integrase, needed for retrotransposition.
Non-LTR Retrotransposons
- Lack LTR sequences, so they do not resemble retroviruses.
- May contain a gene encoding a protein that functions as both a reverse transcriptase and an endonuclease.
- Some are evolutionarily derived from normal eukaryotic genes.
- Example: Alu family of repetitive sequences in humans, derived from the 7SL RNA gene. Approximately 1 million copies exist.
Transposase
- Catalyzes the excision and insertion of TEs.
- Recognizes inverted repeats at the ends of a TE, bringing them close together.
- Transposase cleaves the target DNA at staggered sites.
- The transposable element is inserted into the target site.
- DNA gap repair synthesis results in direct repeats.
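Why staggered cuts plus gap repair produce direct repeats can be sketched in a few lines of Python. This is a simplified model under stated assumptions: only the top strand is tracked, the sequences are invented, and the function name and parameters (`cut`, `stagger`) are hypothetical. The bases between the two staggered cuts are left single-stranded on each side of the inserted element; filling both gaps copies them, so they appear twice.

```python
def insert_with_staggered_cut(target: str, element: str, cut: int, stagger: int) -> str:
    """Insert `element` into `target` (top strand, 5' to 3').

    The top strand is cut at `cut` and the bottom strand `stagger` bases
    further along. The bases target[cut:cut + stagger] end up on both sides
    of the element once gap repair synthesis fills the single-stranded gaps.
    """
    if stagger < 0 or not 0 <= cut <= len(target) - stagger:
        raise ValueError("staggered cut falls outside the target")
    overhang = target[cut:cut + stagger]
    left_flank = target[:cut] + overhang             # after gap filling
    right_flank = overhang + target[cut + stagger:]  # after gap filling
    return left_flank + element + right_flank


target = "AAGGCTACGTTTCC"   # made-up target DNA
element = "ggtcaatgacc"     # made-up TE, lowercase so it stands out
result = insert_with_staggered_cut(target, element, cut=5, stagger=5)
print(result)  # AAGGCTACGTggtcaatgaccTACGTTTCC
```

The 5-bp sequence TACGT occurs once in the original target and twice after insertion, once on each side of the element, in the same orientation.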
Simple Transposition and Copy Number
- Transposition can increase copy number.
- Occurs after the replication fork has passed through the TE, creating two copies.
- One TE can transpose ahead of the fork, where it is copied again.
- One chromosome retains one TE, while the other gains an additional copy.
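The bookkeeping in the steps above can be traced with a small Python sketch. It is a toy model, assuming one TE, one replication fork, and hypothetical site labels; each chromatid is just a set of sites that carry a TE.

```python
def replicate_with_transposition(old_site: str, new_site: str):
    """Follow one TE that transposes from replicated DNA to a site ahead of the fork."""
    # The fork has passed old_site, so both sister chromatids carry the TE.
    chromatid_a = {old_site}
    chromatid_b = {old_site}

    # Cut-and-paste: the TE leaves chromatid_b and lands in unreplicated DNA.
    chromatid_b.remove(old_site)
    unreplicated = {new_site}

    # The fork then copies the unreplicated region into both chromatids.
    chromatid_a |= unreplicated
    chromatid_b |= unreplicated
    return chromatid_a, chromatid_b


a, b = replicate_with_transposition("old", "new")
print(len(a), len(b))  # 2 1
```

Without transposition each chromatid would end with one TE; here one chromatid ends with two, so the total rises from two copies to three.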
Retrotransposons and Reverse Transcriptase
- Retrotransposons use an RNA intermediate.
- LTR retrotransposon movement requires reverse transcriptase (synthesizes double-stranded DNA from RNA template) and integrase (recognizes LTRs, cuts target site, and inserts TE).
Target-Site Primed Reverse Transcription
- Non-LTR retrotransposons move via target-site primed reverse transcription.
- Retrotransposon transcribed into RNA with a 3′ polyA tail.
- Target DNA recognized by endonuclease.
- PolyA tail binds to nicked site.
- Reverse transcriptase uses the 3′ end of the nicked target DNA as a primer to make a DNA copy of the RNA.
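The copying step can be illustrated with a short Python sketch. It only models base pairing during reverse transcription, assuming a made-up RNA sequence and a hypothetical helper name; it does not model the endonuclease or the target site.

```python
RNA_TO_DNA = str.maketrans("ACGU", "TGCA")  # RNA base -> complementary DNA base


def reverse_transcribe(rna: str) -> str:
    """Return the first-strand DNA copy of an RNA, written 5' to 3'."""
    return rna.translate(RNA_TO_DNA)[::-1]


rna = "GGCAUCG" + "A" * 6   # invented transcript with a short polyA tail
cdna = reverse_transcribe(rna)
print(cdna)  # TTTTTTCGATGCC
```

Because synthesis starts at the 3′ end of the RNA, the new DNA strand begins with T's complementary to the polyA tail, which is where the nicked target DNA primes the reaction.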
Transposable Elements: Mutation and Evolution
- Occur in the genomes of all species.
- Abundance varies across species (approximate percentage of the genome composed of TEs):
  - Frog (Xenopus laevis): 77%
  - Corn (Zea mays): 60%
  - Human (Homo sapiens): 45%
  - Mouse (Mus musculus): 40%
  - Fruit fly (Drosophila melanogaster): 20%
  - Nematode (Caenorhabditis elegans): 12%
  - Yeast (Saccharomyces cerevisiae): 4%
  - Bacterium (Escherichia coli): 0.3%
Repetitive Sequences in Eukaryotic Genomes
- Some repetitive sequences are due to TE proliferation.
- In mammals:
  - LINEs (long interspersed elements): 1,000 to 10,000 bp long, 20,000 to 1,000,000 copies per genome, ~17% of the human genome.
  - SINEs (short interspersed elements): < 500 bp long. Example: Alu sequence (~1,000,000 copies in the human genome, ~10% of the genome).
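As a sanity check on the Alu figure, the genome fraction can be estimated as copies × length ÷ genome size. The sketch below assumes a typical Alu length of about 300 bp and a haploid human genome of about 3.2 billion bp; neither value is stated in these notes, so treat both as assumptions.

```python
HUMAN_GENOME_BP = 3.2e9   # assumption: approximate haploid human genome size
ALU_LENGTH_BP = 300       # assumption: typical Alu length (notes say only "< 500 bp")
ALU_COPIES = 1_000_000    # from the notes


def genome_fraction(copies: float, length_bp: float, genome_bp: float) -> float:
    """Fraction of the genome occupied by a repeat family."""
    return copies * length_bp / genome_bp


alu_fraction = genome_fraction(ALU_COPIES, ALU_LENGTH_BP, HUMAN_GENOME_BP)
print(f"Alu fraction of genome: {alu_fraction:.3f}")
```

Under these assumptions the result is a little over 9%, in line with the ~10% quoted above.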
Biological Significance of Transposons
- Evolutionary role debated.
- Selfish DNA theory: TEs exist because they can proliferate without significantly harming the host.
- Alternative view: TEs offer the host some advantage (e.g., bacterial TEs carrying antibiotic-resistance genes).
- TEs may cause greater genetic variability through recombination.
- TEs may cause exon shuffling (insertion of exons into coding sequences), leading to genes with more diverse functions.
Transposons and Rapid Genomic Change
- Transposable elements can rapidly enter and proliferate in a genome.
- Example: P element in Drosophila melanogaster, introduced around the 1950s, has since spread worldwide.
- Transposable elements have a variety of effects on chromosome structure and gene expression.
Consequences of Transposition
- Chromosome structure:
  - Chromosome breakage: Excision of a TE.
  - Chromosomal rearrangements: Homologous recombination between TEs at different locations.
- Gene expression:
  - Mutation: Incorrect excision of TEs.
  - Gene inactivation: Insertion of a TE into a gene.
  - Alteration in gene regulation: Transposition of a gene next to regulatory sequences or vice versa.
  - Alteration in exon content: Insertion of exons into the coding sequence via TEs (exon shuffling).
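How recombination between two TE copies rearranges a chromosome can be sketched with a toy model. The chromosome is a list of (name, orientation) segments; the representation and the function name are hypothetical. The sketch relies on the standard outcome for recombination between repeats on the same chromosome: copies in the same orientation delete the DNA between them, and copies in opposite orientations invert it.

```python
def recombine(chromosome, i, j):
    """Recombine the TE copies at indices i < j of a list of (name, orientation)."""
    (name_i, ori_i), (name_j, ori_j) = chromosome[i], chromosome[j]
    if i >= j or name_i != "TE" or name_j != "TE":
        raise ValueError("i and j must point at two TE copies with i < j")
    if ori_i == ori_j:
        # Same orientation: the intervening DNA is lost and one TE copy remains.
        return chromosome[:i + 1] + chromosome[j + 1:]
    # Opposite orientation: the intervening DNA is flipped in place.
    flipped = [(name, -ori) for name, ori in reversed(chromosome[i + 1:j])]
    return chromosome[:i + 1] + flipped + chromosome[j:]


same = [("A", 1), ("TE", 1), ("B", 1), ("C", 1), ("TE", 1), ("D", 1)]
opposite = [("A", 1), ("TE", 1), ("B", 1), ("C", 1), ("TE", -1), ("D", 1)]

print(recombine(same, 1, 4))      # deletion of B and C
print(recombine(opposite, 1, 4))  # inversion of B and C
```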
Regulation of Transposition
- Many outcomes are harmful, so transposition is highly regulated.
- Occurs in only a few individuals and only under certain conditions.
- Agents like radiation, chemical mutagens, and hormones can stimulate TE movement.
Structure of Eukaryotic Chromosomes in Nondividing Cells
- Linear DNA must be tightly compacted to fit within the nucleus (2 to 4 µm in diameter), despite being over 1 meter long if stretched end to end.
- Compaction involves interactions between DNA and proteins.
- DNA-protein complex is called chromatin.
- Proteins bound to DNA are subject to change, affecting chromatin compaction.
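The scale of the packing problem can be checked with a quick calculation. The sketch assumes a rise of 0.34 nm per base pair and about 6.4 billion bp in a diploid human cell; both numbers are assumptions brought in for illustration and do not come from these notes.

```python
BP_RISE_NM = 0.34           # assumption: length of one base pair along the helix
DIPLOID_HUMAN_BP = 6.4e9    # assumption: approximate DNA content of a diploid human cell
NUCLEUS_DIAMETER_UM = 4.0   # upper end of the 2 to 4 µm range in the notes

dna_length_m = DIPLOID_HUMAN_BP * BP_RISE_NM * 1e-9          # nm -> m
fold_longer = dna_length_m / (NUCLEUS_DIAMETER_UM * 1e-6)    # µm -> m

print(f"Stretched DNA length: {dna_length_m:.2f} m")
print(f"Length relative to nucleus diameter: {fold_longer:,.0f}x")
```

Under these assumptions the DNA is roughly 2 m long, several hundred thousand times the diameter of the nucleus that contains it.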
Nucleosomes
- The repeating structural unit within eukaryotic chromatin.
- Composed of a double-stranded DNA segment (146 bp) wrapped 1.65 times around a histone octamer.
- Histone octamer: two copies each of four different histone proteins.
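A rough count of nucleosomes per cell follows from the repeat length. The sketch assumes an average linker of 50 bp between nucleosomes and about 6.4 billion bp per diploid human cell; both are illustrative assumptions, since linker length varies and neither value appears in these notes.

```python
CORE_DNA_BP = 146       # from the notes: DNA wrapped around one octamer
HISTONES_PER_OCTAMER = 8


def estimate_nucleosomes(genome_bp: int, linker_bp: int) -> int:
    """Number of whole nucleosome repeats (core DNA + linker) that fit in genome_bp."""
    return genome_bp // (CORE_DNA_BP + linker_bp)


GENOME_BP = 6_400_000_000   # assumption: diploid human cell
LINKER_BP = 50              # assumption: average linker length

nucleosomes = estimate_nucleosomes(GENOME_BP, LINKER_BP)
core_histones = nucleosomes * HISTONES_PER_OCTAMER

print(f"Nucleosomes: about {nucleosomes / 1e6:.0f} million")
print(f"Core histone molecules: about {core_histones / 1e6:.0f} million")
```

The estimate comes out in the tens of millions of nucleosomes per cell, which gives a sense of how much histone protein a cell must make each time it replicates its DNA.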
Histones
- Histone proteins are basic (contain many positively charged amino acids like lysine and arginine).
- Bind to negatively charged phosphates along the DNA backbone.
- Have a globular domain and a flexible, charged amino terminus (tail).
- Five types of histones:
  - H2A, H2B, H3, and H4 (core histones): Two of each form the octamer.
  - H1 (linker histone): Binds to DNA in the linker region, helps organize adjacent nucleosomes.
Nucleosome Organization
- H1 histone not bound: beads on a string.
- H1 histone bound to linker region: nucleosomes more compact.
Nucleosomes and Chromatin Structure
- 30-nm fiber model: depicted long-range interactions of nucleosomes forming a fiber; this model is no longer accepted.
- Nucleosomes associate with each other to form a more compact structure.
- Histone H1 plays a role in this compaction.
- Zigzag model: linker DNA is relatively straight, and nucleosomes form a zigzag arrangement (occurs over short distances).
Further Compaction
- Chromatin can be further compacted into loop domains.
- Loop extrusion model:
  - SMC proteins (structural maintenance of chromosomes): Form a dimer that wraps around two DNA segments to form a loop.
  - CCCTC-binding factors (CTCFs): Stabilize loops by binding to DNA and then to each other.
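A one-dimensional toy model can make loop extrusion more concrete. It assumes, as a simplification, that an SMC complex loads at one position and enlarges the loop in both directions until it reaches the nearest CTCF site on each side. The coordinates are invented and the function name is hypothetical.

```python
from bisect import bisect_left, bisect_right


def extrude_loop(ctcf_sites, load_site):
    """Return (left_anchor, right_anchor) of the loop, or None if one side is unbounded."""
    sites = sorted(ctcf_sites)
    left_idx = bisect_left(sites, load_site) - 1   # nearest site strictly to the left
    right_idx = bisect_right(sites, load_site)     # nearest site strictly to the right
    if left_idx < 0 or right_idx >= len(sites):
        return None
    return sites[left_idx], sites[right_idx]


ctcf_sites = [100, 450, 900]           # made-up CTCF positions, in kb
print(extrude_loop(ctcf_sites, 300))   # (100, 450)
print(extrude_loop(ctcf_sites, 50))    # None
```

In this model the loop size depends only on where the CTCF sites lie, not on where the SMC complex happens to load between them.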
Topologically Associating Domains (TADs)
- Chromatin is organized into TADs (approximately 100 kb to 1 Mb in length).
- DNA segments within a TAD are more likely to interact with each other.
- TAD boundaries are determined by SMC proteins and CTCFs.
- Boundaries promote interactions within a TAD and prevent interactions between different TADs (act as insulators).
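The insulator idea can be expressed as a simple rule: two DNA segments are likely to interact only if no TAD boundary lies between them. The sketch below encodes that rule as an all-or-nothing test, which is a deliberate simplification. The boundary positions are invented and the helper names are hypothetical.

```python
from bisect import bisect_right


def tad_index(boundaries, position):
    """Index of the TAD containing `position`; `boundaries` must be sorted."""
    return bisect_right(boundaries, position)


def same_tad(boundaries, pos_a, pos_b):
    """True if no boundary separates the two positions."""
    return tad_index(boundaries, pos_a) == tad_index(boundaries, pos_b)


boundaries = [500, 1200, 2000]              # made-up boundary positions, in kb
print(same_tad(boundaries, 600, 1100))      # True
print(same_tad(boundaries, 400, 600))       # False
```

Here an enhancer at 600 kb could contact a promoter at 1,100 kb, but not one at 400 kb, even though the second promoter is closer along the chromosome.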
Heterochromatin and Euchromatin
- Heterochromatin: Tightly compacted regions of chromosomes, transcriptionally inactive.
- Euchromatin: Less condensed regions of chromosomes, transcriptionally active.
Types of Heterochromatin
- Constitutive heterochromatin: Always heterochromatic, permanently inactive, contains highly repetitive sequences.
- Facultative heterochromatin: Can interconvert between euchromatin and heterochromatin.
Chromosome Territories
- Each chromosome in the cell nucleus is found in a discrete chromosome territory.
Four-Level Hierarchy of Chromosome Organization
- Level 1: Chromosomes occupy distinct territories; interchromosomal interactions play a role in chromosomal arrangements.
- Level 2: Active genes associate with other active genes in euchromatin; repressed genes cluster together in heterochromatin.
- Level 3: Chromosomes are organized into structural domains known as TADs (100 kb to 1 Mb).
- Level 4: Chromosomal organization is largely determined by the affinity of nucleosomes for one another; short tri- or tetranucleosomes are arranged in a zigzag manner.
Chromosome Structure During Cell Division
- During M phase, the level of compaction changes dramatically.
- By the end of prophase, sister chromatids are entirely condensed.
- Metaphase chromosomes undergo little gene transcription.
- Chromatin within metaphase chromosomes is organized along a scaffold.
- Scaffold contains SMC proteins.
Condensin I and II
- Two multiprotein complexes that help form and organize metaphase chromosomes: Condensin I and Condensin II.
- Play a critical role in chromosome condensation during M phase of mitosis and meiosis.
- Both contain SMC proteins (structural maintenance of chromosomes).
- SMC proteins use energy from ATP to catalyze loop formation.
Condensin Function
- During early prophase, condensin II enters the nucleus and plays a role in condensation.
- Condensin I binds to chromatin only after the nuclear envelope breaks apart.
- After interphase, both condensin I and II facilitate the reorganization of chromosomes into radial loop arrays.
- Condensin II forms a spiral scaffold and creates large loops attached to the scaffold in radial loop arrays.
- Condensin I promotes the formation of smaller loops within the larger loops formed by Condensin II.
Cohesin
- A multiprotein complex that also contains SMC proteins.
- Found along the entire length of each chromatid until the middle of prophase.
- Promotes binding between sister chromatids during mitosis and meiosis.
- After the middle of prophase, cohesin is found only at the centromere.
- Cohesin at the centromere is degraded at the start of anaphase.