Reverse Transcription and Integration - Vocabulary Flashcards
Introduction
- Topic: Reverse Transcription and Integration in retroviruses; essential steps for replication and targets for antiretroviral therapy.
- Nobel Prize context: Temin and Baltimore independently discovered RNA-dependent DNA synthesis (reverse transcriptase); the 1975 Nobel Prize in Physiology or Medicine was awarded jointly to Baltimore, Dulbecco, and Temin for discoveries concerning the interaction between tumour viruses and the genetic material of the cell.
- Proviral DNA and reservoirs: Integrated proviral DNA can be transcriptionally silent for extended periods, creating viral reservoirs that can reactivate, underpinning persistence and challenges to cure (e.g., HIV-1).
- Two key retroviral activities highlighted: reverse transcription (RT) and integration by integrase (IN).
Retroviridae and Genome Organization
- Family name origin: Retroviruses use the enzyme reverse transcriptase (RT) to convert RNA genome to DNA (RNA→DNA), reversing the central dogma direction.
- Retroviral genome basics:
- All retroviruses carry three essential genes: gag (structural proteins), pol (enzymes; includes RT, integrase IN, protease PR), and env (envelope).
- Simple retroviruses (e.g., avian leukosis virus, ALV) have only gag, pol, env.
- Complex retroviruses (e.g., HIV-1) include these three plus additional non-structural genes; these extra products can influence replication and pathogenesis.
- Gene expression:
- Viral mRNAs and proteins produced via host transcription/translation machinery.
- Gag products are expressed from unspliced transcripts as precursors and are proteolytically cleaved into mature proteins.
- Env is expressed from spliced transcripts and cleaved into surface (SU) and transmembrane (TM) components.
- Retrovirus structure:
- Enveloped viruses; envelope contains SU and TM proteins attached to a lipid bilayer.
- Inside the envelope: matrix (MA), nucleocapsid (NC), capsid (CA), enzymes RT, IN, PR, and the viral RNA genome.
- Each viral protein is labeled with a two-letter designation.
- Genome organization in particles:
- Each virion contains two copies of the single-stranded, positive-sense RNA genome, which associate as a 70S dimer via base pairing near their 5’ ends.
- Although two genome copies are packaged, reverse transcription typically yields only one proviral DNA per virion (pseudodiploidy).
- HIV-1 and complex retroviruses have additional gene content beyond basic gag-pol-env; reviews cover these extra genes and products.
Retrovirus Replication Overview
- Core replication steps:
- Binding/entry: virus fuses with host cell membrane via envelope–receptor interactions.
- Reverse transcription: RNA genome converted to double-stranded proviral DNA by RT within the preintegration complex.
- Nuclear import and integration: proviral DNA integrated into host genome by integrase (IN).
- Transcription/translation: host machinery expresses viral mRNAs and proteins.
- Genome packaging and assembly: progeny virions assemble at the cell membrane and bud.
- Key implication: Integration into host genome is central to persistence and pathogenesis; integrated proviral DNA can serve as a reservoir for reactivation.
- Short note on pathology: HIV-1 reservoirs contribute to immune evasion and chronic infection; eradication/cure remains challenging.
Retroviral Reverse Transcription (Section 2: Retrovirus Reverse Transcription)
- Reverse transcription purpose: RT converts the single-stranded RNA genome into double-stranded proviral DNA.
- Background: Viruses in Retroviridae use RT; retroviruses are named for this reverse transcription step.
- Basic RT components:
- Viral RNA genome (template).
- Primer: a host cell tRNA bound to a specific primer binding site (PBS) on the viral genome; the tRNA primes synthesis of the first DNA strand.
- RT enzyme: packaged at roughly 50–100 copies per virion; carries out RNA→DNA synthesis and has RNase H activity.
- Genome packaging and primer details:
- Each virion carries 2 copies of the RNA genome, which dimerize via 5’ end interactions to form a 70S complex; NC (nucleocapsid) proteins coat the RNA and promote template exchanges during RT.
- tRNA primers are partially denatured and bound to the PBS; specific tRNAs (e.g., tRNA^Trp in ASLV) are used as primers in a sequence-specific manner.
- RT activities within the preintegration complex:
- RNA-dependent DNA synthesis (primary activity).
- Helicase-like unwinding to relieve torsional stress during DNA synthesis.
- RNase H activity degrades RNA in RNA:DNA hybrids, enabling synthesis of the second DNA strand and template exchanges.
- Reverse transcription is multi-step and involves template switches between the two RNA genomes in the virion (pseudodiploidy) to allow complementation if one template is damaged.
- Structure of RT across retroviruses:
- RT enzymes vary by virus: avian sarcoma/leukosis virus (ASLV) RT is a heterodimer with polymerase and RNase H domains in both subunits and an integrase domain in the larger subunit; HIV-1 RT is a heterodimer (p66/p51) in which only the larger subunit carries the RNase H domain, and neither subunit has integrase activity; murine leukemia virus (MLV) RT is a monomer with an RNase H domain but no integrase activity.
- Kinetics and fidelity of RT:
- RT polymerization rate is slow relative to cellular DNA polymerases: about $1$ to $1.5\ \text{nt s}^{-1}$, leading to total synthesis times of roughly 4 hours for HIV-1 proviral DNA, given a genome length of $\gtrsim 9{,}000$ bases.
- Fidelity is low: no proofreading/editing; misincorporations occur at roughly 1 error per $10^{4}$ to $10^{6}$ nucleotides. For an HIV-1 genome (~$10^{4}$ bases), the upper end of this range (1 error per $10^{4}$ nucleotides) yields about one error per genome per replication cycle, driving rapid evolution and the emergence of quasispecies.
- This high mutation rate underpins drug resistance: mutations arise that reduce the effect of RT inhibitors while preserving enzymatic activity.
- Bottom-line features of RT in retroviruses:
- RT is the central enzyme for converting RNA to DNA; it has DNA polymerase, RNase H, and helicase-like activities.
- RT fidelity is low, generating genetic diversity that fuels adaptation and drug resistance.
- RT is a major target for antiretroviral therapy (ART).
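The kinetics and fidelity figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes both DNA strands are polymerized one after the other at a constant rate (ignoring pauses and template exchanges) and uses a Poisson approximation for the chance of an error-free copy. The function names are hypothetical and not taken from the source material.

```python
import math

GENOME_NT = 9_000  # approximate HIV-1 genome length from the notes


def rt_synthesis_time_hours(genome_nt, rate_nt_per_s, strands=2):
    """Hours needed to polymerize `strands` DNA strands of `genome_nt` nucleotides.

    Assumption: constant rate, strands made sequentially, no pausing.
    """
    return strands * genome_nt / rate_nt_per_s / 3600.0


def expected_errors(genome_nt, error_rate_per_nt):
    """Mean number of misincorporations in one copy of the genome."""
    return genome_nt * error_rate_per_nt


def prob_error_free(genome_nt, error_rate_per_nt):
    """Poisson approximation of the chance that a copy carries no error."""
    return math.exp(-expected_errors(genome_nt, error_rate_per_nt))


for rate in (1.0, 1.5):
    hours = rt_synthesis_time_hours(GENOME_NT, rate)
    print(f"rate {rate} nt/s: about {hours:.1f} h for both strands")

for error_rate in (1e-4, 1e-6):
    mean = expected_errors(GENOME_NT, error_rate)
    clean = prob_error_free(GENOME_NT, error_rate)
    print(f"error rate {error_rate:g}: {mean:.3f} errors per genome, "
          f"P(error-free) about {clean:.2f}")
```

Under these assumptions the two quoted rates bracket the roughly 4 hour figure, and only the high end of the error-rate range gives about one error per genome per cycle.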
The Reverse Transcription Process: Stepwise View (Steps 1–4)
- Step 1: Initiation of (−) strand DNA synthesis
- Primer: host tRNA binds to PBS at the 5’ end of the RNA genome; RT copies the 5’ end to generate a short DNA strand.
- RNase H degrades the RNA template portion that has been copied, producing the initial short DNA product; this is called the negative strong-stop DNA (≈100 nucleotides).
- Step 2: First Template Exchange
- The 3' end of the nascent (−) DNA binds to the 3' end of the viral RNA (RNA template) at a complementary R region.
- Copying continues toward the 5’ end of the genome; the RNA template is degraded as synthesis proceeds by RNase H.
- Step 3: Initiation of (+) strand DNA synthesis
- RNase H leaves behind a short RNA segment called the polypurine tract (PPT), about 13 to 15 bases, which serves as a primer for (+) strand DNA synthesis.
- The (+) strand DNA is extended to the end of the (−) strand DNA template and then copies the first ~18 nucleotides of the still-attached tRNA primer; elongation is blocked by a modified tRNA base, forming the positive strong-stop DNA.
- The PPT and tRNA primers are degraded by RNase H; (−) strand synthesis continues through the PBS region, producing a sequence complementary to the PBS copy at the 3' end of the positive strong-stop DNA.
- The second template exchange is set up when these complementary PBS sequences on the (−) strand DNA and the positive strong-stop DNA anneal.
- Step 4: Second Template Exchange and Completion
- The circular DNA construct formed by negative strand DNA and the positive strong-stop DNA is used as the starting point.
- Positive strand synthesis proceeds from the PBS, while the negative strand is completed by displacement from the strong-stop DNA.
- Result: a linear proviral DNA with identical long terminal repeats (LTRs) at both ends.
- LTRs: multi-functional elements; the 5’ LTR acts as promoter; the 3’ LTR provides the polyadenylation signal; LTR ends serve as binding sites for integrase during integration.
- Outcome: production of a provirus capable of integration into the host genome and subsequent transcription/translation of viral genes.
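The net effect of the two template exchanges on genome layout can be sketched as a rearrangement of labelled segments. This is a toy model, not a simulation of the enzymology: the U3 and U5 labels are standard names for the unique regions next to R and are not used in the notes above, the sequences are made-up placeholders written in the DNA alphabet, and all class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Segment = Tuple[str, str]  # (label, sequence)


@dataclass(frozen=True)
class GenomicRNA:
    """Toy retroviral RNA genome laid out as 5'-R-U5-PBS-body-PPT-U3-R-3'."""
    r: str
    u5: str
    pbs: str
    body: str  # stands in for gag, pol, env and any extra genes
    ppt: str
    u3: str

    def segments(self) -> List[Segment]:
        return [("R", self.r), ("U5", self.u5), ("PBS", self.pbs),
                ("body", self.body), ("PPT", self.ppt),
                ("U3", self.u3), ("R", self.r)]


def proviral_segments(rna: GenomicRNA) -> List[Segment]:
    """Layout of the linear proviral DNA: a full U3-R-U5 LTR at each end."""
    ltr = [("U3", rna.u3), ("R", rna.r), ("U5", rna.u5)]
    inner = [("PBS", rna.pbs), ("body", rna.body), ("PPT", rna.ppt)]
    return ltr + inner + ltr


def layout(segments: List[Segment]) -> str:
    return "-".join(label for label, _ in segments)


def sequence(segments: List[Segment]) -> str:
    return "".join(seq for _, seq in segments)


# Placeholder sequences, chosen only so the segments are distinguishable.
rna = GenomicRNA(r="GGTCT", u5="CTGGT", pbs="TGGCGCC",
                 body="ATGGGTGCG", ppt="AAAAGAAAAGGGGGG", u3="TGGAAGG")

print(layout(rna.segments()))
print(layout(proviral_segments(rna)))
```

The proviral layout is longer than the RNA genome by one U3 plus one U5, which is why the LTRs at the two ends are identical even though the RNA carries only R at both ends.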
Proviral DNA and Integration (Section 3: Retrovirus Integration)
- Purpose of integration: covalent insertion of proviral DNA into the host genome by integrase; essential for persistent replication and transcriptional control by host RNA polymerase II.
- Integrase (IN): encoded in the pol gene; packaged in virions with RT and PR; IN performs the covalent integration step.
- Preintegration complex (PIC): includes proviral DNA, RT, integrase, capsid proteins, nucleocapsid proteins; responsible for transporting proviral DNA into the nucleus for integration.
- Equimolar packaging: in productive replication, virions contain equimolar amounts of RT, IN, and PR.
- Early clues about integration: end-terminal dinucleotide sequences of integrated proviruses (5’ TG and 3’ CA ends) point to specific processing by integrase and a conserved mechanism across retroviruses; homologous relationships to bacterial transposons and transposases supported an integrase-like activity.
- In vitro integration assays: cell-free systems with purified integrase, cofactors (e.g., Mg^{2+}), short target DNA, and proviral DNA surrogates; used to study structure–function relationships and sequence requirements; panels A and B examine individual steps; panel C examines concerted integration (both ends).
- The integration reaction is sequential and virus-specific; the preintegration complex must reach the nucleus for integration to occur.
The Integration Process: Stepwise Reactions (Steps 1–3)
- Step 1: Processing reaction
- Integrase removes a dinucleotide from the 3' end of each strand of the linear proviral DNA, leaving recessed 3' ends and 2-nucleotide 5' overhangs; only intact proviral DNA can be processed and integrated; this step requires both ends.
- The processing depends on terminal LTR sequences, particularly the conserved 5'-CA-3' dinucleotide at the ends; short base-pair preferences exist within the first ~20 base pairs of LTR termini; sequence conservation varies by retrovirus (e.g., HIV-1 vs MLV vs ASLV).
- Step 2: Joining reaction
- Integrase makes staggered cuts in the host DNA at sites separated by about 4 to 6 base pairs; the processed 3' ends of the proviral DNA are joined to the host DNA, while the 5' ends of the proviral DNA remain unjoined (a gapped intermediate).
- Integrase has then completed its catalytic role and dissociates from the DNA.
- Step 3: Repair
- Cellular DNA repair machinery fills in the single-stranded gaps and ligates the remaining ends, restoring a continuous double-stranded proviral-host DNA junction.
- This repair is analogous to nonhomologous end-joining processes that follow double-strand breaks.
- Proviral DNA organization after integration: proviral ends flanked by host DNA, with LTRs at both ends; integrated proviral DNA now relies on host transcriptional machinery to produce viral RNAs and proteins.
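The processing, joining, and repair steps above can be followed at the sequence level with a small string model. It is a sketch under stated assumptions: DNA is represented by its top strand only, the net result of processing plus repair is modelled as the loss of 2 base pairs from each viral end, and filling the gaps left by the staggered cut is modelled as a short duplication of host sequence on either side of the provirus (a known consequence of staggered joining that the notes do not state explicitly). The sequences and function names are hypothetical, and the default stagger of 5 is an arbitrary choice within the 4 to 6 base-pair range.

```python
def process_ends(linear_viral_dna: str) -> str:
    """Top-strand view of the provirus after processing and repair.

    Two base pairs are lost from each end, exposing the conserved
    5'-TG ... CA-3' termini.
    """
    if len(linear_viral_dna) < 8:
        raise ValueError("viral DNA too short for this toy model")
    provirus = linear_viral_dna[2:-2]
    if not (provirus.startswith("TG") and provirus.endswith("CA")):
        raise ValueError("expected TG...CA termini after processing")
    return provirus


def integrate(host: str, linear_viral_dna: str, pos: int, stagger: int = 5) -> str:
    """Insert the processed provirus into `host` at a staggered cut.

    The cut sites are `stagger` base pairs apart, starting at `pos`.
    After gap repair the bases host[pos:pos + stagger] flank both sides.
    """
    if not 4 <= stagger <= 6:
        raise ValueError("stagger should be 4 to 6 base pairs")
    if not 0 <= pos <= len(host) - stagger:
        raise ValueError("cut site does not fit in the host sequence")
    provirus = process_ends(linear_viral_dna)
    return host[:pos + stagger] + provirus + host[pos:]


# Made-up sequences for illustration only.
host = "AAACCCGGGTTTAAACCC"
viral = "ACTGGAAGGGCTACAGT"

integrated = integrate(host, viral, pos=6, stagger=5)
print(integrated)
```

In the printed result the five host bases at the cut site (`GGGTT`) appear immediately before and immediately after the provirus, and the provirus itself begins with TG and ends with CA.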
Integration Site Preferences and Host Interactions
- Integration site determinants:
- Integration site selection is only loosely dependent on primary host DNA sequence; there is some preference for weak palindromic motifs at the target site, and the preferred motif varies by retrovirus.
- Chromatin structure and DNA topology influence site selection: bends, underwinding, and nucleosome positioning influence where integration occurs.
- Gene structure preferences:
- HIV-1 tends to integrate within genes; MLV tends to integrate near transcription start sites.
- Cellular factors can modulate integration preferences; for example, LEDGF/p75 (LEDGF) contributes to integration targeting toward active genes; loss of LEDGF reduces bias toward active genes.
- Integrase structure and function:
- IN is ~300 amino acids long and comprises three domains:
- N-terminal domain (NTD): ~first 50 aa with zinc-chelating His and Cys residues; supports DNA binding.
- Catalytic core domain (CCD): ~central region (~150 aa) containing the invariant catalytic motif D, D(35)E essential for catalysis.
- C-terminal domain (CTD): last ~80–100 aa; important for DNA binding and multimerization.
- Multimeric forms:
- Dimers can complete processing in vitro but cannot perform all steps of integration alone.
- Tetramers (intasomes) are required for concerted integration (both viral ends into target DNA).
- Inner IN molecules are catalytically active in the tetramer; outer subunits’ roles are less clearly defined.
- Cellular proteins participating in integration (~100 identified):
- Partners include barrier-to-autointegration factor (BAF), LEDGF/p75 (LEDGF), BRD family members, and other proteins involved in transcription, chromatin remodeling, and DNA repair.
- BAF helps prevent autointegration (integration of the viral DNA ends into the viral DNA itself), acting as a pro-replication factor.
- In vitro vs in vivo considerations:
- In vitro assays provide mechanistic insights; in vivo integration is influenced by chromatin context, host factors, and nuclear trafficking of the preintegration complex.
HIV-1 Reservoirs and Pathogenesis Implications
- Proviral reservoirs: integrated proviral genomes can persist in resting or latent cellular states; reactivation can occur under certain cellular conditions, leading to renewed viral replication even after apparent control.
- Integration targeting and gene disruption: insertion into or near host genes can affect gene expression and contribute to pathogenesis or clonal expansion of infected cells.
- Therapeutic implications: targeting integrase (IN) activity is a core component of antiretroviral therapy; understanding integration site preferences informs strategies to eradicate reservoirs and prevent persistence.
RT and Integration: Additional Context and Variants
- Non-retroviral reverse transcription (retroid viruses):
- Hepadnaviruses (e.g., hepatitis B virus) use reverse transcription to synthesize DNA from an RNA intermediate; these are non-retroviruses but share RT-based replication elements.
- Endogenous proviruses in host genomes represent replication-defective vestiges of past retroviral infections.
- Practical takeaway: RT and IN are defining enzymes of retroviruses (RT is also used by retroid viruses such as hepadnaviruses) and a central focus for therapy; understanding them also informs broader questions about genome evolution and host interactions.
Summary of Key Points
- Retroviruses rely on two essential enzymatic activities: reverse transcription (RT) and integration (IN).
- RT converts the RNA genome into double-stranded proviral DNA; the process is multi-step and involves template switching, RNase H activity, PPT, tRNA priming, and LTR formation.
- RT fidelity is low, driving high mutation rates and the emergence of viral quasispecies and drug resistance.
- Proviral DNA produced by RT is integrated into the host genome by integrase, forming an enduring provirus flanked by LTRs and leading to viral gene expression via host transcription machinery.
- Integration is a multi-step, enzyme-mediated process (processing, joining, repair) and is influenced by virus-specific end sequences, chromatin context, and host factors (e.g., LEDGF, BAF).
- The proviral reservoir and integration targeting heavily influence HIV pathogenesis, persistence, and challenges to eradication.
- Genome length for HIV-1: $L \approx 9{,}000$ bases
- RT polymerization rate: $r \approx 1$ to $1.5$ nucleotides per second
- Time to complete reverse transcription (HIV-1): $t \approx 4$ hours
- Misincorporation rate: roughly $10^{-4}$ to $10^{-6}$ errors per nucleotide
- Proviral DNA ends after processing: 5' overhangs (requires intact proviral DNA)
- PBS (primer binding site): conserved region bound by the tRNA primer
- PPT (polypurine tract) primer length: ≈ 13 to 15 nt
- tRNA primer region copied into the positive strong-stop DNA: ≈ 18 nt, up to a modified tRNA base
- End dinucleotides of the integrated provirus: 5'-TG and CA-3'
- Host DNA cleavage during joining: staggered cuts 4 to 6 base pairs apart
- Integrase size: ≈ 300 amino acids
- RT copies per virion: ≈ 50 to 100
- Viral genome copies per virion: 2
- Dimerization: the two genome copies are packaged in the virion as a 70S RNA dimer
- Integrase multimerization: tetramer (intasome) required for concerted integration
References and Further Reading
- Primary sources: Nobel Prize 1975 for Temin and Baltimore work on tumor viruses and genetic material interaction.
- Textbook reference: Principles of Virology, Volume I, Chapter 10 (5th Edition) and related reviews for HIV-1 and retroviral integration.
- Online references cited in the source material: Nobel Prize summary (1975); RT and mutation references on general RT biology.