Lecture 11

HIV-1 Gene Expression and the Role of TAT

  • Host Cell Machinery Utilization: Once the proviral DNA is integrated into the host cell chromatin, it is treated similarly to a cellular gene. It is transcribed by host RNA polymerase II and processed using the host cell's splicing machinery.

  • LTR Regulatory Signals: Basic retroviruses (encoding GAG, POL, and ENV) use Long Terminal Repeats (LTRs) for transcription control.

    • 5' LTR: Contains an enhancer and promoter within the U3 region.

    • Transcription Start Site: Located at the beginning of the R (Repeat) sequence.

    • Poly A Tail Signal: Located at the end of the R sequence in the 3' LTR.

  • Complex Retrovirus Features: HIV-1 is a complex retrovirus that utilizes additional proteins to maximize gene expression efficiency. The U3 region of the 5' LTR contains numerous binding sites for mammalian transcription factors (especially those found in T cells) which recruit RNA Pol II.

  • TAT (Trans-activator of Transcription):

    • Encoding: TAT is encoded by two exons that are spliced together.

    • Mechanism of Action: Unlike typical mammalian transcription factors that bind to DNA, TAT binds to newly synthesized viral RNA.

    • TAR (Transactivation Response Element): The 5' terminus of the viral RNA folds into a stem-loop structure called TAR. TAT specifically binds to a "bulge" within this stem-loop.

    • Transcription Elongation: In the absence of TAT, RNA Pol II initiates transcription at the LTR but is poorly processive, frequently falling off the template and producing only short RNAs (containing the TAR loop). In the presence of TAT, the complex converts to a fully processive mode.

    • P-TEFb Complex (TAK): TAT recruits a complex of Cyclin T1 and CDK9, historically known as TAK (TAT-associated kinase), now called P-TEFb (Positive Transcription Elongation Factor b).

    • Hyperphosphorylation: CDK9 hyperphosphorylates the C-terminal domain (CTD) of RNA Pol II. While TFIIH initiates the process, P-TEFb-mediated hyperphosphorylation converts Pol II into a highly processive enzyme capable of transcribing the entire 9 kb genome.
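The effect of processivity on full-length transcript yield can be illustrated with a minimal sketch. It assumes, purely for illustration, that the polymerase has a fixed and independent chance of falling off at each nucleotide; the two dropout rates below are hypothetical values chosen to show the contrast, not measurements.

```python
def fraction_full_length(dropout_per_nt: float, length_nt: int = 9000) -> float:
    """Probability that one polymerase transcribes `length_nt` nucleotides
    without falling off, assuming an independent dropout chance per nucleotide."""
    return (1.0 - dropout_per_nt) ** length_nt


# Hypothetical per-nucleotide dropout rates (assumptions, not data).
without_tat = fraction_full_length(dropout_per_nt=1e-2)  # poorly processive Pol II
with_tat = fraction_full_length(dropout_per_nt=1e-5)     # hyperphosphorylated Pol II

print(f"without TAT: {without_tat:.2e}")
print(f"with TAT:    {with_tat:.2f}")
```

Under these assumed rates almost no polymerase reaches the end of the 9 kb genome without TAT, while most do with it. This matches the qualitative picture of short TAR-containing RNAs versus full-length transcripts.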

Viral RNA Classes and the REV Export Pathway

  • RNA Length Classes: Transcription produces three main classes of viral RNA:

    • Full-length RNA (9 kb class): Acts as the genome for new progeny and as mRNA for GAG and POL polyproteins.

    • Singly spliced RNA (4 kb class): Produced by a single splicing event (removing GAG and POL); encodes VIF, VPR, VPU, and ENV proteins.

    • Multiply spliced RNA (2 kb class): Produced by further splicing; encodes TAT, REV, and NEF proteins.

  • The Nuclear Export Problem: Host cell nuclear export pathways evolved to transport only fully spliced mRNAs, which prevents the translation of nonsense proteins from unspliced messages. Consequently, the 9 kb and 4 kb HIV RNAs (which still contain splice donor/acceptor sites) are normally retained in the nucleus.

  • REV (Regulator of Expression of Virion Proteins):

    • Role: REV is a nuclear export factor that allows the export of unspliced or partially spliced viral RNAs.

    • RRE (REV Response Element): REV binds to the RRE, a highly structured RNA region within the ENV gene (present in the 9 kb and 4 kb classes but spliced out of the 2 kb class).

    • Multimerization: REV binds to a high-affinity site on the RRE and then multimerizes along the structure.

    • Export Mechanism: REV recruits the host export factor CRM1 and RanGTP. This complex transports the RNA through the nuclear pore. Once in the cytoplasm, GTP is hydrolyzed, the complex dissociates, and the RNA is released for translation.

  • Temporal Gene Expression:

    • Early Phase: The 2 kb RNAs are exported via the host pathway. TAT, REV, and NEF are produced. TAT and REV proteins then move back into the nucleus.

    • Late Phase: TAT increases transcription, and REV facilitates the export of 9 kb and 4 kb RNAs, leading to the production of structural proteins (GAG, ENV) and enzymatic proteins (POL).
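The export logic described in this section can be sketched as a small decision rule. This is a toy model, not a simulation of real kinetics: the `ViralRNA` class, its field names, and the `is_exported` rule are illustrative assumptions that simply encode the statements above (fully spliced RNAs leave by the host pathway, RRE-containing RNAs need REV).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ViralRNA:
    name: str
    size_kb: int
    fully_spliced: bool  # True if no splice donor/acceptor sites remain
    has_rre: bool        # the RRE lies within ENV


RNA_CLASSES = [
    ViralRNA("full-length", 9, fully_spliced=False, has_rre=True),
    ViralRNA("singly spliced", 4, fully_spliced=False, has_rre=True),
    ViralRNA("multiply spliced", 2, fully_spliced=True, has_rre=False),
]


def is_exported(rna: ViralRNA, rev_present: bool) -> bool:
    """Fully spliced RNAs use the host pathway; the rest need REV bound to the RRE."""
    if rna.fully_spliced:
        return True
    return rev_present and rna.has_rre


def exported_sizes(rev_present: bool) -> List[int]:
    """Sizes (kb) of the RNA classes that reach the cytoplasm."""
    return [rna.size_kb for rna in RNA_CLASSES if is_exported(rna, rev_present)]


early = exported_sizes(rev_present=False)
late = exported_sizes(rev_present=True)
print("early phase (no REV yet):", early)
print("late phase (REV present):", late)
```

In this sketch only the 2 kb class is exported before REV accumulates, and all three classes are exported afterwards, which mirrors the early/late switch.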

Translation and Ribosomal Frame Shifting

  • Genome Economy: HIV uses strategies like alternative splicing and ribosomal frame shifting to encode multiple proteins within a small genome.

  • GAG and POL Synthesis: Both are translated from the 9 kb mRNA but reside in different, overlapping reading frames.

    • Normal Translation (90%): Ribosomes initiate at the GAG start codon, translate the full GAG polyprotein, and terminate at the GAG stop codon.

    • Frame Shifting (10%): Approximately 1 in every 10 ribosomes undergoes a -1 frameshift to produce a GAG-POL fusion protein.

  • The Frameshift Mechanism:

    • Ribosomal Pausing: A tight stem-loop structure in the RNA causes the ribosome to pause. This structure blocks the mRNA entrance tunnel.

    • Slippery Sequence: The ribosome pauses with its A and P sites over a poly-U sequence (U-rich region). Because A-U base pairs have only 2 hydrogen bonds (compared to 3 in G-C), the codon-anticodon interaction there is less stable.

    • The Shift: During the pause, the ribosome slips back by a single nucleotide (-1), bypassing the GAG stop codon and continuing in the POL reading frame.

  • Evolutionary Significance: GAG and POL overlap by approximately 200 nucleotides, and both reading frames encode functional protein sequence across this overlap (3 amino acids are discussed as a specific example of this overlap density). The roughly 10:1 ratio of GAG to GAG-POL ensures the virus has significantly more structural proteins than enzymes, which is the optimal balance for assembly.
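A -1 frameshift can be made concrete with a short translation sketch. The codon table is the standard genetic code. The mRNA is a short toy sequence built around a U-rich slippery heptamer, and the `slip_after` parameter is a hypothetical device for telling the model where the ribosome steps back; neither is taken from the lecture.

```python
from itertools import product

# Standard genetic code, built in UCAG order ("*" marks a stop codon).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

SLIPPERY = "UUUUUUA"                         # U-rich slippery heptamer
TOY_MRNA = "AUGGCUAAUUUUUUAGGGAAGUAAGCUGA"   # illustrative sequence only


def translate(mrna, slip_after=None):
    """Translate from position 0 until a stop codon.

    If `slip_after` is given, the ribosome steps back one nucleotide (-1)
    immediately after decoding the codon that ends at that position.
    """
    protein = []
    pos = 0
    while pos + 3 <= len(mrna):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
        pos += 3
        if slip_after is not None and pos == slip_after:
            pos -= 1  # the -1 slip: re-read the last nucleotide
    return "".join(protein)


slip_site = TOY_MRNA.index(SLIPPERY) + len(SLIPPERY)
in_frame = translate(TOY_MRNA)                        # stops at the first stop codon
shifted = translate(TOY_MRNA, slip_after=slip_site)   # reads past it in the -1 frame
print(in_frame)
print(shifted)
```

Running this prints `MANFLGK` for the unshifted ribosome and `MANFLREVS` for the shifted one. The two products share the same N-terminus up to the slippery site and then diverge, and the shifted ribosome reads through the position where the in-frame stop codon sits.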

Envelope Protein Processing and Virus Assembly

  • ENV Protein (GP160): As a membrane protein, it is co-translationally inserted into the Endoplasmic Reticulum (ER).

    • Structure: GP41 is the transmembrane subunit; GP120 is the extracellular subunit located in the ER lumen.

    • Processing: GP120 is heavily glycosylated in the ER and moves through the Golgi apparatus.

    • Localization: ENV clusters at specific high-cholesterol membrane regions known as lipid rafts.

  • Assembly Requirements: Each new particle requires 2 copies of the full-length 9 kb viral RNA and approximately 2,000 molecules of GAG/GAG-POL polyproteins.

  • RNA Packaging Specificity:

    • Packaging Signal (Psi, $\psi$): A sequence located upstream of the GAG start codon but downstream of the first splice donor. It is only present in full-length 9 kb RNAs.

    • GAG Binding: GAG proteins have a high-affinity binding site for the $\psi$ signal, ensuring only genomic RNA is packaged.

    • Dimerization: A palindromic dimerization signal near $\psi$ allows two copies of the RNA to base-pair together.

  • Budding Process: The GAG matrix domain binds to the cell membrane and the cytoplasmic tail of GP41. The nucleocapsid (NC) domain binds the RNA. The ESCRT (Endosomal Sorting Complex Required for Transport) pathway is recruited to facilitate membrane budding.
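The reason only full-length RNA is packaged follows from where the packaging signal sits, which a minimal sketch can show. The element labels (`SD1`, `psi`, and so on) and the `splice` helper are illustrative assumptions that reduce the RNA to an ordered list of landmarks. The bookkeeping at the end simply applies the approximate 10% frameshift figure to the approximate 2,000 polyproteins per particle.

```python
# Toy 5'->3' landmark order for the full-length 9 kb RNA (labels are illustrative).
FULL_LENGTH = ["5' leader", "SD1", "psi", "GAG", "POL", "ENV", "3' end"]


def splice(elements, acceptor):
    """Join the first splice donor (SD1) to `acceptor`, dropping what lies between."""
    donor = elements.index("SD1")
    return elements[:donor + 1] + elements[elements.index(acceptor):]


def is_packaged(elements):
    """GAG selects an RNA for packaging only if it still carries psi."""
    return "psi" in elements


singly_spliced = splice(FULL_LENGTH, "ENV")
print(is_packaged(FULL_LENGTH))     # psi retained
print(is_packaged(singly_spliced))  # psi removed along with GAG and POL

# Rough particle bookkeeping; both numbers are approximations from the notes.
POLYPROTEINS_PER_PARTICLE = 2000
FRAMESHIFT_FRACTION = 0.10
gag_pol = round(POLYPROTEINS_PER_PARTICLE * FRAMESHIFT_FRACTION)
gag_only = POLYPROTEINS_PER_PARTICLE - gag_pol
print(gag_only, gag_pol)
```

Because psi lies downstream of the first splice donor, any splicing event that uses that donor removes it, so spliced RNAs fail the `is_packaged` check in this sketch. The bookkeeping gives roughly 1,800 GAG and 200 GAG-POL molecules per particle under the stated approximations.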

Maturation and Accessory Proteins

  • Maturation: The budding particle is initially "immature" and non-infectious. Maturation requires the viral Protease.

    • Protease Activation: As GAG-POL proteins aggregate during budding, protease domains dimerize and cleave themselves from the polyprotein.

    • Cleavage and Rearrangement: Protease cleaves GAG into Matrix (MA), Capsid (CA), and Nucleocapsid (NC). MA remains under the membrane. NC remains bound to the RNA. CA rearranges into a conical core (fullerene cone).

  • Accessory Proteins (The Arms Race):

    • VIF (Virion Infectivity Factor): Inhibits the host enzyme APOBEC3G, which otherwise causes hypermutation (specifically C $\rightarrow$ U transitions) in the viral genome. VIF induces the degradation of APOBEC3G via the proteasome.

    • VPU (Viral Protein U): Inhibits the host protein Tetherin, which tethers new virions to the cell surface to prevent release. VPU causes the degradation of Tetherin.

    • NEF (Negative Factor): Downregulates MHC class I molecules (to evade the immune system) and CD4 receptors (to prevent superinfection of the already infected cell).

  • Conclusion: Viruses typically "win" the evolutionary arms race because they replicate and evolve much faster than their hosts.
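The VIF and APOBEC3G interaction can be sketched as a toy editing model. It follows the C to U description used in these notes and treats each C as independently editable. The editing rate, the seed, and the example sequence are hypothetical, and the model ignores which nucleic acid strand the enzyme actually acts on.

```python
import random


def deaminate(seq: str, rate: float, rng: random.Random) -> str:
    """Convert each C to U independently with probability `rate`."""
    return "".join(
        "U" if base == "C" and rng.random() < rate else base
        for base in seq
    )


def replicate(seq: str, vif_present: bool, rate: float = 0.3, seed: int = 0) -> str:
    """One round of replication in a cell that expresses APOBEC3G.

    With VIF present, APOBEC3G is degraded, so no editing is modelled.
    """
    if vif_present:
        return seq
    return deaminate(seq, rate, random.Random(seed))


genome = "ACGCCAUGCCGACUCCAGCA"  # illustrative sequence only
with_vif = replicate(genome, vif_present=True)
without_vif = replicate(genome, vif_present=False)

print(with_vif)
print(without_vif)
print("edited positions:", sum(a != b for a, b in zip(genome, without_vif)))
```

With VIF the sequence is copied unchanged. Without it, a fraction of the C positions is converted, which is the hypermutation the notes describe.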