CH 17: Non-Coding RNA

Overview of RNA Types

mRNA is the only coding RNA: it carries genetic information from DNA to the ribosomes, where it is decoded into protein. This decoding step is known as translation.
Pseudogenes: These are non-functional segments of DNA that resemble functional genes but have lost their protein-coding ability due to mutations (e.g., frameshift mutations, premature stop codons, or loss of regulatory elements like promoters). They are often duplicated copies of genes that can no longer be translated into functional proteins.

Non-Coding RNAs (ncRNAs)

During Transcription:
- Long non-coding RNAs (lncRNAs) play a crucial role in controlling gene expression by interacting with chromatin-modifying enzymes, transcription factors, and components of the basal transcription machinery. They can act in cis (affecting nearby genes) or trans (affecting distant genes).
During Translation:
- Small RNAs, including microRNAs (miRNAs) and small interfering RNAs (siRNAs), silence the translation of specific mRNAs and affect their stability. They achieve this primarily by guiding the RNA-induced silencing complex (RISC) to target mRNAs, leading to translational repression or mRNA degradation.
Some genes do not encode polypeptides directly but are instead transcribed into ncRNAs. These ncRNAs perform a wide array of functions beyond simply carrying genetic code.
- Estimates of ncRNAs in humans range from a few thousand to tens of thousands, highlighting their significant contribution to the transcriptome.
ncRNAs perform a variety of essential cellular functions across all domains of life: bacteria, archaea, and eukaryotes. These functions include structural roles (e.g., rRNAs in ribosomes, tRNAs in translation), catalytic roles (e.g., ribozymes), and regulatory roles (e.g., miRNAs, lncRNAs).
- In many cell types, ncRNAs are more abundant than mRNAs (ncRNA > mRNA), indicating their widespread influence on cellular processes.
- In a typical human cell, about 20% of transcription results in mRNA while approximately 80% involves ncRNA production, underscoring the dominance of ncRNA transcription.

Types of Non-Coding RNAs

Categorization: ncRNAs are classified based on their length:
- lncRNAs: Defined as non-coding transcripts longer than 200 nucleotides. They exhibit diverse secondary structures and can regulate gene expression at transcriptional, post-transcriptional, and epigenetic levels.
- Small regulatory RNAs (short ncRNAs): Transcripts shorter than 200 nucleotides, playing specialized regulatory roles.
  - miRNAs: Typically 20-25 nucleotides in length, these endogenous RNAs regulate gene expression post-transcriptionally. They are transcribed as primary miRNAs (pri-miRNAs), processed by Drosha and DGCR8 into precursor miRNAs (pre-miRNAs), and then exported to the cytoplasm for final processing by Dicer into mature miRNAs. They work by binding to mRNA targets with imperfect complementarity, leading to translational repression or deadenylation.
  - siRNAs: These are double-stranded RNAs, typically 20-25 nucleotides long, derived from various exogenous and endogenous sources, such as viral RNAs, transposable elements, or intentionally introduced experimental constructs. They typically bind to target mRNAs with perfect complementarity, leading to direct mRNA cleavage by RISC.
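The length-based split above can be written as a single rule. The following Python sketch is only an illustration: the function name `classify_ncrna` is hypothetical, and treating a transcript of exactly 200 nucleotides as "small" is an assumption, since the notes define only the longer-than and shorter-than cases.

```python
def classify_ncrna(length_nt: int) -> str:
    """Classify a non-coding transcript by length alone.

    Assumption: a transcript of exactly 200 nt is treated as small.
    """
    if length_nt <= 0:
        raise ValueError("length must be a positive number of nucleotides")
    if length_nt > 200:
        return "lncRNA"
    return "small ncRNA"


print(classify_ncrna(2200))  # a 2.2-kb transcript -> lncRNA
print(classify_ncrna(22))    # a 22-nt transcript -> small ncRNA
```

Length alone cannot separate miRNAs from siRNAs, because both fall in the 20-25 nucleotide range. That distinction depends on where the RNA comes from and how completely it pairs with its target.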
Major players in gene expression (the RNA polymerases):
- RNA Polymerase I (RNA Pol I): exclusively synthesizes ribosomal RNAs (rRNAs), except the 5S rRNA.
- RNA Polymerase II (RNA Pol II): primarily synthesizes messenger RNAs (mRNAs), but also synthesizes lncRNAs and primary microRNAs (pri-miRNAs).
- RNA Polymerase III (RNA Pol III): synthesizes transfer RNAs (tRNAs), 5S rRNA, and other small RNAs like U6 snRNA.
Role of miRNAs and siRNAs:
- miRNAs recognize target mRNAs through sequence complementarity, usually imperfect, in the 3' untranslated region (UTR) and bind to disrupt their expression by inhibiting translation or promoting mRNA decay.
- siRNAs are acquired externally (e.g., from invading viruses) or derived from endogenous sources (e.g., long dsRNA precursors) and participate in gene regulation mechanisms, largely by directing sequence-specific mRNA degradation.
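To make "recognition through sequence complementarity in the 3' UTR" concrete, here is a minimal Python sketch that scans a UTR for sites able to pair with a miRNA. It rests on assumptions that are not in these notes: it uses the common convention that pairing is anchored by a "seed" at miRNA nucleotides 2-8, it accepts only exact Watson-Crick matches, and the function names and sequences are toy examples rather than curated data.

```python
# Watson-Crick partners for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that pairs antiparallel with `rna`, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_seed_sites(mirna: str, utr: str) -> list[int]:
    """Return 0-based start positions in `utr` that pair with the miRNA seed.

    Assumption: the seed is miRNA nucleotides 2-8 (1-based), a common
    convention that is not stated in the notes.
    """
    seed = mirna[1:8]
    site = reverse_complement(seed)
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]


# Toy sequences for illustration only.
toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAACUACCUCAAAGGCUACCUCUU"
print(find_seed_sites(toy_mirna, toy_utr))  # two candidate sites: [3, 15]
```

Real target prediction also weighs pairing outside the seed, site context, and conservation, none of which this sketch attempts.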
piRNAs (Piwi-interacting RNAs): These are 24-31 nucleotide long ncRNAs that are abundant in germline cells. They protect genomes from the uncontrolled movement of transposable elements and maintain genome stability by guiding PIWI proteins to target RNA or DNA for silencing, often through epigenetic modifications.

Functions and Roles of Non-Coding RNAs

Scaffolds: ncRNAs can serve as molecular scaffolds, providing binding sites to facilitate interactions among two or more proteins or protein complexes that typically do not interact directly. For example, the lncRNA XIST acts as a scaffold to recruit chromatin-modifying enzymes to one X chromosome, initiating X-inactivation.
Guides: ncRNAs assist proteins in locating specific target sites within the cell, often through sequence-specific complementarity. For instance, lncRNAs can guide chromatin remodeling complexes to particular genomic loci.
Alteration of Protein Function: ncRNAs can bind directly to proteins, modifying their structure, stability, or activity. This can involve allosteric regulation, preventing protein-protein interactions, or blocking active sites.
Blockers: ncRNAs can inhibit particular cellular processes by physically obstructing them. For example, some ncRNAs can bind to ribosomal binding sites on mRNA, thus blocking translation initiation.
Decoys: ncRNAs can bind to and sequester other regulatory molecules (such as transcription factors, RNA-binding proteins, or other ncRNAs like miRNAs), thus regulating their activity and availability. This is a common mechanism for competitive endogenous RNAs (ceRNAs).

Chromatin Structure and Non-Coding RNAs

Chromatin can exist in two main states: closed (inactive/heterochromatin) or open (active/euchromatin). These states are dictated by epigenetic modifications.
- Active chromatin: characterized by unmethylated cytosines in CpG islands and acetylated histones. Histone acetylation, mediated by histone acetyltransferases (HATs), neutralizes the positive charge of histones, loosening DNA-histone interactions and making DNA more accessible.
- Silent chromatin: characterized by methylated cytosines (catalyzed by DNA methyltransferases - DNMTs) and deacetylated histones. Histone deacetylation, mediated by histone deacetylases (HDACs), increases the positive charge of histones, leading to tighter DNA-histone interactions and condensed chromatin, making DNA inaccessible.
lncRNAs can influence chromatin structure and transcription regulation through various mechanisms:
- Interaction with chromatin remodeling proteins: lncRNAs can recruit specific chromatin-modifying complexes (e.g., Polycomb Repressive Complex 2 (PRC2) or LSD1-CoREST complex) and localize them to specific genes, leading to changes in histone marks and DNA accessibility.
An example of lncRNA activities: Hox transcript antisense intergenic RNA (HOTAIR)
- HOTAIR is a 2.2-kb long ncRNA located on human chromosome 12, transcribed from the HOXC gene cluster. It is known to be aberrantly expressed in various cancers.
- Functions as a scaffold to recruit two distinct histone-modifying complexes: the PRC2 complex (which methylates histone H3 at lysine 27, H3K27me3) and the LSD1-CoREST complex (which demethylates histone H3 at lysine 4, H3K4me2). This dual recruitment to target genes facilitates potent transcriptional repression.
- Binds to a GA-rich region adjacent to a target gene, transitioning the chromatin state from an open, transcriptionally active conformation to a closed, silent one.

RNA Interference and Gene Silencing Mechanisms

Double-Stranded RNA (dsRNA) can potently inhibit gene expression:
- Studies in the nematode C. elegans by Andrew Fire and Craig Mello in 1998 showed that injections of antisense RNA could silence gene expression, but that injections of dsRNA were dramatically more potent (roughly 10 to 100 times more effective) than antisense RNA alone. This seminal discovery led to the elucidation of the mechanism of RNA interference (RNAi), for which they received the 2006 Nobel Prize in Physiology or Medicine.
RNA in situ Hybridization: This is a molecular technique used to detect and localize specific RNA sequences (mRNAs, ncRNAs) within cells and tissues while preserving overall tissue architecture. It typically employs a single-stranded antisense RNA probe, labeled either with digoxigenin-conjugated uridine or with a fluorescent tag. A digoxigenin-labeled probe is detected using antibodies linked to enzymes (e.g., alkaline phosphatase) that produce a colored precipitate; a fluorescently labeled probe is detected by its fluorescent signal.
miRNA and siRNA involvement in RNAi:
- Both miRNAs and siRNAs are key mediators of RNAi, inducing gene silencing by causing sequence-specific degradation or inhibition of translation of target mRNAs.
- In humans, more than 2,000 genes encoding miRNAs have been identified, underscoring their vast regulatory network.
Mechanism of RNAi:
1. miRNA Biogenesis: pri-miRNAs (primary miRNAs), transcribed by RNA Polymerase II, form hairpin structures. In the nucleus, the microprocessor complex (Drosha and DGCR8) cleaves pri-miRNAs into ~70-nt pre-miRNAs (precursor miRNAs).
2. Nuclear Export: pre-miRNAs are then actively exported from the nucleus to the cytoplasm by Exportin-5.
3. Dicer Processing: In the cytoplasm, the enzyme Dicer (an RNase III type enzyme), with the help of accessory proteins, cleaves the pre-miRNA hairpin into a mature double-stranded RNA duplex of ~20-25 nucleotides, typically with 2-nucleotide 3' overhangs.
4. RISC Assembly and Action: One strand of the dsRNA duplex (the guide strand) is loaded into the RNA-induced silencing complex (RISC), which contains an Argonaute (Ago) protein as its catalytic core. The passenger strand is usually degraded. The guide strand then directs RISC, and with it the Ago protein, to target mRNAs based on sequence complementarity.
  - Possible outcomes for RISC-bound targets:
    - Translation inhibition: If the guide RNA-target mRNA complementarity is imperfect (typical for miRNAs), RISC can inhibit protein synthesis by blocking ribosomal access, promoting premature ribosome dissociation, or inducing mRNA deadenylation and decapping, leading to decay.
    - mRNA degradation: If there is near-perfect complementarity (typical for siRNAs), the Ago protein, acting as an endonuclease (slicer activity), cleaves the target mRNA directly, leading to its rapid degradation.
5. RITS Complex (RNA-induced Transcriptional Silencing): This complex functions similarly to RISC but operates in the nucleus. It is guided by small RNAs to target specific DNA loci, leading to heterochromatin formation, histone modifications (e.g., methylation), and DNA methylation, thereby directing transcriptional repression of genes or transposable elements.
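The two outcomes listed under step 4 depend on how completely the guide strand pairs with its target. The Python sketch below scores that pairing and picks an outcome. It is a deliberate simplification: the function names, the toy guide sequence, and the `near_perfect` cutoff of 0.95 are hypothetical, and G:U wobble pairs are counted as mismatches.

```python
# Watson-Crick partners for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def paired_fraction(guide: str, site: str) -> float:
    """Fraction of guide bases that Watson-Crick pair with `site`.

    Both sequences are written 5'->3' and must have equal length. Pairing is
    antiparallel, so guide[i] faces site[-1 - i]. G:U wobble pairs are
    counted as mismatches (a simplification).
    """
    if len(guide) != len(site) or not guide:
        raise ValueError("guide and site must be non-empty and equal in length")
    pairs = sum(COMPLEMENT[g] == s for g, s in zip(guide, reversed(site)))
    return pairs / len(guide)


def risc_outcome(guide: str, site: str, near_perfect: float = 0.95) -> str:
    """Pick one of the two step-4 outcomes; `near_perfect` is a hypothetical cutoff."""
    if paired_fraction(guide, site) >= near_perfect:
        return "mRNA cleavage"
    return "translational repression"


guide = "UGAGGUAGUAGGUUGUAUAGUU"  # toy guide strand
perfect_site = "".join(COMPLEMENT[b] for b in reversed(guide))
# Introduce four central mismatches to mimic miRNA-style imperfect pairing.
imperfect_site = perfect_site[:8] + "GGGG" + perfect_site[12:]

print(risc_outcome(guide, perfect_site))    # mRNA cleavage
print(risc_outcome(guide, imperfect_site))  # translational repression
```

In a cell the choice is not a simple threshold; it also depends on where the mismatches fall and on which Argonaute protein carries the guide.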

Role of Decoy Non-Coding RNAs

Decoys, often acting as competitive endogenous RNAs (ceRNAs), can sequester miRNAs away from their target mRNAs. By binding to miRNAs through complementary sequences, they reduce the concentration of free miRNAs available to silence a specific set of target genes, thereby relieving miRNA-mediated repression and upregulating expression of those genes.
These decoy ncRNAs function as crucial regulators in diverse biological processes, including fine-tuning gene expression, defending against viral infections by sponging viral-targeting miRNAs, and silencing transposable elements.
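The sequestration idea can be illustrated with simple bookkeeping. This Python sketch assumes each decoy binding site captures exactly one miRNA and never releases it, which is far cruder than real binding equilibria. The function name and the copy numbers are hypothetical.

```python
def free_mirna(total_mirna: int, decoy_sites: int) -> int:
    """miRNA copies left unbound once every decoy site is occupied.

    Assumes one miRNA per site and irreversible binding.
    """
    if total_mirna < 0 or decoy_sites < 0:
        raise ValueError("copy numbers must be non-negative")
    return max(0, total_mirna - decoy_sites)


# Hypothetical copy numbers per cell, for illustration only.
for sites in (0, 400, 5000):
    print(sites, "decoy sites ->", free_mirna(1000, sites), "free miRNAs")
```

As the number of decoy sites rises, fewer miRNAs remain free to repress their mRNA targets, which is the sense in which a decoy upregulates those targets.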

Non-Coding RNAs in Genome Defense

Protection against transposable elements: This is primarily mediated through piRNA engagement with PIWI proteins, especially prominent in the germline. PIWI proteins, guided by piRNAs, recognize and silence active transposable elements at both transcriptional (via epigenetic modifications) and post-transcriptional levels (via mRNA degradation), ensuring genome stability.
CRISPR-Cas system in prokaryotes: This is an adaptive immune system in bacteria and archaea that protects against invading phages (viruses) and plasmids/transposons.
- It involves three distinct phases:
  1. Adaptation (Spacer Acquisition): Upon infection, short fragments of foreign DNA (protospacers) from invading phages or plasmids are recognized and integrated as new spacers into the host's CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) array, where each spacer sits between direct repeat sequences. This process is mediated by Cas1 and Cas2 proteins.
  2. Expression (crRNA Biogenesis): The CRISPR array is transcribed into a long single RNA molecule called pre-crRNA. This pre-crRNA is then processed (often with the help of a tracrRNA in Type II systems) into shorter CRISPR RNAs (crRNAs), each containing a unique spacer sequence corresponding to a foreign DNA element.
  3. Interference (Target Silencing): The mature crRNA, complexed with Cas proteins (e.g., Cas9 in Type II systems), acts as a guide. The crRNA directs the Cas protein to locate and bind to complementary target DNA sequences in subsequent invasions. Upon recognition, the Cas protein, acting as an endonuclease, cleaves the foreign DNA (which must also contain a Protospacer Adjacent Motif, PAM, sequence), thereby preventing its replication and protecting the cell.
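The interference step requires two things together: a match to the crRNA spacer and an adjacent PAM. The Python sketch below shows that rule as a simple search. Several details are assumptions that go beyond these notes: the default PAM is 5'-NGG-3' immediately downstream of the protospacer, as used by the Cas9 of Streptococcus pyogenes; only the given strand is scanned; the spacer must match exactly; and the function name and sequences are toy examples.

```python
import re


def find_cleavable_sites(dna: str, spacer: str, pam: str = "[ACGT]GG") -> list[int]:
    """Return 0-based starts where `spacer` is immediately followed by a PAM.

    Only the strand given in `dna` is scanned, and the spacer must match exactly.
    """
    # A lookahead is used so that overlapping matches are all reported.
    pattern = re.compile(f"(?={re.escape(spacer)}{pam})")
    return [m.start() for m in pattern.finditer(dna)]


spacer = "GATTACAGATTACAGATTAC"          # toy 20-nt spacer
phage = "TT" + spacer + "TGG" + "AAAA"   # protospacer followed by a PAM
mutant = "TT" + spacer + "TAC" + "AAAA"  # same protospacer, PAM lost

print(find_cleavable_sites(phage, spacer))   # [2]
print(find_cleavable_sites(mutant, spacer))  # []
```

The second call returns no sites even though the spacer matches, which mirrors the requirement that the foreign DNA must also carry a PAM before the Cas protein cleaves it.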