
Classification and Functions of Long Non-coding RNAs (lncRNAs) – Key Vocabulary

Definition and Scope of lncRNAs

  • Long non-coding RNAs (lncRNAs) = RNA transcripts conventionally defined by length > 200 nt.

    • Threshold is arbitrary; some functional lncRNAs (e.g., BC1, snaR) fall at or slightly below 200 nt.

    • Alternative biological definition (Amaral et al.): RNAs functioning as primary or spliced transcripts that are independent of any known small-ncRNA class.

  • Genome transcription landscape:

    • Protein-coding portion: small; non-coding (“dark matter”) transcription is vast.

    • NONCODE v3.0 (2012): 73,370 lncRNA entries from 1,239 species.

    • < 200 lncRNAs functionally annotated in lncRNAdb (2011).

  • Presence across taxa: animals, plants, yeast, prokaryotes, viruses.

  • General properties

    • Often low expression, tissue-specific, nuclear/chromatin localization.

    • Frequently 5'-capped, 3'-polyadenylated, multi-exonic ⇒ mRNA-like.

    • Sequence conservation: generally poor, yet selected subclasses (e.g., lincRNAs) show domain-level conservation.

Functional Diversity of lncRNAs

  • Documented roles in

    • Transcription regulation, splicing, translation, protein localization.

    • Chromatin architecture & epigenetic regulation.

    • Cellular processes: imprinting, cell-cycle, apoptosis, stem-cell pluripotency & reprogramming, heat-shock response.

    • Disease links: cancer progression, neurodegeneration, metabolic disorders, etc.

  • lncRNAdb survey: ~42% (of 182 curated entries) participate in transcriptional regulation.

Overview of Classification Frameworks

The review proposes four orthogonal features:

  1. Genomic location & context.

  2. Effect exerted on DNA (cis vs trans).

  3. Mechanism of functioning.

  4. Targeting mechanism/archetype.


1. Genomic Location & Context

Intergenic lncRNAs (lincRNAs)

  • Transcribed from regions between annotated protein-coding genes (Fig 1A).

  • Biological features

    • Transcriptionally activated like mRNAs; possess “K4–K36” chromatin signature (H3K4me3 at 5', H3K36me3 in body).

    • ≈70% of K4–K36-positive lincRNAs show RNA evidence, close to the ≈72% seen for protein-coding genes.

    • ≈70% of human K4–K36 lincRNA domains are conserved in mouse (protein-coding ≈80%).

    • More conserved than introns & antisense RNAs; more tissue-specific than mRNAs; more stable than intronic lncRNAs.

    • Functional spectrum: embryonic stem-cell pluripotency, cell proliferation, cancer progression, etc.

Intronic lncRNAs

  • Originate entirely from within introns of protein-coding genes (Fig 1B).

  • Poorly characterised; only a minor subset functionally explored.

Sense lncRNAs

  • Transcribed from sense strand of protein-coding loci; may overlap exons or span entire gene (Fig 1C).

  • Often mRNA-like (polyA, 5' cap, multi-exonic).

  • Unusual cases encode peptides and act as RNAs:

    • SRA (Steroid Receptor RNA Activator) – encodes protein + RNA scaffold.

    • ENOD40 – encodes small peptides + guides RNP localisation in legumes.

Antisense lncRNAs

  • Transcribed from antisense strand of protein-coding loci; three sub-situations (Fig 1D):

    1. Exon–exon overlap with sense gene.

    2. Intronic transcript (no exon overlap).

    3. Cover entire sense gene via intronic overlap.

  • Validation: strand-specific assays, qRT-PCR, full-length cDNA sequencing.

  • Prevalence

    • ~32% of human lncRNAs are antisense to coding genes.

    • ~87% of coding transcripts possess antisense partners in mouse.
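
The four location classes above can be expressed as a small decision rule. The following is a minimal sketch, not the review's own procedure: the `Transcript` type, the half-open coordinate convention, and the comparison against a single protein-coding gene are all assumptions made for illustration. Real annotation pipelines compare against every annotated gene and handle many more edge cases.

```python
from dataclasses import dataclass
from typing import Tuple

Interval = Tuple[int, int]  # half-open [start, end), an assumed convention


@dataclass(frozen=True)
class Transcript:
    """Hypothetical minimal transcript record (not from the review)."""
    chrom: str
    strand: str                      # "+" or "-"
    exons: Tuple[Interval, ...]      # sorted by start coordinate

    @property
    def start(self) -> int:
        return self.exons[0][0]

    @property
    def end(self) -> int:
        return self.exons[-1][1]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def classify_context(lnc: Transcript, gene: Transcript) -> str:
    """Classify a lncRNA relative to ONE protein-coding gene."""
    spans_overlap = lnc.chrom == gene.chrom and overlaps(
        (lnc.start, lnc.end), (gene.start, gene.end)
    )
    if not spans_overlap:
        return "intergenic"
    if lnc.strand != gene.strand:
        # Covers all three antisense sub-situations listed above.
        return "antisense"
    exon_overlap = any(overlaps(le, ge) for le in lnc.exons for ge in gene.exons)
    inside_gene = gene.start <= lnc.start and lnc.end <= gene.end
    if inside_gene and not exon_overlap:
        return "intronic"
    return "sense"


# Toy two-exon gene; coordinates are made up for the example.
gene = Transcript("chr1", "+", ((100, 200), (500, 600)))
print(classify_context(Transcript("chr1", "+", ((250, 400),)), gene))  # intronic
print(classify_context(Transcript("chr1", "-", ((150, 550),)), gene))  # antisense
```

The strand check is placed before the intron check so that an opposite-strand transcript lying wholly within an intron is reported as antisense, matching sub-situation 2 above.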


2. Effects on DNA Sequences

lncRNAs localised to nucleus/chromatin frequently modulate DNA targets.

Cis-acting lncRNAs

  • Influence genes near their own locus.

  • Mechanisms

    • Transcriptional interference: RNA–DNA triplex or promoter occlusion blocks Pre-Initiation Complex (PIC).

      • DHFR upstream transcripts (0.8–7.3 kb): bind DHFR promoter, dissociate TFIIB.

      • SRG1 RNA (0.4–1.9 kb, yeast): covers SER3 promoter, prevents TF binding.

    • Chromatin modification recruitment

      • Xist (19 kb): recruits PRC2 ⇒ H3K27me3 ⇒ X-chromosome silencing.

      • MEG3 (~1.6 kb), COLDAIR (~1.1 kb, plants): recruit PRC2.

      • GAL10-ncRNA (~4 kb): recruits Rpd3S HDAC ⇒ histone de-acetylation.

      • HOTTIP (~3.8 kb): recruits MLL complex ⇒ active chromatin over HOXA.

Trans-acting lncRNAs

  • Act at distant loci; may not require sequence complementarity.

  • Examples

    • HOTAIR (~2.2 kb): transcribed from HOXC (chr12), targets HOXD (chr2) & other loci via Suz12; recruits PRC2/LSD1.

    • 7SK snRNA (~330 nt): scaffold for P-TEFb within 7SK snRNP; represses transcription elongation.

    • B2 SINE RNA: binds RNA Pol II, blocks elongation during heat-shock.
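
As a rough illustration of the cis/trans distinction, the sketch below labels a lncRNA–target pair by chromosome and distance. The `Locus` type and the `CIS_WINDOW_BP` cutoff are hypothetical; the notes give no numeric definition of "near", and a pure distance rule would mislabel cases such as Xist, which acts in cis across an entire chromosome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locus:
    """Hypothetical genomic interval (half-open coordinates assumed)."""
    chrom: str
    start: int
    end: int


# Assumed cutoff for "near its own locus"; not a value from the review.
CIS_WINDOW_BP = 1_000_000


def mode_of_action(source: Locus, target: Locus, window: int = CIS_WINDOW_BP) -> str:
    """Return 'cis' if the target lies within `window` bp of the source locus."""
    if source.chrom != target.chrom:
        return "trans"
    # Gap between the two intervals; 0 if they overlap.
    gap = max(0, max(source.start, target.start) - min(source.end, target.end))
    return "cis" if gap <= window else "trans"


# HOTAIR is transcribed from HOXC (chr12) and targets HOXD (chr2).
# Only the chromosomes come from the notes; coordinates are placeholders.
hotair_source = Locus("chr12", 1_000, 3_200)
hoxd_target = Locus("chr2", 1_000, 100_000)
print(mode_of_action(hotair_source, hoxd_target))  # trans
```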


3. Mechanisms of Functioning

3.1 Transcriptional Regulation

  • Sub-mechanisms

    • Transcriptional interference (e.g., DHFR, SRG1, 7SK, B2).

    • Chromatin remodelling (Xist, MEG3, HOTAIR, HOTTIP, COLDAIR).

    • Enhancer-associated RNAs (eRNAs) that activate nearby genes:

      • ncRNA-a1, Evf-2, Alpha-250/Alpha-280.

3.2 Post-Transcriptional Regulation

Splicing Control
  • Binding/modulating splice factors or masking splice sites.

    • MIAT (9–10 kb): UACUAAC repeats bind SF1 ⇒ inhibit spliceosome assembly.

    • Malat1 (~7 kb): modulates phosphorylation pool & nuclear speckle distribution of SR proteins ⇒ alt-splicing.

    • LUST (1.4–2.4 kb): antisense to RBM5, proposed splice-masking.

Translational Control
  • Association with translation factors/ribosome.

    • BC1 & BC200: bind eIF4A, PABP ⇒ block 48S complex assembly.

    • snaR: ribosome-associated; function inferred.

    • Gadd7: associates with actively translating polysomes.

    • Splicing–translation coupling: Zeb2NAT prevents splicing of a Zeb2 5'-UTR intron; the retained intron is required for Zeb2 translation.

Additional Post-Transcriptional Modes
  • Natural antisense siRNA-like decay

    • 21A (~300 nt) & 1/2-sbsRNA1 (~0.7 kb) promote target mRNA degradation.

  • Competing endogenous RNAs (ceRNAs)

    • linc-MD1: sponges miR-133 & miR-135 ⇒ allows MAML1 & MEF2C expression during myogenesis.

    • IPS1 (plant), HULC (liver cancer) act as target mimics.

    • Pseudogene ceRNAs: KRASP1, PTENP1.

    • BACE1AS (~2 kb): antisense duplex masks miR-485-5p site ⇒ stabilises BACE1 mRNA.
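
The ceRNA logic (a lncRNA sequesters miRNAs, which relieves repression of those miRNAs' targets) can be traced as two lookups. This is a toy sketch using only the linc-MD1 example: the dictionaries and the function name are illustrative, and the specific miRNA-to-mRNA pairing (miR-133 with MAML1, miR-135 with MEF2C) is an assumption that goes beyond what the notes state.

```python
from typing import Dict, Set

# lncRNA -> miRNAs it sponges (from the linc-MD1 entry above).
SPONGES: Dict[str, Set[str]] = {
    "linc-MD1": {"miR-133", "miR-135"},
}

# miRNA -> mRNAs it represses. The pairing is an assumption for
# illustration; the notes only name the two targets collectively.
MIRNA_TARGETS: Dict[str, Set[str]] = {
    "miR-133": {"MAML1"},
    "miR-135": {"MEF2C"},
}


def derepressed_targets(
    lncrna: str,
    sponges: Dict[str, Set[str]] = SPONGES,
    targets: Dict[str, Set[str]] = MIRNA_TARGETS,
) -> Set[str]:
    """mRNAs expected to be relieved from repression when `lncrna` is expressed."""
    relieved: Set[str] = set()
    for mirna in sponges.get(lncrna, set()):
        relieved |= targets.get(mirna, set())
    return relieved


print(sorted(derepressed_targets("linc-MD1")))  # ['MAML1', 'MEF2C']
```

The model is purely qualitative: it ignores expression levels and binding-site stoichiometry, both of which determine whether a sponge effect is appreciable in vivo.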

3.3 Other Functional Mechanisms

  • Protein localisation: meiRNA positions Mei2; ENOD40 guides RNP granules.

  • Telomere maintenance: TERC is RNA template within telomerase.

  • RNA interference modulation: rncs-1 reduces Dicer activity.

  • Cellular architecture: MENε/β (NEAT1) scaffolds paraspeckles; Xlsirts & VegT RNAs in Xenopus oocytes.


4. Targeting Mechanisms / Archetypes

Wang & Chang (2011) categories:

  1. Signal – expression marks cellular events (e.g., Xist, COLDAIR).

  2. Decoy – sequesters proteins or RNAs (e.g., DHFR upstream RNA, PANDA, ceRNAs).

  3. Guide – directs effector complexes to targets (e.g., HOTAIR, Xist).

  4. Scaffold – structural platform for multi-protein assembly (e.g., HOTAIR, 7SL).

Notes

  • One lncRNA can combine archetypes (HOTAIR = signal + guide + scaffold).

  • Interaction modalities: RNA–RNA, RNA–DNA hybrids, RNA secondary/tertiary structure, protein linkers.
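
Because one lncRNA can combine archetypes, a bit-flag type is a natural way to record them. The sketch below uses Python's standard `enum.Flag`; the `ARCHETYPES` table holds only assignments stated in the list and notes above, and the helper name `with_archetype` is made up for the example.

```python
from enum import Flag, auto
from typing import Dict, List


class Archetype(Flag):
    SIGNAL = auto()
    DECOY = auto()
    GUIDE = auto()
    SCAFFOLD = auto()


# Assignments taken from the categories and notes above (non-exhaustive).
ARCHETYPES: Dict[str, Archetype] = {
    "Xist": Archetype.SIGNAL | Archetype.GUIDE,
    "COLDAIR": Archetype.SIGNAL,
    "PANDA": Archetype.DECOY,
    "HOTAIR": Archetype.SIGNAL | Archetype.GUIDE | Archetype.SCAFFOLD,
}


def with_archetype(archetype: Archetype) -> List[str]:
    """Names of lncRNAs whose flag set includes `archetype`, sorted."""
    return sorted(name for name, flags in ARCHETYPES.items() if archetype in flags)


print(with_archetype(Archetype.GUIDE))  # ['HOTAIR', 'Xist']
```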


Biological Highlights of Major Classes

lincRNAs

  • High PRC2 affinity subset; potential species-specific chromatin programmes.

  • Rich dataset for evolutionary comparisons owing to K4–K36 domains.

Sense & Antisense lncRNAs

  • Antisense: dominant among PRC2-bound RNAs; pervasive regulatory layer.

  • Sense: occasional coding potential challenges classic gene annotation dichotomy.

Intronic lncRNAs

  • Under-studied; may possess unique poly(A) status & subcellular localisation.


Statistics, Databases & Experimental Evidence

  • NONCODE v3.0: 73,370 lncRNAs / 1,239 species.

  • GENCODE v7: detailed manual curation; antisense ≈32% of human lncRNAs.

  • Experimental validations: CAGE, strand-specific RNA-seq, qRT-PCR, 5'/3' RACE.


Length-Based Subclassification (Perspective)

  • Analysis of NONCODE v3.0 suggests trimodal distribution ⇒ proposal:

    1. Small-lncRNA: 200–950 nt (human ≈ 58%).

    2. Medium-lncRNA: 950–4,800 nt (mouse ≈ 78%).

    3. Large-lncRNA: > 4,800 nt (human enriched vs mouse).

  • Requires validation with higher-confidence annotations (GENCODE smaller set).
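
The proposed length bins translate directly into a lookup function. This is a minimal sketch: the cut-offs (200, 950, 4,800 nt) come from the proposal above, but how the boundary values are assigned (inclusive upper bounds here) is an assumption, since the notes give overlapping range endpoints.

```python
def length_class(length_nt: int) -> str:
    """Assign a transcript length to the proposed small/medium/large bins."""
    if length_nt < 200:
        # The 200 nt threshold is conventional; some functional
        # lncRNAs fall at or slightly below it.
        return "below lncRNA threshold"
    if length_nt <= 950:       # assumed inclusive upper bound
        return "small"
    if length_nt <= 4800:      # assumed inclusive upper bound
        return "medium"
    return "large"


# Approximate lengths quoted elsewhere in these notes.
examples = {"7SK": 330, "MEG3": 1600, "HOTAIR": 2200, "Malat1": 7000, "Xist": 19000}
for name, nt in examples.items():
    print(f"{name}: {length_class(nt)}")
```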


Interplay with Small ncRNAs

  • Many lncRNAs harbor miRNA or snoRNA genes:

    • H19 exon contains miR-675; LOC554202 hosts miR-31.

    • MEG3/8/9, Rian, antiPeg11 encode clusters of miRNAs & snoRNAs; potential coordinated targeting.

  • Post-processing: some lncRNAs preferentially processed into snoRNAs.

  • Regulatory networks become multilayered: miRNA ↔ lncRNA ↔ mRNA.


Future Directions Highlighted by Authors

  • Continuous refinement of classification as knowledge grows.

  • Focused study on chromatin-modifying lncRNA groups beyond PRC2 (e.g., SETD, HDAC variants).

  • Expand functional annotation of intronic & sense lncRNAs.

  • Evolutionary linkage between lncRNAs and small ncRNAs (shared loci, precursor relationships).

  • Integrative analyses (e.g., combined miRNA/mRNA/lncRNA in diseases such as NSCLC) to decode complex regulatory networks.


Representative lncRNAs by Mechanism (non-exhaustive)

  • Transcriptional interference: DHFR-up, SRG1 (cis); 7SK, B2 (trans).

  • Chromatin modification (cis/trans): Xist, COLDAIR, MEG3, HOTTIP, HOTAIR, GAL10-ncRNA.

  • Enhancer RNAs: ncRNA-a1, Evf-2, Alpha-250.

  • Splicing regulators: MIAT, Malat1, LUST, Zeb2NAT.

  • Translation regulators: BC1, BC200, snaR, Gadd7.

  • ceRNAs / miRNA sponges: linc-MD1, IPS1, HULC, PTENP1, KRASP1.

  • Protein localization: meiRNA, ENOD40.

  • Telomere template: TERC.

  • Dicer modulation: rncs-1.


Key Complexes & Factors Mentioned

  • PRC2, PRC1; Rpd3S HDAC; MLL; LSD1; P-TEFb; RNA Pol II; TFIIB; SF1; SR proteins; eIF4A; PABP; Dicer.


Ethical, Practical & Philosophical Implications

  • lncRNA research reshapes gene definition (e.g., coding vs non-coding boundaries).

  • Potential diagnostic/prognostic biomarkers (PCAT-1 in prostate cancer, HULC in liver cancer).

  • Therapeutic avenues: antisense targeting, modulation of ceRNA networks, chromatin-binding interference.