scRNA-seq captures a static snapshot of gene expression
making it difficult to study time-resolved processes like embryogenesis or tissue regeneration
differentiation occurs on timescales of hours to days,
which is comparable to the typical mRNA half-life
Therefore, the relative abundance of nascent (unspliced) vs. mature (spliced) mRNA encodes information about the direction and rate of transcriptional change
without needing metabolic labeling or time-course experiments
RNA velocity
the time derivative of the gene expression state; a high-dimensional vector predicting the future state of individual cells on a timescale of hours
All common scRNA-seq protocols use
oligo-dT primers to enrich for polyadenylated mRNA
Despite this, examining SMART-seq2, STRT/C1, inDrop, and 10x Chromium datasets
15–25% of reads contained unspliced intronic sequences
Origins of intronic reads:
Mostly from secondary priming positions within intronic regions
In 10x Chromium: also abundant discordant priming from intronic polyT sequences,
possibly generated during PCR amplification from first-strand cDNA
Correlation of intronic with exonic counts
these represent unspliced precursor mRNAs
Metabolic labeling validation: 4sU labeling of HEK293 cells (5, 15, 30 min exposure) followed by oligo-dT-primed STRT sequencing
83% of genes showed expression time courses consistent with simple first-order kinetics, confirming unspliced reads = nascent mRNA
du/dt = α − βu; ds/dt = βu − γs
α = transcription rate (production of unspliced mRNA)
β = splicing rate (unspliced → spliced; set to 1 by convention)
γ = degradation rate of spliced mRNA
u = unspliced mRNA count
s = spliced mRNA count
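A minimal numerical sketch of this kinetic model, assuming the companion equation du/dt = α − βu for unspliced mRNA (it follows from the definitions of α and β) and a constant α. The function name, parameter values, and the forward-Euler integrator are illustrative choices, not taken from the source.

```python
def simulate_kinetics(alpha, gamma, beta=1.0, u0=0.0, s0=0.0, t_end=20.0, dt=0.001):
    """Forward-Euler integration of du/dt = alpha - beta*u, ds/dt = beta*u - gamma*s."""
    n_steps = int(round(t_end / dt))
    u, s = u0, s0
    for _ in range(n_steps):
        du = alpha - beta * u        # production minus splicing
        ds = beta * u - gamma * s    # splicing minus degradation
        u += dt * du
        s += dt * ds
    return u, s

# Hypothetical parameter values, chosen only for illustration.
u_end, s_end = simulate_kinetics(alpha=2.0, gamma=0.5)
# With constant alpha the system settles near u = alpha/beta and s = alpha/gamma,
# so the ratio u/s approaches gamma (with beta = 1).
```

Starting from u = s = 0, the trajectory sits above the line u = γs while the gene is being induced and converges onto it as the steady state is reached.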
RNA velocity = ds/dt =
the first time derivative of spliced mRNA abundance
When α is constant,
system approaches steady state asymptotically
At steady state: u = γs
fixed-slope relationship in phase portrait space
γ captures:
degradation rate, splicing rate, ratio of intronic/exonic lengths, number of internal priming sites
u > γs
gene is being induced → positive velocity → cell moving toward higher expression
u < γs
gene is being repressed → negative velocity → cell moving toward lower expression
Velocity = 0
at steady state (cells on the diagonal)
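The three cases above can be sketched as a sign test on u − γs (velocity with β = 1). The function names and the tolerance are hypothetical, and the input values are made up for illustration.

```python
import numpy as np

def velocity(u, s, gamma):
    """ds/dt with beta = 1: positive above the steady-state line, negative below."""
    return np.asarray(u, dtype=float) - gamma * np.asarray(s, dtype=float)

def classify(u, s, gamma, tol=1e-9):
    """Label each cell by the sign of its velocity for one gene."""
    v = velocity(u, s, gamma)
    return np.where(v > tol, "induced", np.where(v < -tol, "repressed", "steady"))

# Three cells with the same spliced count but different unspliced counts.
labels = classify(u=[3.0, 1.0, 2.0], s=[4.0, 4.0, 4.0], gamma=0.5)
```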
Key Property of γ
Examined across Tabula Muris dataset (48 cell types, 8,385 genes)
Most genes show a single fixed γ
across widely different cell types and tissues (high Pearson correlation of spliced/unspliced counts across cell types)
~11% of genes show distinct slopes in different tissue subsets, suggesting
tissue-specific alternative splicing (Alt. splicing confirmed for Sept9, Sgk1) or tissue-specific degradation rates
Double-γ genes estimated by regression
mixture modeling (EM, R package flexmix)
γ fit performed using regression on extreme expression quantiles
(robust even when most cells are outside steady state)
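A rough sketch of the extreme-quantile idea, under several assumptions not stated in the source: quantiles are taken on the spliced counts, the fit is a least-squares line through the origin, and the 5% cutoff is arbitrary. The synthetic data are invented so that only the extreme cells sit on the steady-state line.

```python
import numpy as np

def fit_gamma_extreme_quantiles(u, s, quantile=0.05):
    """Fit u = gamma * s using only cells in the lowest and highest quantiles of s."""
    u = np.asarray(u, dtype=float)
    s = np.asarray(s, dtype=float)
    lo, hi = np.quantile(s, [quantile, 1.0 - quantile])
    mask = (s <= lo) | (s >= hi)
    # Least-squares slope of a line through the origin.
    return float(np.sum(u[mask] * s[mask]) / np.sum(s[mask] ** 2))

# Synthetic gene: cells at the extremes are at steady state (u = 0.3 * s),
# cells in between are mid-induction and carry excess unspliced mRNA.
s = np.concatenate([np.full(20, 0.1), np.linspace(2.0, 8.0, 60), np.full(20, 10.0)])
u = 0.3 * s
u[20:80] += 1.0

gamma_hat = fit_gamma_extreme_quantiles(u, s)                   # extremes only
gamma_naive = fit_gamma_extreme_quantiles(u, s, quantile=0.5)   # uses every cell
```

In this toy case the fit on the extremes recovers the steady-state slope, while the fit on all cells is pulled upward by the cells that are still being induced.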
Alternative: structure-based γ fit — uses gene structural parameters to predict γ, then uses k most correlated genes to adjust M-value (M = log₂[u_observed / u_steady-state]) and re-estimate γ
corrects for genes exclusively observed far from equilibrium
For large/noisy datasets, read counts pooled across k nearest cell neighbors
before γ estimation to reduce technical noise
Different k values used per dataset
(e.g., k=500 for hippocampus, k=90 for oligodendrocytes)
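A brute-force sketch of k-nearest-neighbor pooling, assuming Euclidean distance in some reduced coordinate space and assuming each cell counts as its own neighbor; neither detail is given in the source. In practice the same neighbor sets would be applied to both the spliced and unspliced count matrices.

```python
import numpy as np

def knn_pool(counts, coords, k):
    """Sum counts over each cell's k nearest neighbours (the cell itself included)."""
    counts = np.asarray(counts, dtype=float)   # shape (n_cells, n_genes)
    coords = np.asarray(coords, dtype=float)   # shape (n_cells, n_dims)
    # Pairwise squared distances; O(n^2) memory, fine only for small examples.
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    nn = np.argsort(d2, axis=1)[:, :k]         # indices of the k closest cells
    return counts[nn].sum(axis=1)

# Four cells forming two tight pairs, one gene.
coords = [[0.0], [1.0], [10.0], [11.0]]
counts = [[1.0], [2.0], [10.0], [20.0]]
pooled = knn_pool(counts, coords, k=2)
```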
Joint PCA embedding:
observed and extrapolated states jointly embedded in PCA space; arrows show direction of travel
Projection onto existing embeddings (t-SNE):
extrapolated state compared to similarity with neighboring cells; velocity arrows reflect where a cell would most likely go
Locally averaged vector fields (grid visualization)
for large datasets; Gaussian smoothing on a regular grid
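The grid visualization can be sketched as a Gaussian-weighted average of per-cell velocity vectors at each grid point. This assumes velocities have already been projected into the 2-D embedding; the grid size, kernel width, and function name are hypothetical.

```python
import numpy as np

def grid_velocity(emb, vel, n_grid=10, sigma=1.0):
    """Average 2-D velocity vectors onto a regular grid with Gaussian weights."""
    emb = np.asarray(emb, dtype=float)   # (n_cells, 2) embedding coordinates
    vel = np.asarray(vel, dtype=float)   # (n_cells, 2) velocities in the embedding
    xs = np.linspace(emb[:, 0].min(), emb[:, 0].max(), n_grid)
    ys = np.linspace(emb[:, 1].min(), emb[:, 1].max(), n_grid)
    gx, gy = np.meshgrid(xs, ys)
    grid = np.column_stack([gx.ravel(), gy.ravel()])
    d2 = ((grid[:, None, :] - emb[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))   # Gaussian kernel weights
    mass = w.sum(axis=1)                   # local cell density at each grid point
    field = (w @ vel) / np.maximum(mass, 1e-12)[:, None]
    return grid, field, mass

# Toy data: 200 cells, all moving in the +x direction.
rng = np.random.default_rng(0)
emb = rng.uniform(0.0, 5.0, size=(200, 2))
vel = np.tile([1.0, 0.0], (200, 1))
grid, field, mass = grid_velocity(emb, vel)
```

The returned `mass` can be used to hide arrows at grid points with too few nearby cells.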
Cells can have RNA velocities along
multiple independent components simultaneously (differentiation, maturation, proliferation)
A cell with no apparent velocity in one embedding may still have substantial velocity
in an unvisualized subspace
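A small illustration of this point, assuming a linear embedding with orthonormal axes (as in PCA): the velocity vector splits into a part visible in the plotted axes and a part hidden from them. The three-component space and the names are made up for the example.

```python
import numpy as np

def split_velocity(v, basis):
    """Split v into its projection onto the embedding axes and the remainder.

    basis: (n_features, n_dims) matrix with orthonormal columns.
    """
    v = np.asarray(v, dtype=float)
    B = np.asarray(basis, dtype=float)
    visible = B @ (B.T @ v)   # component inside the plotted subspace
    hidden = v - visible      # component orthogonal to it
    return visible, hidden

# Embedding shows components 1 and 2; the cell moves only along component 3.
basis = np.eye(3)[:, :2]
visible, hidden = split_velocity([0.0, 0.0, 2.0], basis)
```

Here the arrow drawn in the embedding has zero length even though the cell's velocity has norm 2.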
Validation 1: Mouse Liver Circadian Cycle (Bulk RNA-seq)
Examined bulk RNA-seq time course of mouse liver circadian cycle (24h)
Unspliced mRNA at each time point consistently more similar to
spliced mRNA of the next time point
Circadian genes showed expected excess of unspliced RNA during
upregulation, deficit during downregulation
Solving differential equations for each gene
velocity arrows on PCA correctly captured expected direction of circadian progression
Validation 2: Mouse Chromaffin Cell Differentiation (SMART-seq2)
System: Schwann cell precursors (SCPs) differentiating into chromaffin cells (neuroendocrine cells of adrenal medulla); direction of differentiation can be validated by lineage tracing
Phase portraits showed expected deviations from steady-state in many genes
Serpine2: repressed in SCPs → below steady-state line → negative velocity
Chga: induced in chromaffin cells → above steady-state line → positive velocity
Velocity vectors correctly showed general movement
toward chromaffin fate
Also captured:
movement toward and away from intermediate bridge state
Captured cell cycle dynamics involved in chromaffin
differentiation (in PCA and dedicated cell-cycle gene analysis)
Metabolic labeling:
spliced/unspliced ratio changes detectable after 10–100 minutes (τ distribution in Fig. 2g, with a mode at ~10–20 min for most genes)
Based on EdU pulse labeling of chromaffin progenitor cells:
effective extrapolation 2.5–3.8 hours into the future
Consistent with ability to resolve
cell-cycle events
Predictive timescale depends on
curvature of the expression manifold (linear extrapolation)
Longer timescales accessible by tracing a sequence of small
extrapolation steps on the observed manifold
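One possible reading of these two cards, sketched under assumptions: a single step is the linear extrapolation s + v·Δt (clipped at zero here, since counts cannot be negative), and longer paths are traced by snapping each extrapolated state to the nearest observed cell and continuing with that cell's velocity. The snapping rule and all names are hypothetical.

```python
import numpy as np

def extrapolate(s, v, dt):
    """Linear extrapolation of an expression state, clipped at zero."""
    return np.maximum(np.asarray(s, dtype=float) + dt * np.asarray(v, dtype=float), 0.0)

def trace_path(start, S, V, dt, n_steps):
    """Chain small extrapolation steps, snapping to the nearest observed cell each time."""
    S = np.asarray(S, dtype=float)   # (n_cells, n_genes) observed states
    V = np.asarray(V, dtype=float)   # (n_cells, n_genes) velocities
    path = [start]
    current = start
    for _ in range(n_steps):
        target = extrapolate(S[current], V[current], dt)
        current = int(np.argmin(((S - target) ** 2).sum(axis=1)))
        path.append(current)
    return path

# Four cells along a line, all moving in the +x direction.
S = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
V = [[1.0, 0.0]] * 4
path = trace_path(start=0, S=S, V=V, dt=1.0, n_steps=3)
```

Keeping each step small limits the error that a straight-line extrapolation accumulates on a curved manifold.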
Application 1: Developing Mouse Hippocampus (10x Chromium)
Cell types identified (by known marker genes):
Radial glia (Hes1+, Hopx+) — identified as root/origin
Neuroblasts → three neuronal lineages:
Dentate gyrus granule neurons
Pyramidal neurons: Subiculum, CA1, CA2, CA3, Hilus (five fields)
Astrocytes (Aqp4+)
Oligodendrocyte precursors (OPCs, Pdgfra+)
Markov random walk model on velocity field
automatically identified root (radial glia) and terminal states (all lineage endpoints)
Confirmed fate mapping showing radial glia as true origin
of hippocampal lineage tree
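A toy sketch of how a random walk on the velocity field can expose terminal and root states. It assumes transition probabilities proportional to exp(cosine similarity / temperature) between a cell's velocity and the direction to every other cell, then diffuses a uniform distribution forward (terminal states) or along reversed velocities (root states). The kernel, temperature, and step count are illustrative and may differ from the original method.

```python
import numpy as np

def transition_matrix(X, V, temperature=0.1):
    """Row-stochastic matrix favouring moves aligned with each cell's velocity."""
    X = np.asarray(X, dtype=float)
    V = np.asarray(V, dtype=float)
    diff = X[None, :, :] - X[:, None, :]   # diff[i, j] = x_j - x_i
    norm = np.linalg.norm(diff, axis=-1) * np.linalg.norm(V, axis=1)[:, None]
    cos = np.einsum("ijk,ik->ij", diff, V) / np.maximum(norm, 1e-12)
    P = np.exp(cos / temperature)
    return P / P.sum(axis=1, keepdims=True)

def diffuse(P, n_steps=200):
    """Propagate a uniform starting distribution through the Markov chain."""
    p = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(n_steps):
        p = p @ P
    return p

# Five cells on a line, all with velocity pointing toward the last cell.
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]])
V = np.tile([1.0, 0.0], (5, 1))
terminal = int(np.argmax(diffuse(transition_matrix(X, V))))
root = int(np.argmax(diffuse(transition_matrix(X, -V))))
```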
Two paths from radial glia:
→ astrocytes directly (no cell division), or → pre-OPC → narrow passage → OPCs
Narrow passage
moment of commitment to oligodendrocyte lineage
A cell in the narrow passage
overwhelmingly likely to become OPC
A cell in pre-OPC state: as likely to remain in pre-OPC state
true commitment requires passing through the narrow passage
Fate choice at this level is non-deterministic
tilting of gene expression, followed by lock-in of final fate via TF feedback loops
Neuronal fate choice example:
Two adjacent neuroblasts at branching point between CA and granule fates
Current gene expression states are neighbors
similar transcriptomes
But their futures are already tilted toward different fates
distinguished by activation of Prox1
Consistent with known biology: Prox1 required for granule neuron formation; Prox1 deletion
neuroblasts adopt pyramidal neuron fate instead
Granule neurons of dentate gyrus first split
from hippocampus proper
Second split: subiculum/CA1 vs. CA2–4
consistent with major anatomical subdivisions
Pdgfra (OPC marker):
positive velocity in pre-OPCs, neutral in OPCs (induction then maintenance)
Igfbpl1 (neuroblast):
positive velocity from radial glia to neuroblasts, negative going from neuroblasts to main neuronal branches
Application 2: Human Embryonic Forebrain (10x Chromium)
System: radial glia → neuroblast → immature neuron → neuron (glutamatergic, expressing SLC17A7/VGLUT1)
Strong velocity pattern originating from
proliferating progenitor state (radial glia)
Proceeding through intermediate neuroblast stages
to mature glutamatergic neurons
Pseudotime ordering by principal curve
(using velocity field to determine direction)
Layered spatial expression in tissue corresponded closely to
pseudotemporal distribution in scRNA-seq data
Confirmed
unspliced mRNAs consistently precede spliced mRNAs during both up- and down-regulation
Observed both fast and slow kinetics:
Fast: RNASEH2B — little difference between unspliced and spliced RNAs
Slow (burst + overshoot): DCX, ELAVL4, STMN2 — initial rapid burst of transcription, then reduced sustained level; spliced transcripts follow with a noticeably delayed trajectory
"Dynamic induction with overshooting" proposed to help quickly induce genes with slow degradation kinetics
demonstrated for the first time in human embryos
Velocity toward/from stem cell states was detectable for a limited set of genes (e.g., Lgr5)
but genome-wide stem cell dynamics were confounded by cell cycle.
Velocity estimates robust to:
Variation of model parameters
Gene subsampling
Cell subsampling
Most sensitive parameter
size of the neighborhood used in visualization of velocity in pre-defined embeddings
75% of differentially expressed genes along pseudotime trajectories showed positive correlation
between velocity estimates and empirically observed expression derivatives
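A sketch of this kind of check for a single gene, assuming cells have a pseudotime value and that the empirical derivative is taken as a finite difference of spliced expression along pseudotime. The function name and the synthetic sine-wave gene are illustrative only.

```python
import numpy as np

def velocity_derivative_correlation(pseudotime, s, v):
    """Pearson correlation between velocity and d(spliced)/d(pseudotime)."""
    order = np.argsort(pseudotime)
    t = np.asarray(pseudotime, dtype=float)[order]
    ds_dt = np.gradient(np.asarray(s, dtype=float)[order], t)   # finite differences
    return float(np.corrcoef(ds_dt, np.asarray(v, dtype=float)[order])[0, 1])

# Synthetic gene whose velocity is exactly the derivative of its expression.
t = np.linspace(0.0, 2.0 * np.pi, 50)
s = np.sin(t) + 2.0
v = np.cos(t)
r = velocity_derivative_correlation(t, s, v)
```

On real data the expression profile would usually be smoothed along pseudotime before differentiating, since finite differences amplify noise.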
Velocity coordination metric:
genes well-correlated in spliced expression also show correlated velocity estimates → serves as unbiased quality measure
Genes observed exclusively far from equilibrium
(→ structure-based γ correction helps)
Uneven contribution of
non-coding transcripts
Alternative splicing leading to multiple γ
rates across measured populations
No metabolic labeling or time course required
velocity inferred from a single static snapshot
Platform-agnostic
works on SMART-seq2, STRT/C1, inDrop, and 10x Chromium
Tree orientation without prior knowledge
root and terminal states identified automatically via Markov random walk on velocity field
Single-cell resolution fate decisions
reveals early tilting of fate at individual cell level, distinguishing neighbors already committed to different fates
Manifold learning algorithms that simultaneously fit a manifold and the kinetics on that manifold
based on RNA velocity; already applied to whole-organism development