Understanding molecular programs guiding differentiation/dedifferentiation
is a major challenge
Bulk analysis can't resolve two problems simultaneously
discovering cell classes AND tracing development of each class
scRNA-seq destroys cells during profiling
cannot follow the same cell across time
Existing trajectory methods largely designed for stationary processes (e.g., adult stem cell niches)
not true time-courses
Key limitations of prior methods:
Most don't leverage temporal information
Graph-theoretic models impose hard constraints (1D edges, 0D branch points) — can't capture gradual fate divergence
Most don't account for cellular growth and death
Model a differentiating cell population as a time-varying probability distribution
ℙₜ in gene expression space
Sample this distribution at multiple time points and infer
how it evolves
Key assumption: over short time scales, cells move short distances in expression space
use Optimal Transport (OT) to infer how mass (cells) is redistributed between time points
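A minimal sketch of the idea, assuming squared distance in a toy 1-D "expression" space as the cost and a balanced, entropy-regularized formulation solved with Sinkhorn iterations. The cell values, the `sinkhorn` helper, and the choice `eps=0.5` are illustrative assumptions; the method summarized in these notes additionally uses unbalanced transport and growth terms, which this sketch omits. Plain Python lists keep it dependency-free.

```python
import math

def sinkhorn(cost, a, b, eps=0.5, n_iter=200):
    """Entropy-regularized OT between mass vectors a (rows) and b (columns)."""
    n, m = len(a), len(b)
    # Gibbs kernel: cheap moves (small cost) get large weight.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Made-up 1-D expression values for cells sampled at two time points.
early = [0.0, 1.0]
late = [0.1, 0.9, 1.1]
cost = [[(x - y) ** 2 for y in late] for x in early]

a = [0.5, 0.5]              # uniform mass over early cells
b = [1 / 3, 1 / 3, 1 / 3]   # uniform mass over late cells
coupling = sinkhorn(cost, a, b)
```

Entry `coupling[i][j]` is the mass moved from early cell `i` to late cell `j`; because short moves are cheap, the early cell at 0.0 sends more mass to the late cell at 0.1 than to the one at 1.1.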
Historical Origin of OT
Originally developed by Monge (1781) to redistribute earth for fortifications with minimal work
OT implicitly assumes a cell's fate depends only on its current state,
not history (Markov process)
Captures only time-varying components of the distribution
(not applicable to systems in equilibrium, where ℙₜ is constant)
OT calculates couplings between
consecutive time points
Longer-interval couplings are inferred by
composing adjacent transport maps (chain rule)
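One way to picture the composition, as a sketch under the Markov assumption: row-normalize each adjacent coupling into a transition matrix and multiply. The two small couplings below are hypothetical numbers, and `row_normalize` and `matmul` are illustrative helpers rather than part of any particular OT package.

```python
def row_normalize(coupling):
    """Turn a coupling into a transition matrix: each row sums to 1."""
    return [[x / sum(row) for x in row] for row in coupling]

def matmul(A, B):
    """Plain matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Hypothetical couplings: t1 -> t2 (2 x 2 cells) and t2 -> t3 (2 x 3 cells).
pi_12 = [[0.4, 0.1],
         [0.1, 0.4]]
pi_23 = [[0.3, 0.2, 0.0],
         [0.0, 0.1, 0.4]]

# Long-range transitions t1 -> t3: chain the per-step transition probabilities.
T_13 = matmul(row_normalize(pi_12), row_normalize(pi_23))
```

Each row of `T_13` is the distribution over t3 cells reached from one t1 cell after passing through every possible intermediate cell at t2.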
Descendant distribution: given a cell set C at time tᵢ, transport it forward
mass distribution over cells at tᵢ₊₁
Ancestor distribution:
"rewind" the coupling backward in time → mass distribution at tᵢ₋₁
Shared ancestry
convergence of ancestor distributions from two different cell sets reveals a common origin
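A small sketch of how convergence could be quantified, assuming total variation distance between normalized ancestor distributions as the measure (an illustrative choice here). All numbers are made up; each pair of distributions is defined over the cells of one time point.

```python
def total_variation(p, q):
    """Half the L1 distance between two distributions (0 = identical, 1 = disjoint)."""
    sp, sq = sum(p), sum(q)
    return 0.5 * sum(abs(x / sp - y / sq) for x, y in zip(p, q))

# Hypothetical ancestor distributions of two fates, one step before the fates are observed.
anc_fate_a_late = [0.70, 0.20, 0.10, 0.00]
anc_fate_b_late = [0.00, 0.10, 0.20, 0.70]

# The same two fates traced many steps further back.
anc_fate_a_early = [0.30, 0.25, 0.25, 0.20]
anc_fate_b_early = [0.20, 0.25, 0.25, 0.30]

tv_late = total_variation(anc_fate_a_late, anc_fate_b_late)
tv_early = total_variation(anc_fate_a_early, anc_fate_b_early)
```

A distance that shrinks toward 0 going back in time (`tv_early` below `tv_late`) is the signature of a shared origin.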
A full trajectory
sequence of descendant or ancestor distributions across all time points
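The descendant and ancestor cards can be sketched with one coupling matrix: push mass forward along rows, or pull it back along columns. This is a simplified illustration with a hypothetical 2 x 3 coupling; it renormalizes so that mass is conserved, whereas a growth-aware model would let mass expand or shrink.

```python
def descendants(coupling, mass_early):
    """Push mass on cells at t_i forward to cells at t_{i+1}."""
    row_tot = [sum(row) for row in coupling]
    n_early, n_late = len(coupling), len(coupling[0])
    return [sum(mass_early[i] * coupling[i][j] / row_tot[i] for i in range(n_early))
            for j in range(n_late)]

def ancestors(coupling, mass_late):
    """Pull mass on cells at t_{i+1} back to cells at t_i (the "rewind")."""
    n_early, n_late = len(coupling), len(coupling[0])
    col_tot = [sum(coupling[i][j] for i in range(n_early)) for j in range(n_late)]
    return [sum(mass_late[j] * coupling[i][j] / col_tot[j] for j in range(n_late))
            for i in range(n_early)]

# Hypothetical coupling between 2 early cells and 3 late cells.
coupling = [[0.30, 0.15, 0.05],
            [0.05, 0.15, 0.30]]

desc = descendants(coupling, [1.0, 0.0])      # cell set C = {early cell 0}
anc = ancestors(coupling, [0.0, 0.0, 1.0])    # cell set C = {late cell 2}
```

Applying `descendants` (or `ancestors`) repeatedly across consecutive couplings yields the sequence of distributions that makes up a full trajectory.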
Local model
identify TFs enriched in cells with many vs. few descendants in a target population
Global model
modules of TFs → modules of target genes; predict gene signature expression at later time points from TF expression at earlier ones
Three time lags tested: 6 h, 48 h, 96 h
to capture regulation on different timescales
Force-Directed Layout Embedding (FLE)
better captures global structure than t-SNE
Pipeline: 100-dimensional diffusion components → 20 nearest neighbors
ForceAtlas2 algorithm (Gephi)
Initial growth rates estimated from
proliferation and apoptosis gene signatures
Sigmoid function maps expression scores
to birth rates (βMIN = 0.3, βMAX = 1.7)
Doubling times range from
~9.6 hours (fast) to ~55 hours (slow)
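A quick check of how these numbers relate, as a sketch: it assumes birth rates are per day, growth is pure exponential (death ignored), and a logistic shape for the sigmoid. The `center` and `width` parameters are hypothetical, not values from the notes.

```python
import math

BETA_MIN, BETA_MAX = 0.3, 1.7   # from the card; units assumed to be per day

def birth_rate(score, center=0.0, width=1.0):
    """Logistic map from a proliferation score to a rate in (BETA_MIN, BETA_MAX).
    center and width are hypothetical shape parameters."""
    s = 1.0 / (1.0 + math.exp(-(score - center) / width))
    return BETA_MIN + (BETA_MAX - BETA_MIN) * s

def doubling_time_hours(beta_per_day):
    """Doubling time of exponential growth exp(beta * t), with t in days."""
    return 24.0 * math.log(2) / beta_per_day

fast = doubling_time_hours(BETA_MAX)   # most proliferative cells
slow = doubling_time_hours(BETA_MIN)   # least proliferative cells
```

Under these assumptions ln 2 / β gives a bit under 10 hours at βMAX and about 55 hours at βMIN, in line with the approximate range on the card.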
Growth rates are then iteratively refined
by the OT computation (growth_iters = 3)
System:
Secondary MEFs from single E13.5 female embryo
Dox-inducible polycistronic OKSM cassette (Oct4/Klf4/Sox2/Myc)
Oct4-IRES-EGFP reporter for successful reprogramming
Phase 1 (Dox): days 0–8
Phase 2: either serum-free 2i medium or continued serum; Dox withdrawn at day 8
Oct4-EGFP+ cells emerge ~day 10
Major cell populations identified (by gene signature):
Pluripotent-like (iPSCs)
Epithelial-like
Trophoblast-like
Neural-like
Stromal-like
MET (mesenchymal-to-epithelial transition) region
Key early divergence: Stromal ancestors diverge from all others as early as day 1.5, sharpening over following days.
All other populations (iPSC, trophoblast, neural) remain indistinguishable until after day 8 (post-Dox withdrawal), when cells undergo MET.
Geodesic interpolation approach:
Given time points t₁ < t₂ < t₃, use OT to predict distribution at t₂ by interpolating between t₁ and t₃
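A sketch of the interpolation step, assuming straight-line (displacement) interpolation in expression space: every unit of coupled mass is moved a fraction s = (t₂ − t₁)/(t₃ − t₁) of the way from its source cell to its target cell. The points and coupling are toy values, and `interpolate` is an illustrative helper; comparing the prediction with the held-out t₂ sample is not shown.

```python
def interpolate(points_t1, points_t3, coupling, t1, t2, t3):
    """Return (position, mass) pairs for the predicted distribution at t2."""
    s = (t2 - t1) / (t3 - t1)
    out = []
    for i, x in enumerate(points_t1):
        for j, y in enumerate(points_t3):
            if coupling[i][j] > 0:
                # Point a fraction s along the segment from x to y.
                pos = [(1 - s) * xk + s * yk for xk, yk in zip(x, y)]
                out.append((pos, coupling[i][j]))
    return out

# Toy 2-D expression coordinates and a coupling that pairs cell 0 with 0, 1 with 1.
points_t1 = [[0.0, 0.0], [2.0, 0.0]]
points_t3 = [[0.0, 2.0], [2.0, 4.0]]
coupling = [[0.5, 0.0],
            [0.0, 0.5]]

predicted_t2 = interpolate(points_t1, points_t3, coupling, t1=0.0, t2=1.0, t3=2.0)
```

With t₂ halfway between t₁ and t₃, each predicted point sits at the midpoint of its source and target.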
OT performance is roughly as good as
batch-to-batch variation (near-baseline performance)
Quality degrades slightly over longer intervals
(2-day vs. 1-day gaps)
Robustness testing: Results stable across:
Down to 200 cells/batch
Down to 1,000 UMIs/cell
Wide range of βMAX, βMIN, δMAX, δMIN
Entropy regularization ε from 5×10⁻⁵ to 0.5
Unbalanced transport λ from 0.1 to 32
Stromal Region (SR):
Shows ECM rearrangement, senescence, cell cycle inhibitors, secretory phenotype (SASP)
Does NOT simply reflect "MEF reversion"
enriched for neonatal muscle and skin signatures (20–30×)
Peaks in abundance days 10.5–11
then declines due to low proliferation and apoptosis
Genomic aberrations: 2.1% whole-chromosome aneuploidy
3.2% sub-chromosomal (vs. 1.1%/1.2% baseline)
Frequent amplification of region containing Cdkn2a, Cdkn2b, Cdkn2c (cell cycle inhibitors)
Frequent loss of Cdk13 (promotes cycling) and Mapk9 (loss promotes apoptosis)
MET Region:
Increased proliferation, loss of fibroblast identity
OKSM transgene expression explains
~50% of variance in fate ratio by day 2, ~75% by day 5
Shisa8
most differentially expressed gene at day 1.5; expressed in 50% of top-quartile MET cells vs. 5% in bottom quartile; mammalian-specific Shisa family transmembrane adaptor
Fut9
synthesizes SSEA-1 glycoantigen; known MET marker
Id3 association with stromal fate is surprising (forced expression increases reprogramming efficiency)
possibly acts by enhancing paracrine signals from stromal cells to iPSCs
Nfix
represses embryonic expression programs; Nfic/Prrx1: associated with mesenchymal programs
Bottleneck: iPSC trajectory encompasses ~40% of cells at day 8.5, but only ~10% at day 10 (2i) and ~1% at day 11 (serum)
only a small, distinct subset of MET cells can become iPSCs
iPSC progenitors reside along thin "strings" in the FLE;
have not yet acquired full pluripotency signature but are transitioning rapidly
Day 11.5–12.5
Nanog, Zfp42, Dppa4, Esrrb, elevated cell cycle signature
Day 11.5
iPSC-like cells = 12% of cells
Days 15–18
iPSC-like cells = 80–90%
In serum
process delayed and less efficient; 3.5% by day 12.5, peaks at 10–15%
2-cell (2C) stage signature
~1% of iPSCs in both conditions
Wave 1 (days 9–10):
Nanog, Sox2, Mybl2, Elf3, Tgif1, Klf2, Etv5, Cdc5l, Klf4, Esrrb, Spic, Zfp42, Hesx1, Msc
Wave 2 (days 12–14)
Obox6, Sohlh2, Ddit3, Bhlhe40
Obox6 and Sohlh2
not expressed in any other fate trajectory; roles in germ cell maintenance/survival; not previously implicated in pluripotency
X-chromosome reactivation
Xist downregulated → pluripotency-associated proteins expressed → X chromosome reactivated
Trajectory 3: Trophoblast-like Cells
Emerge from MET Region after day 8; gain strong epithelial signature by day 9; trophoblast signature by day 10.5
Peak
at day 12.5 (~20% of all cells)
Remarkable diversity of subtypes
previously only some trophoblast genes had been noted but coherent cell types not identified
Trophoblast progenitors (TPs)
found in both 2i and serum
Spongiotrophoblasts (SpTBs)
both conditions
Spiral artery trophoblast giant cells (SpA-TGCs)
serum only
Labyrinthine trophoblasts (LaTBs)
~200 cells in 2i only
Primitive endoderm / XEN-like cells
181 cells from single collection
4.0% whole-chromosome aneuploidy
(vs. 1.1% baseline)
Recurrent sub-chromosomal amplification (8.6% of trophoblasts)
74-gene region on chr. 15 containing Wnt7b, Prr5, and "core trophoblast genes"
Prolactin gene cluster amplification on
chr. 13 in 1% of cells
Trajectory 4: Neural-like Cells (serum only, not 2i)
Form a prominent "spike" in the FLE under serum conditions
Ancestors diverge from trophoblast and iPSC ancestors by day 9
Rapid transition at day 12.5: lose epithelial signatures, gain neural signatures
Radial glial cells (RGCs)
appear first, concurrent with loss of epithelial identity at day 12.5
Three types of radial glial cells
Id3-RGC, Gdf10-RGC, Neurog2-RGC
Mature neurons and glia
emerge day 14 onwards; ancestors concentrated in Gdf10-RGCs at day 13.5
Why neural cells absent in 2i
MEK inhibitor in 2i medium blocks Cntfr signaling required for neural differentiation
Genomic aberrations
Very low — 0.4% sub-chromosomal aberrations
TFs predictive of neural fate:
Early neurogenesis: Rarb, Foxp2, Emx1, Pou3f2, Nr2f1, Myt1l, Neurod4
Late neurogenesis: Scrt2, Nhlh2, Pou2f2
Neural subtype survival: Onecut1, Tal2, Barhl1, Pitx2
Neural tube formation: Msx1, Msx3
Paracrine Signaling Analysis
Collected 415 ligand genes (cytokines, growth factors, hormones from GO) + 2,335 receptor genes
Identified 580 ligand-receptor pairs from curated mouse protein-protein interactions
Interaction score = fraction of cells expressing ligand X in set A × fraction expressing receptor Y in set B
Standardized against 10,000 permutation-based null distributions
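The score and its standardization can be sketched as follows. Several details are assumptions made for illustration: "expressing" means nonzero counts, the null is built by drawing random cell sets of the same sizes as A and B, and standardization is a z-score against that null. The example uses 1,000 permutations instead of 10,000 to stay fast, and the expression vectors are toy data.

```python
import random
import statistics

def frac_expressing(expr, cells):
    """Fraction of the given cells with nonzero expression."""
    return sum(expr[c] > 0 for c in cells) / len(cells)

def interaction_score(ligand, receptor, set_a, set_b):
    """Fraction of A expressing the ligand times fraction of B expressing the receptor."""
    return frac_expressing(ligand, set_a) * frac_expressing(receptor, set_b)

def standardized_score(ligand, receptor, set_a, set_b, n_perm=1000, seed=0):
    """Z-score of the observed score against a permutation null (assumed scheme)."""
    rng = random.Random(seed)
    observed = interaction_score(ligand, receptor, set_a, set_b)
    cells = list(range(len(ligand)))
    null = []
    for _ in range(n_perm):
        rng.shuffle(cells)
        # Random, disjoint cell sets with the same sizes as A and B.
        perm_a = cells[:len(set_a)]
        perm_b = cells[len(set_a):len(set_a) + len(set_b)]
        null.append(interaction_score(ligand, receptor, perm_a, perm_b))
    return (observed - statistics.mean(null)) / statistics.stdev(null)

# Toy data: 20 cells; ligand on in cells 0-4, receptor on in cells 5-9.
ligand = [1] * 5 + [0] * 15
receptor = [0] * 5 + [1] * 5 + [0] * 10
set_a, set_b = list(range(0, 5)), list(range(5, 10))

raw = interaction_score(ligand, receptor, set_a, set_b)
z = standardized_score(ligand, receptor, set_a, set_b)
```

Here every cell in A expresses the ligand and every cell in B expresses the receptor, so the raw score is 1 and the standardized score is strongly positive.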
Ancestors of neural-like cells upregulate Cntfr at day 11.5
one day before neural-like cells appear (day 12.5)
Stromal cells begin expressing Cntfr ligands
(Crlf1, Lif, Clcf1) at day 10.5
In 2i
same ligand-receptor pairs seen but MEK inhibitor blocks downstream Cntfr signaling
Obox6
homeobox gene normally expressed in oocyte, zygote, early embryos, ESCs; unknown function in reprogramming
Expressed in <1% of cells before day 12
94% of Obox6+ cells biased toward MET fate
Test
Dox-inducible lentivirus expressing Obox6, Zfp42 (positive control), or no insert (empty vector) in secondary MEFs, days 0–8
Result
Both Obox6 and Zfp42 increased reprogramming efficiency ~2-fold in 2i, even more in serum
Confirmed in primary MEFs
(independent experiment)
Gdf9 × Tdgf1
highest paracrine interaction score for iPSC lineage
Tdgf1 known to maintain pluripotency
role in establishing pluripotency not previously reported
Prior attempts to boost reprogramming with
GDF9 at days 0–2 failed
New approach:
recombinant mouse GDF9 added daily starting day 8 (when interaction score begins rising)
Result
GDF9 increased reprogramming efficiency 4–5-fold at the highest dose (1 μg/ml) in serum
Confirmed by (i) Oct4-EGFP+ colony counting, (ii) bulk RNA-seq, (iii) scRNA-seq
Dose-dependent, confirmed in multiple independent experiments
GDF9 also increased fraction of neural fate cells
competitive relationship with iPSCs; TGFβ superfamily role in neural lineage specification warrants further study
Monocle2
Day 18 cells give rise to Day 8 cells which give rise to Day 4 cells (completely reversed); cannot distinguish iPSC, neural, and trophoblast fates
URD
Trophoblast lineage specified by day 0.5; neural/iPS arise from stromal cells (biologically implausible); >85% of cells from days 4–8 unassigned
FateID
Fates appear divergent from day 0 (no shared ancestry); trajectories skip time points
AGA
Day 0 cells directly connected to days 14–18 stromal cells; extensive stromal→iPSC transitions inferred
STITCH
iPSCs largely arising from stromal region (ignores differential growth rates)
scDiff/GPfates
Trajectories to incoherent destinations (mixtures of very different cell types)
Category 2:
Not incorporating time information → temporal inconsistencies
Category 3
Not modeling differential cell growth rates → apoptotic stromal cells incorrectly inferred as iPSC ancestors
Probabilistic, distributional view of trajectories
not deterministic paths but distributions over ancestors/descendants
Handles out-of-equilibrium systems
explicitly designed for systems where ℙₜ changes over time
Growth-rate modeling is critical
without it, rapidly proliferating iPSCs incorrectly inferred to arise from apoptotic stromal cells
Gradual fate divergence
fates emerge gradually, not at sharp branch points; challenges graph-theoretic trajectory models