TI Benchmarking

Last updated 1:31 AM on 4/1/26

71 Terms

New cards

Single-cell omics data, including transcriptomics, proteomics and epigenomics data, provide new opportunities for studying cellular dynamic processes

such as the cell cycle, cell differentiation and cell activation

New cards

Such dynamic processes can be modeled computationally using trajectory inference (TI) methods

also called pseudotime analysis, which order cells along a trajectory based on similarities in their expression patterns

New cards

The resulting trajectories are most often linear, bifurcating or tree-shaped

but more recent methods also identify more complex trajectory topologies, such as cyclic or disconnected graphs

New cards

TI methods offer an unbiased and transcriptome-wide

understanding of a dynamic process

New cards

thereby allowing the objective identification of new (primed)

subsets of cells, delineation of a differentiation tree, and inference of regulatory interactions responsible for one or more bifurcations

New cards

Current applications of TI focus on specific subsets of cells

but ongoing efforts to construct transcriptomic catalogs of whole organisms underline the urgency for accurate, scalable, and user-friendly TI methods.

New cards

Two of the most distinctive differences between TI methods

are whether they fix the topology of the trajectory and what type(s) of graph topologies they can detect

New cards

Early TI methods typically fixed the topology algorithmically

either to a specific shape (for example, linear or bifurcating trajectories) or through parameters provided by the user

New cards

These methods therefore mainly focus on correctly ordering

the cells along the fixed topology

New cards

More recent methods also infer the topology

which increases the difficulty of the problem at hand, but allows the unbiased identification of both the ordering inside a branch and the topology connecting these branches

New cards

Given the diversity in TI methods

it is important to quantitatively assess their performance, scalability, robustness and usability.

New cards

Many attempts at tackling this issue have already been made

but a comprehensive comparison of TI methods across a large number of different datasets is still lacking

New cards

This is problematic, as users new to the field are confronted with an overwhelming choice of TI methods,

without a clear idea of which would optimally solve their problem

New cards

Moreover, the strengths and weaknesses of existing methods need to be assessed,

so that new developments in the field can focus on improving the current state-of-the-art.

New cards

We found substantial complementarity between current methods

with different sets of methods performing best depending on the characteristics of the data

New cards

In this model, the overall topology is represented by a network of ‘milestones’,

and the cells are placed within the space formed by each set of connected milestones.
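
The milestone model can be pictured as a small data structure. The following is a minimal sketch, assuming each cell sits on a single edge between two milestones (a simplification of "the space formed by each set of connected milestones"); the class and method names are hypothetical and do not come from any TI package.

```python
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    """Toy milestone-network trajectory (illustrative, not a library API)."""

    # Directed edges between milestones, stored as (start, end) -> length.
    edges: dict = field(default_factory=dict)
    # Cell id -> (start milestone, end milestone, fraction along the edge).
    progressions: dict = field(default_factory=dict)

    def add_edge(self, start, end, length=1.0):
        self.edges[(start, end)] = length

    def place_cell(self, cell, start, end, fraction):
        """Place a cell on an existing edge; fraction 0 = start, 1 = end."""
        if (start, end) not in self.edges:
            raise ValueError(f"no edge from {start} to {end}")
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("fraction must lie in [0, 1]")
        self.progressions[cell] = (start, end, fraction)

    def milestone_percentages(self, cell):
        """Express a cell's position as weights over its two milestones."""
        start, end, fraction = self.progressions[cell]
        return {start: 1.0 - fraction, end: fraction}


# A bifurcating topology: A -> B, then B splits into C and D.
bifurcation = Trajectory()
for start, end in [("A", "B"), ("B", "C"), ("B", "D")]:
    bifurcation.add_edge(start, end)

bifurcation.place_cell("cell_1", "A", "B", 0.25)
bifurcation.place_cell("cell_2", "B", "D", 0.5)
print(bifurcation.milestone_percentages("cell_1"))  # {'A': 0.75, 'B': 0.25}
```

Linear, tree-shaped, cyclic and disconnected topologies differ only in which edges are present, which is why one representation can serve as a common output format.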

New cards

Although almost every method returned a unique set of outputs

we were able to classify these outputs into seven distinct groups and we wrote a common output converter for each of these groups

New cards

When strictly required

we also provided prior information to the method

New cards

weak priors that are relatively easy to acquire, such as a start cell

strong priors, such as a known grouping of cells, that are much harder to know a priori, and which can potentially introduce a large bias into the analysis

New cards

The largest difference between TI methods is whether a method fixes the topology and

if it does not, what kind of topology it can detect.

New cards

Most methods either focus on inferring linear trajectories

or limit the search to tree or less complex topologies, with only a select few attempting to infer cyclic or disconnected topologies

New cards

We evaluated each method on four core aspects:

(1) accuracy of a prediction, given a gold or silver standard on 110 real and 229 synthetic datasets; (2) scalability with respect to the number of cells and features (for example, genes); (3) stability of the predictions after subsampling the datasets; and (4) the usability of the tool in terms of software, documentation and the manuscript

New cards

Overall, we found a large diversity across the four evaluation criteria

with only a few methods, such as PAGA, Slingshot and SCORPIUS, performing well across the board

New cards

the topology

Hamming–Ipsen–Mikhailov, HIM

New cards

the quality of the assignment of cells to branches

F1_branches
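
One way to score branch assignment is sketched below. It assumes branches are compared by Jaccard overlap of their cells and that recovery and relevance are combined with a harmonic mean; this is an illustrative reading of an F1-style branch score, not necessarily the exact definition used in the benchmark, and all function names are hypothetical.

```python
from statistics import mean


def jaccard(a, b):
    """Jaccard similarity between two sets of cell ids."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def f1_branches(reference, prediction):
    """Harmonic mean of recovery and relevance over branch assignments.

    `reference` and `prediction` map a branch name to the cells on it.
    """
    # Recovery: how well each reference branch is matched by some predicted branch.
    recovery = mean(
        max(jaccard(ref, pred) for pred in prediction.values())
        for ref in reference.values()
    )
    # Relevance: how well each predicted branch matches some reference branch.
    relevance = mean(
        max(jaccard(ref, pred) for ref in reference.values())
        for pred in prediction.values()
    )
    if recovery == 0 or relevance == 0:
        return 0.0
    return 2 * recovery * relevance / (recovery + relevance)


reference = {"b1": {"c1", "c2"}, "b2": {"c3", "c4"}}
merged = {"p1": {"c1", "c2", "c3", "c4"}}  # prediction that misses the split

print(f1_branches(reference, reference))  # 1.0
print(f1_branches(reference, merged))     # 0.5
```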

New cards

the cell positions

cor_dist
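
The idea behind a distance-correlation score for cell positions can be sketched for the simplest case. The example assumes a linear trajectory, where the distance between two cells is the absolute difference in pseudotime, and uses Pearson correlation; branching trajectories would need geodesic distances along the milestone network, and the choice of correlation here is an assumption. All names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pairwise_distances(pseudotime):
    """Distances between all cell pairs on a linear trajectory."""
    cells = sorted(pseudotime)
    return [abs(pseudotime[a] - pseudotime[b]) for a, b in combinations(cells, 2)]


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def cor_dist_linear(reference, prediction):
    """Correlate cell-to-cell distances in the reference and the prediction."""
    return pearson(pairwise_distances(reference), pairwise_distances(prediction))


reference = {"c1": 0.0, "c2": 1.0, "c3": 2.0, "c4": 3.0}
reversed_order = {"c1": 3.0, "c2": 2.0, "c3": 1.0, "c4": 0.0}
shuffled = {"c1": 0.0, "c2": 3.0, "c3": 1.0, "c4": 2.0}

print(cor_dist_linear(reference, reversed_order))  # close to 1: direction does not matter
print(cor_dist_linear(reference, shuffled))        # negative: ordering is wrong
```

Because only distances between cells are compared, a prediction that runs in the opposite direction or on a different pseudotime scale is not penalized.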

New cards

accuracy of the differentially expressed features along the trajectory

wcor_features

New cards

synthetic datasets

offer the most exact reference trajectory

New cards

real datasets

highest biological relevance

New cards

real datasets come from

a variety of single-cell technologies, organisms and dynamic processes, and contain several types of trajectory topologies

New cards

Real datasets were classified as ‘gold standard’ if

the reference trajectory was not extracted from the expression data itself, such as via cellular sorting or cell mixing

New cards

All other real datasets were classified as

‘silver standard’

New cards

For synthetic datasets we used several data simulators,

including a simulator of gene regulatory networks using a thermodynamic model of gene regulation

New cards

For each simulation, we used a real dataset as a reference

to match its dimensions, number of differentially expressed genes, drop-out rates and other statistical properties

New cards

We found that method performance was very variable across datasets, indicating that there is no ‘one-size-fits-all’ method

that works well on every dataset

New cards

Even methods that can detect most of the trajectory types, such as PAGA, RaceID/StemID and SLICER, were not the best methods

across all trajectory types

New cards

The overall score between the different dataset sources was

moderately to highly correlated (Spearman rank correlation of 0.5–0.9) with the scores on real datasets containing a gold standard, confirming both the accuracy of the gold standard trajectories and the relevance of the synthetic data

New cards

On the other hand, the different metrics frequently disagreed with each other,

with Monocle and PAGA Tree scoring better on the topology scores, whereas other methods, such as Slingshot, were better at ordering the cells and placing them into the correct branches

New cards

The performance of a method was strongly dependent on the

type of trajectory present in the data

New cards

Slingshot typically performed better on datasets containing simpler topologies, while PAGA, pCreode and RaceID/StemID

had higher scores on datasets with trees or more complex trajectories

New cards

This was reflected in the types of topologies detected by every method

as those predicted by Slingshot tended to contain fewer branches, whereas those detected by PAGA, pCreode and Monocle DDRTree gravitated towards more complex topologies

New cards

This analysis therefore indicates that detecting the right topology is still a difficult task for most of these methods

because methods tend to be either too optimistic or too pessimistic regarding the complexity of the topology in the data.

New cards

The high variability between datasets, together with the diversity in detected topologies

between methods, could indicate some complementarity between the different methods

New cards

A top model in this case was defined as a model with an overall score of at least

95% of the score of the best model

New cards

On all datasets, using one method resulted in getting a top model about

27% of the time.

New cards

This increased up to 74%

with the addition of six other methods
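
The complementarity analysis can be illustrated with a greedy selection over a score table. This is a minimal sketch: it assumes the 95% threshold is taken relative to the best score on each dataset, and the scores and method names below are made-up toy values, not benchmark results.

```python
def top_model_fraction(scores, methods, threshold=0.95):
    """Fraction of datasets on which at least one chosen method gives a top model.

    `scores` maps dataset -> {method: overall score}. A top model is assumed
    to score at least `threshold` times the best score on that dataset.
    """
    hits = 0
    for per_method in scores.values():
        best = max(per_method.values())
        if any(per_method[m] >= threshold * best for m in methods):
            hits += 1
    return hits / len(scores)


def greedy_selection(scores, n_methods, threshold=0.95):
    """Repeatedly add the method that most increases the top-model fraction."""
    all_methods = sorted(next(iter(scores.values())))
    chosen = []
    for _ in range(n_methods):
        candidates = [m for m in all_methods if m not in chosen]
        best = max(
            candidates,
            key=lambda m: top_model_fraction(scores, chosen + [m], threshold),
        )
        chosen.append(best)
    return chosen


# Hypothetical scores for three methods on four datasets.
toy_scores = {
    "d1": {"method_a": 0.9, "method_b": 0.5, "method_c": 0.4},
    "d2": {"method_a": 0.8, "method_b": 0.3, "method_c": 0.5},
    "d3": {"method_a": 0.2, "method_b": 0.7, "method_c": 0.3},
    "d4": {"method_a": 0.1, "method_b": 0.2, "method_c": 0.6},
}

print(top_model_fraction(toy_scores, ["method_a"]))              # 0.5
print(top_model_fraction(toy_scores, ["method_a", "method_b"]))  # 0.75
print(greedy_selection(toy_scores, 3))
```

In this toy table no single method covers more than half of the datasets, but each added method covers datasets the others miss, which is the pattern the cards describe.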

New cards

The result was a relatively diverse set of methods

containing both strictly linear or cyclic methods, and methods with a broad trajectory type range such as PAGA

New cards

We found similar indications of complementarity between the top methods on data containing only

linear, bifurcating or multifurcating trajectories, although in these cases fewer methods were necessary to obtain at least one top model for a given dataset.

New cards

Moreover, the recent application of TI methods on multi-omics single-cell data also showcases

the increasing demands on the number of features

New cards

To assess the scalability

we ran each method on up- and downscaled versions of five distinct real datasets

New cards

We modeled the running time and memory usage using a

Shape Constrained Additive Model

New cards

we compared the predicted time (and memory) with the actual time (respectively memory) on all benchmarking datasets

and found that these were highly correlated overall (Spearman rank correlation >0.9, Supplementary Fig. 5), and moderately to highly correlated (Spearman rank correlation of 0.5–0.9) for almost every method

New cards

Methods with a low running time typically had two defining aspects

they had a linear time complexity with respect to the features and/or cells, and adding new cells or features led to a relatively low increase in time

New cards

We found that more than half of all methods had a quadratic or superquadratic complexity

with respect to the number of cells, which would make it difficult to apply any of these methods in a reasonable time frame on datasets with more than a thousand cells
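
A rough way to see what linear versus quadratic scaling means is to fit a slope on log-log axes. This is a deliberately simplified stand-in for the shape constrained additive model mentioned in these cards, not a reimplementation of it; the tolerance and class boundaries are assumptions chosen for illustration, and the timings are synthetic.

```python
from math import log


def loglog_slope(sizes, times):
    """Least-squares slope of log(time) against log(size).

    A slope near 1 suggests linear scaling, near 2 quadratic scaling.
    """
    xs = [log(n) for n in sizes]
    ys = [log(t) for t in times]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


def classify(slope, tol=0.25):
    """Map a slope to a coarse complexity class (boundaries are illustrative)."""
    if slope < 1 - tol:
        return "sublinear"
    if slope <= 1 + tol:
        return "linear"
    if slope < 2 - tol:
        return "subquadratic"
    if slope <= 2 + tol:
        return "quadratic"
    return "superquadratic"


n_cells = [100, 1_000, 10_000]
linear_times = [0.002 * n for n in n_cells]          # synthetic timings
quadratic_times = [1e-6 * n ** 2 for n in n_cells]   # synthetic timings

print(classify(loglog_slope(n_cells, linear_times)))     # linear
print(classify(loglog_slope(n_cells, quadratic_times)))  # quadratic
```

With quadratic scaling, a tenfold increase in cells costs roughly a hundredfold increase in time, which is why such methods become impractical as datasets grow.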

New cards

Most methods had reasonable memory requirements for modern workstations or computer clusters (≤12 GB)

with PAGA and STEMNET in particular having a low memory usage with either a high number of cells or a high number of features.

New cards

Given that the trajectories of methods that fix the topology either algorithmically or through a parameter are already very constrained

it is to be expected that such methods tend to generate very stable results.

New cards

Nonetheless, some fixed-topology methods still produced slightly more stable results

such as SCORPIUS and MATCHER for linear methods and MFA for multifurcating methods

New cards

Stability was much more diverse among

methods with a free topology

New cards

We found that most methods fulfilled the basic criteria

such as the availability of a tutorial and elemental code quality criteria

New cards

code assurance and documentation in particular were problematic areas,

notwithstanding several studies pinpointing these as good practices

New cards

Only two methods had a nearly perfect usability score

Slingshot and CellTrails

New cards

it is critical that a trajectory, and the downstream results and/or hypotheses originating from it,

are confirmed by multiple TI methods.

New cards

This is to make sure that the prediction is not biased due to the given parameter setting or the particular algorithm

underlying a TI method

New cards

The value of using different methods is further supported by our analysis indicating substantial complementarity

between the different methods

New cards

even if the expected topology is known, it can be beneficial to also try out methods that make fewer assumptions

about the trajectory topology

New cards

When the expected topology is confirmed using such a method

it provides additional evidence to the user

New cards

Critical to the broad applicability of TI methods is the standardization of the input and output interfaces of TI methods

so that users can effortlessly execute TI methods on their dataset of interest, compare different predicted trajectories and apply downstream analyses, such as finding genes important for the trajectory, network inference or finding modules of genes

New cards

Foremost, new methods should focus on improving the unbiased inference of

tree, cyclic graph and disconnected topologies, as we found that methods repeatedly overestimate or underestimate the complexity of the underlying topology, even if the trajectory could easily be identified using a dimensionality reduction method

New cards

Finally, new tools should be designed to scale well with

the increasing number of cells and features

New cards

We found that the performance of a method can be very variable between datasets,

and therefore included a large set of both real and synthetic data within our evaluation, leading to a robust overall ranking of the different methods

New cards

Some examples of the latter include PhenoPath

which can include additional covariates in its model