Collection of vocabulary flashcards covering key concepts, methods, statistics, and software tools mentioned in the lecture segment on methods for identifying causal direction and structure.
Hilbert-Schmidt Independence Criterion (HSIC)
A kernel-based statistic that measures statistical dependence between two variables and is often used to test independence in causal discovery.
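As a concrete illustration, here is a minimal numpy sketch of the biased empirical HSIC statistic with Gaussian kernels. The function names and the fixed bandwidth are illustrative choices, not part of the lecture; practical tests add a null distribution (e.g., via permutations).

```python
import numpy as np

def rbf_gram(v, sigma=1.0):
    """Gaussian-kernel Gram matrix of a one-dimensional sample."""
    d2 = (v[:, None] - v[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: (1/n^2) * tr(K H L H), H the centring matrix."""
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(rbf_gram(x, sigma) @ H @ rbf_gram(y, sigma) @ H) / n ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=200)
indep = hsic(x, rng.normal(size=200))  # independent pair: HSIC near 0
dep = hsic(x, x ** 2)                  # strongly dependent pair: larger value
```

The statistic is the squared Hilbert-Schmidt norm of an empirical cross-covariance operator, so it is nonnegative and grows with dependence.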
Additive Noise Model (ANM)
A causal model of the form Y = f(X)+N where the noise N is statistically independent of the cause X.
Nonlinear Regression Model
A regression framework where the conditional mean E[Y|X] is described by an arbitrary nonlinear function f(X).
Post-Nonlinear Model
A causal model Y = g(f(X)+N) in which a nonlinear distortion g is applied after an additive noise component.
Maximum-Likelihood-Based Approach (for causal direction)
Method that distinguishes X→Y from Y→X by comparing the likelihoods obtained after fitting regression models in both directions.
Residuals
The differences between observed responses and the fitted regression values; used here to compute likelihood scores and independence tests.
Likelihood Score L_{X→Y}
The score L_{X→Y} = −log var(X) − log var(R_Y), computed from the sample variance of X and of the residuals R_Y of regressing Y on X; under Gaussian noise assumptions the direction with the larger score is preferred.
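A small simulation sketch of this score comparison, assuming a cubic polynomial fit as a stand-in for a general nonlinear regression (the helper name `score` is mine; note a purely linear Gaussian model would be non-identifiable, so the example uses a nonlinear mechanism):

```python
import numpy as np

def score(cause, effect, deg=3):
    """L = -log var(cause) - log var(residual), with E[effect|cause]
    fitted by a cubic polynomial (simple nonlinear regression stand-in)."""
    resid = effect - np.polyval(np.polyfit(cause, effect, deg), cause)
    return -np.log(np.var(cause)) - np.log(np.var(resid))

rng = np.random.default_rng(1)
x = rng.normal(size=1000)
y = x ** 3 + 0.5 * rng.normal(size=1000)  # ground truth: X -> Y, Gaussian noise

L_xy = score(x, y)  # fit in the causal direction
L_yx = score(y, x)  # fit in the anticausal direction
# in this simulated example L_xy exceeds L_yx
```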
Differential Entropy
A continuous analogue of Shannon entropy, used to replace log-variance in likelihood scores when noise is non-Gaussian.
dHSIC (R package)
Software providing the function dhsic.test, an implementation of HSIC-based independence testing.
mgcv (R package)
R package implementing generalized additive models (GAM); used for nonlinear regressions in causal discovery code examples.
gam() Function
Function from mgcv that fits generalized additive models; used to model nonlinear relationships when computing residuals.
Independence Test
Statistical test (e.g., HSIC) used to decide whether residuals are independent of predictors, a key step in ANM orientation.
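The ANM orientation step described above can be sketched end to end: regress in each direction, then score the dependence between predictor and residuals. This is a simplified stand-in, assuming a cubic polynomial fit instead of a GAM and a biased HSIC value (with a median-heuristic bandwidth) instead of a full test such as dhsic.test.

```python
import numpy as np

def _gram(v):
    """RBF Gram matrix with a median-heuristic bandwidth."""
    d2 = (v[:, None] - v[None, :]) ** 2
    return np.exp(-d2 / (2.0 * np.median(d2[d2 > 0])))

def residual_dependence(cause, effect, deg=3):
    """Fit a cubic polynomial, then score dependence between the
    candidate cause and the residuals with biased empirical HSIC."""
    resid = effect - np.polyval(np.polyfit(cause, effect, deg), cause)
    n = len(cause)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(_gram(cause) @ H @ _gram(resid) @ H) / n ** 2

rng = np.random.default_rng(2)
x = rng.normal(size=300)
y = x ** 3 + 0.3 * rng.normal(size=300)  # ground truth: X -> Y

d_xy = residual_dependence(x, y)  # residuals ~ independent of x: small
d_yx = residual_dependence(y, x)  # reverse fit leaves dependence: larger
```

The direction with the smaller residual dependence is the one accepted by the ANM criterion.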
Information-Geometric Causal Inference (IGCI)
Method that infers causal direction between two variables by exploiting geometric properties of their distributions without noise modelling.
Estimator Ĉ_{X→Y}
Empirical quantity in IGCI computed from ordered data to compare complexity measures in both directions.
Slope-Based Approach (IGCI)
IGCI variant that averages the logarithms of local slopes between the ordered observations to decide causal direction, motivated by the principle of independent mechanisms.
Entropy-Based Approach (IGCI)
IGCI variant that compares differential entropies H(X) and H(Y); the variable with larger entropy is inferred as the cause.
Differential Shannon Entropy
Integral H(X)=−∫p(x)log p(x)dx measuring uncertainty of continuous variables, used in IGCI.
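Both IGCI variants above can be sketched on a deterministic example. This is a simplified illustration under my own assumptions: the variables are already scaled to [0, 1] (uniform reference measure), the relation is deterministic and invertible, and the entropy estimate is a crude 1-spacing estimator whose constant bias cancels when two estimates are compared.

```python
import numpy as np

def igci_slope(x, y):
    """Slope-based IGCI score C_{X->Y}: mean log-slope of y versus x
    on x-sorted data."""
    order = np.argsort(x)
    dx = np.diff(x[order])
    dy = np.diff(y[order])
    keep = (dx != 0) & (dy != 0)
    return np.mean(np.log(np.abs(dy[keep] / dx[keep])))

def spacing_entropy(v):
    """Crude 1-spacing differential-entropy estimate (up to a constant)."""
    gaps = np.diff(np.sort(v))
    gaps = gaps[gaps > 0]
    return np.mean(np.log(gaps * len(v)))

rng = np.random.default_rng(3)
x = rng.uniform(size=500)  # cause, already supported on [0, 1]
y = x ** 3                 # deterministic nonlinear mechanism, X -> Y

C_xy = igci_slope(x, y)    # slope rule: infer X -> Y if C_xy < C_yx
C_yx = igci_slope(y, x)
H_x = spacing_entropy(x)   # entropy rule: the larger entropy marks the cause
H_y = spacing_entropy(y)
```

For an invertible map the two slope scores are (up to discretization) negatives of each other, and C_{X→Y} = H(Y) − H(X), which links the two variants.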
Trace Method
Causal orientation technique for high-dimensional linear relations that uses traces of covariance and structure matrices.
Tracial Dependency Ratio r_{X→Y}
Statistic r_{X→Y} = τ(A S_XX Aᵀ) / [τ(A Aᵀ) · τ(S_XX)], where τ is the normalized trace tr(·)/d and A the structure matrix; closeness to 1 indicates causal plausibility in the trace method.
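A numpy sketch of this ratio, using my own toy construction: a structure matrix A drawn independently of the covariance S_XX (ratio near 1), versus a covariance deliberately aligned with A (ratio deviating from 1).

```python
import numpy as np

def tau(M):
    """Normalized trace tr(M)/d."""
    return np.trace(M) / M.shape[0]

def trace_ratio(A, S):
    """Tracial dependency ratio r_{X->Y} = tau(A S A^T) / (tau(A A^T) tau(S))."""
    return tau(A @ S @ A.T) / (tau(A @ A.T) * tau(S))

rng = np.random.default_rng(4)
d = 100
A = rng.normal(size=(d, d))          # structure matrix, independent of S_XX
B = rng.normal(size=(d, d))
S = B @ B.T / d                      # generic covariance matrix S_XX

r_indep = trace_ratio(A, S)          # A and S unrelated: ratio close to 1
r_dep = trace_ratio(A, A.T @ A / d)  # covariance aligned with A: ratio deviates
```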
Free Probability Theory
Mathematical framework for describing asymptotics of large random matrices; used to extend the trace method to sample-poor, high-dimensional regimes.
Random Orthogonal Map
Rotation used in simulations to enforce independence between structure matrix A and covariance S_XX when assessing significance in the trace method.
Supervised Causal Learning
Approach that treats causal direction identification as a classification task trained on labeled cause-effect data sets.
Cause-Effect Pairs Database
Public repository of real-world variable pairs with known causal directions, used for benchmarking causal discovery methods.
Hand-Crafted Features (for causal classifiers)
Manually designed statistics (e.g., entropy of residuals) extracted from data sets to train direction classifiers.
Residual Entropy Feature
Entropy of regression residuals used as an input feature in supervised causal learning; relates to ANM scores.
Reproducing Kernel Hilbert Space (RKHS)
Functional space associated with a kernel; empirical distributions mapped here enable kernel classifiers for causal orientation.
Kernel Mean Embedding
Representation of a distribution as the RKHS mean; foundation for mapping whole data sets into feature space in causal classification.
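One way to see kernel mean embeddings at work is the (biased) squared MMD, the RKHS distance between the empirical mean embeddings of two samples; classifiers operating on embedded data sets rely on exactly this geometry. A minimal sketch with a Gaussian kernel and an illustrative fixed bandwidth:

```python
import numpy as np

def mmd2(a, b, sigma=1.0):
    """Squared distance between the empirical kernel mean embeddings
    of two samples (biased MMD^2) under a Gaussian kernel."""
    def kmean(u, v):
        return np.mean(np.exp(-(u[:, None] - v[None, :]) ** 2
                              / (2.0 * sigma ** 2)))
    return kmean(a, a) + kmean(b, b) - 2.0 * kmean(a, b)

rng = np.random.default_rng(5)
p = rng.normal(size=400)
q_same = rng.normal(size=400)            # same distribution as p
q_shift = rng.normal(loc=2.0, size=400)  # shifted distribution

close = mmd2(p, q_same)   # embeddings nearly coincide
far = mmd2(p, q_shift)    # embeddings well separated
```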
Time-Series Reversal Classification
Technique similar to causal direction classification, used to decide whether a time series has been reversed in time.
Identifiability
Property that the true causal direction can be recovered uniquely (up to errors) from the joint distribution under model assumptions.
Additive Gaussian Error Terms
Assumption that noise variables in an SCM follow independent Gaussian distributions; simplifies likelihood comparisons.