Vocabulary flashcards summarizing essential terms and definitions from the lecture notes on statistical learning, causal models, and illustrative examples.
Statistical learning
The field that infers properties of an unknown probability distribution from observed data, typically for prediction.
Causal inference
The study of identifying and quantifying cause-and-effect relationships, often involving multiple distributions produced by interventions.
Probability space
A mathematical model (Ω, ℱ, P) consisting of outcomes, events, and a probability measure for a random experiment.
Independent and identically distributed (i.i.d.)
An assumption that each sample is drawn independently from the same joint distribution.
Regression (conditional expectation)
The function f(x)=E[Y|X=x] giving the expected output value for a given input.
Binary classifier
A function that assigns each input x to the more likely class y∈{−1,+1} under P(Y|X=x).
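A minimal sketch in Python (NumPy assumed) of a Bayes classifier under a hypothetical posterior P(Y=+1|X=x); the sigmoid form is illustrative, not from the notes.

```python
import numpy as np

# Hypothetical posterior: P(Y = +1 | X = x) = sigmoid(3x).
def p_plus_given_x(x):
    return 1.0 / (1.0 + np.exp(-3.0 * x))

# The Bayes classifier assigns the more likely label in {-1, +1}.
def bayes_classifier(x):
    return np.where(p_plus_given_x(x) >= 0.5, 1, -1)

print(bayes_classifier(np.array([-1.0, 0.2, 2.0])))  # -> [-1  1  1]
```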
Joint distribution (P_{X,Y})
The probability law governing the simultaneous behavior of random variables X and Y.
Empirical distribution (P_n)
A discrete distribution that puts equal mass 1/n on each observed data point in a sample.
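A minimal sketch, assuming NumPy: since P_n puts mass 1/n on each observed point, drawing from it is uniform resampling with replacement (as in the bootstrap), and its CDF is the empirical CDF.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=10)  # an observed sample x_1, ..., x_n

# Drawing from P_n = uniform resampling of the observed points.
resample = rng.choice(data, size=5, replace=True)

# The empirical CDF at t is the fraction of observed points <= t.
def ecdf(t):
    return np.mean(data <= t)

print(resample)
print(ecdf(0.0))
```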
Inverse problem (statistics)
Estimating properties of an unobserved distribution from data generated by that distribution.
Function class / Hypothesis space
The set of candidate functions from which a learning algorithm selects its predictor.
Capacity (of a function class)
A measure of how rich or complex a hypothesis space is, controlling overfitting potential.
Vapnik–Chervonenkis (VC) dimension
A combinatorial capacity measure: the size of the largest set of points that the function class can shatter, i.e., realize every possible labeling of.
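A brute-force illustration (illustrative Python, not from the notes): threshold classifiers on the line shatter any single point but no pair, so their VC dimension is 1.

```python
import numpy as np

def shatters(classifiers, points):
    """True if the classifiers realize every +/-1 labeling of the points."""
    realized = {tuple(f(x) for x in points) for f in classifiers}
    return len(realized) == 2 ** len(points)

# Threshold classifiers on the line: f_t(x) = +1 if x > t else -1.
thresholds = np.linspace(-3.0, 3.0, 601)
classifiers = [lambda x, t=t: 1 if x > t else -1 for t in thresholds]

print(shatters(classifiers, [0.0]))       # True: a single point is shattered
print(shatters(classifiers, [0.0, 1.0]))  # False: labeling (+1, -1) is unrealizable
# Hence the VC dimension of the threshold class is 1.
```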
Expected risk (true risk)
The population loss R[f]=∫(1/2)|f(x)−y| dP_{X,Y}(x,y) measuring generalization error.
Empirical risk
The average loss on the training sample: R_emp^n[f] = (1/n) ∑_{i=1}^{n} (1/2)|f(x_i) − y_i|.
Empirical Risk Minimization (ERM)
The principle of choosing the hypothesis that minimizes empirical risk over the training data.
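A minimal ERM sketch in Python over the illustrative class of one-dimensional threshold classifiers; the helper names and toy data are assumptions, not from the notes.

```python
import numpy as np

def empirical_risk(f, x, y):
    # R_emp[f] = (1/n) * sum_i (1/2)|f(x_i) - y_i|  (0-1 loss for y in {-1,+1})
    return np.mean(0.5 * np.abs(f(x) - y))

def erm_threshold(x, y):
    """ERM over the class f_t(x) = +1 if x > t else -1:
    return the candidate with the smallest empirical risk."""
    candidates = np.concatenate(([x.min() - 1.0], np.sort(x)))
    fs = [lambda z, t=t: np.where(z > t, 1, -1) for t in candidates]
    risks = [empirical_risk(f, x, y) for f in fs]
    return fs[int(np.argmin(risks))], min(risks)

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 50)
y = np.where(x > 0.2, 1, -1)
f_hat, r_hat = erm_threshold(x, y)
print(r_hat)  # 0.0: this sample is separable by a threshold
```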
Consistency (of a learner)
The property that the risk of the learned function converges to the minimal achievable risk as n→∞.
Universal consistency
A guarantee that, for every fixed underlying distribution, the algorithm approaches Bayes-optimal risk with enough data.
Slow learning rates
The phenomenon that convergence to the optimal risk can be arbitrarily slow for some distributions, even for consistent algorithms; without further assumptions, no uniform rate is guaranteed.
Regularization
A technique that restricts or penalizes complex hypotheses to control capacity and improve generalization.
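One standard instance of regularization, sketched under the assumption of a least-squares setting: ridge regression penalizes ||w||² to limit capacity. Function names and data are illustrative, not from the notes.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Regularized least squares: minimize ||Xw - y||^2 + lam * ||w||^2.
    The penalty shrinks w toward zero, limiting effective capacity."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(2)
X = rng.normal(size=(30, 5))
w_true = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
y = X @ w_true + 0.1 * rng.normal(size=30)

print(ridge_fit(X, y, lam=0.0)[:2])   # unregularized least squares
print(ridge_fit(X, y, lam=10.0)[:2])  # shrunk coefficients
```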
Bayesian prior
A probability distribution placed over hypotheses or parameters expressing a priori beliefs before seeing data.
Observational distribution
The joint distribution of variables obtained without intervening in the system.
Intervention
An external action that forces a variable to take specific values, potentially altering the joint distribution.
Structural Causal Model (SCM)
A collection of assignments X := f(PA_X, N_X), one per variable, defining each variable as a function of its parents PA_X in the causal graph and an independent noise term N_X.
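A minimal two-variable SCM sketch in Python, also showing how an intervention do(X := 1) replaces the assignment for X and thereby changes the distribution of Y; the concrete mechanisms are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Two-variable SCM:  X := N_X,   Y := 2*X + N_Y,  with independent noises.
def sample(do_x=None):
    n_x = rng.normal(size=n)
    n_y = rng.normal(size=n)
    # An intervention do(X := x) replaces the assignment for X.
    x = n_x if do_x is None else np.full(n, do_x)
    y = 2 * x + n_y
    return x, y

x_obs, y_obs = sample()            # observational distribution
x_int, y_int = sample(do_x=1.0)    # interventional distribution under do(X := 1)
print(y_obs.mean(), y_int.mean())  # E[Y] ~ 0  vs  E[Y | do(X := 1)] ~ 2
```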
Causal reasoning
Deriving implications (e.g., effects of interventions) from a known causal model.
Causal learning / Structure learning
Inferring aspects of the underlying causal graph or mechanisms from data (observational or interventional).
Reichenbach's common cause principle
If X and Y are statistically dependent, then either X causes Y, Y causes X, or there exists a variable Z that causally influences both; in the common-cause case, conditioning on Z renders X and Y independent.
Confounder
A variable that causally affects two or more variables, creating spurious associations between them.
Screening-off
The property that conditioning on a confounder Z makes its effects (e.g., X and Y) statistically independent: X ⫫ Y | Z.
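A quick simulation sketch (illustrative, NumPy assumed): a binary confounder Z induces marginal correlation between X and Y, which vanishes within each stratum of Z.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000

# Confounder Z causes both X and Y.
z = rng.integers(0, 2, size=n)            # binary common cause
x = z + 0.5 * rng.normal(size=n)
y = z + 0.5 * rng.normal(size=n)

print(np.corrcoef(x, y)[0, 1])            # ~0.5: marginal dependence
for v in (0, 1):
    m = z == v
    print(np.corrcoef(x[m], y[m])[0, 1])  # ~0 within each stratum: X indep Y | Z
```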
Correlation ≠ Causation
The principle that statistical dependence alone does not determine causal direction or presence.
Mechanism (in SCM)
The deterministic function linking a variable to its direct causes and noise term in an SCM.
Additive Noise Model (ANM)
A causal model where a child variable equals a function of its parent plus independent additive noise.
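A rough ANM sketch in Python: fit a regression in both directions and compare a crude dependence proxy between residuals and input. Real ANM methods use proper independence tests (e.g., HSIC); all names, the cubic mechanism, and the proxy here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5000

# Additive noise model: cause X, effect Y := f(X) + N_Y with N_Y independent of X.
x = rng.uniform(-2, 2, size=n)
y = x ** 3 + rng.normal(scale=0.5, size=n)

def residuals(a, b, deg=5):
    # Polynomial regression of b on a; return the residuals.
    coef = np.polyfit(a, b, deg)
    return b - np.polyval(coef, a)

r_fwd = residuals(x, y)   # forward (causal) direction
r_bwd = residuals(y, x)   # backward (anticausal) direction

# Crude proxy for an independence test: correlation of squared residual with |input|.
print(np.corrcoef(r_fwd ** 2, np.abs(x))[0, 1])  # ~0: forward fit looks additive
print(np.corrcoef(r_bwd ** 2, np.abs(y))[0, 1])  # away from 0: backward model misfits
```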
Optical character recognition example
Illustration that an identical P_{X,Y} over images and labels can arise from different causal structures, yielding different effects under intervention.
Gene perturbation example
Scenario showing that deleting a gene (intervention) affects phenotype only if a causal, not merely correlated, relationship exists.