Vocabulary flashcards for key terms and concepts in econometrics, designed to help students prep for exams.
Law of Iterated Expectations (LIE)
The expected value of X equals the expectation of the conditional expectation of X given Y: E[X] = E[E[X \mid Y]].
Law of Total Variance (LTV)
var(Y) = E[var(Y \mid X)] + var(E[Y \mid X]).
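A minimal simulation check of the decomposition, assuming a simple made-up data-generating process (Y = 2X + e with standard normal X and e, so E[Y \mid X] = 2X and var(Y \mid X) = 1):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.standard_normal(n)
y = 2 * x + rng.standard_normal(n)

# Here var(Y) = E[var(Y|X)] + var(E[Y|X]) = 1 + var(2X) = 1 + 4 = 5.
print(np.var(y))             # left-hand side, ~5.0
print(1 + np.var(2 * x))     # right-hand side, ~5.0
```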
Linearity of Expectations (LOE)
E[aX + bY] = aE[X] + bE[Y] for any real constants a and b, whether or not X and Y are independent.
Variance of a Sum (VOS)
var(X + Y) = var(X) + var(Y) + 2Cov(X, Y); the covariance term drops out when X and Y are uncorrelated.
Jensen’s Inequality
E[g(X)] \geq g(E[X]) for a convex function g.
Chebyshev’s Inequality
For a random variable X with mean \mu and variance \sigma^2, P(|X - \mu| \geq k\sigma) \leq 1/k^2 for any k > 0.
Weak Law of Large Numbers (WLLN)
The sample mean converges in probability to the expected value as sample size increases: \bar{X}_n \xrightarrow{p} E[X].
Central Limit Theorem (CLT)
As sample size increases, the standardized sample mean converges in distribution to a normal distribution: \sqrt{n}(\bar{X}_n - \mu) \xrightarrow{d} N(0, \sigma^2).
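A small simulation sketch of the WLLN and CLT together, using an arbitrarily chosen skewed population (exponential with \mu = \sigma = 1) to emphasize that normality of the data is not required:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 1.0, 1.0, 500, 20_000

# reps independent samples of size n from a skewed population
means = rng.exponential(scale=mu, size=(reps, n)).mean(axis=1)

print(means.mean())                    # WLLN: close to mu = 1
z = np.sqrt(n) * (means - mu) / sigma  # CLT: approximately N(0, 1)
print(z.std())                         # close to 1
print((np.abs(z) < 1.96).mean())       # close to 0.95
```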
Slutsky’s Theorem
If X_n converges in distribution to X and A_n converges in probability to a constant a, then A_n X_n converges in distribution to aX and A_n + X_n converges in distribution to a + X.
Delta Method
An approximation technique for the asymptotic distribution of smooth functions of asymptotically normal estimators: if \sqrt{n}(X_n - \theta) \xrightarrow{d} N(0, \sigma^2) and g is differentiable at \theta, then \sqrt{n}(g(X_n) - g(\theta)) \xrightarrow{d} N(0, g'(\theta)^2 \sigma^2).
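A numeric illustration of the Delta Method with g(x) = e^x applied to a sample mean; the function, sample size, and seed are illustrative choices, not part of the original card:

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, reps = 0.5, 1.0, 1_000, 5_000

means = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

# Delta method prediction: var(g(X_bar)) ~ g'(mu)^2 * sigma^2 / n, g'(x) = e^x
print(np.exp(means).var())              # simulated variance of g(X_bar)
print(np.exp(mu) ** 2 * sigma**2 / n)   # asymptotic approximation
```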
Frisch-Waugh-Lovell Theorem (FWL)
The coefficient on a predictor in a multiple regression equals the coefficient from a bivariate regression of the residualized outcome on the residualized predictor, where both are first residualized on the remaining regressors.
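A minimal numpy sketch of FWL on a made-up two-regressor model: the coefficient on x1 from the full regression equals the slope from regressing residualized y on residualized x1 (both residualized on a constant and x2):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000
x1 = rng.standard_normal(n)
x2 = 0.5 * x1 + rng.standard_normal(n)          # correlated regressors
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.standard_normal(n)

X = np.column_stack([np.ones(n), x1, x2])
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]

Z = np.column_stack([np.ones(n), x2])           # the "other" regressors
resid = lambda v: v - Z @ np.linalg.lstsq(Z, v, rcond=None)[0]
beta_fwl = resid(x1) @ resid(y) / (resid(x1) @ resid(x1))

print(beta_full[1], beta_fwl)                   # equal up to rounding
```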
Positive Definite (PD) Matrix
An n × n symmetric matrix M is positive definite if x'Mx > 0 for all non-zero x.
Positive Semi-Definite (PSD) Matrix
An n × n symmetric matrix M is positive semi-definite if x'Mx \geq 0 for all x.
Big O Notation (Op)
X_n = O_p(a_n) means X_n/a_n is stochastically bounded: for every \epsilon > 0 there is a finite M such that P(|X_n/a_n| > M) < \epsilon for all sufficiently large n.
Little o Notation (op)
X_n = o_p(a_n) means X_n/a_n converges in probability to zero; in particular, o_p(1) denotes a term that vanishes in probability.
Expectation Notation
Used to denote the average value of a random variable, often written as E[X].
Asymptotic Equivalence Lemma
If X_n - Y_n \xrightarrow{p} 0 and Y_n \xrightarrow{d} Y, then X_n \xrightarrow{d} Y: two sequences of estimators whose difference vanishes in probability share the same limiting distribution.
Variance Definition
var(X) = E[X^2] - (E[X])^2. It measures the dispersion of a random variable around its mean, quantifying the degree of variation or spread in its values.
Covariance Definition
Cov(X, Y) = E[XY] - E[X]E[Y]. It measures the degree to which two random variables change together, indicating the direction of their relationship.
Regression Coefficient
A value representing the change in the dependent variable associated with a one-unit change in the independent variable, holding the other regressors fixed.
Ordinary Least Squares (OLS)
A method for estimating the parameters of a linear regression model by minimizing the sum of squared residuals: \min_{\beta} \sum_{i=1}^n (y_i - x_i'\beta)^2.
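A bare-bones sketch of the OLS minimizer via the normal equations X'X\beta = X'y; np.linalg.solve is used instead of an explicit inverse for numerical stability:

```python
import numpy as np

def ols(X, y):
    """OLS coefficients minimizing the sum of squared residuals."""
    return np.linalg.solve(X.T @ X, X.T @ y)

rng = np.random.default_rng(4)
n = 200
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
y = X @ np.array([1.0, 3.0]) + rng.standard_normal(n)
print(ols(X, y))   # roughly [1.0, 3.0]
```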
Linear Projection
The best linear predictor of a random variable Y given regressors X, with coefficient \beta = E[XX']^{-1}E[XY]; geometrically, the orthogonal projection of Y onto the space spanned by X.
Residuals
The difference between the observed values and the predicted values from a regression model.
Idempotent Matrix
A matrix P such that P^2 = P.
Algebra of Limits
Rules for combining limits: the limit of a sum, product, or quotient equals the sum, product, or quotient of the limits (provided the limits exist and denominators are non-zero); the same algebra carries over to probability limits.
Independence of Random Variables
Two random variables X and Y are independent if their joint distribution factors into the product of the marginals: P(X \le x, Y \le y) = P(X \le x)P(Y \le y) for all x and y.
Symmetric Matrix
A matrix that is equal to its transpose: A = A'.
Matrix Multiplication
For conformable matrices, the product is defined by (AB)_{ij} = \sum_k A_{ik}B_{kj}: each entry is the inner product of a row of A with a column of B.
Eigenvalues
Scalars \lambda satisfying Mv = \lambda v for some non-zero vector v; they describe the factor by which the eigenvector is stretched or compressed under the linear transformation.
Eigenvectors
Non-zero vectors v satisfying Mv = \lambda v, i.e., vectors that change only by a scalar factor under the linear transformation.
Statistical Significance
Indicates that an observed effect is unlikely to have arisen by chance under the null hypothesis.
Homoscedasticity
Assumption that the variance of the errors is constant across observations: var(e_i \mid x_i) = \sigma^2 for all i.
Multicollinearity
A situation in which two or more independent variables in a regression model are highly correlated.
Outlier
An observation that lies an abnormal distance from other values in a dataset.
Statistical Power
The probability of rejecting the null hypothesis when it is false, equal to 1 - \beta, where \beta is the Type II error rate.
Confidence Interval
A range of values that is likely to contain the population parameter with a specified level of confidence.
Hypothesis Testing
A method for testing a claim or hypothesis about a parameter in a population.
P-value
The probability of observing the test statistic or something more extreme under the null hypothesis.
Null Hypothesis
The hypothesis that there is no significant difference or effect.
Alternative Hypothesis
The hypothesis that there is a significant difference or effect.
Significance Level (α)
The probability of rejecting the null hypothesis when it is true, denoted as \alpha.
Type I Error
Rejecting the null hypothesis when it is actually true, with probability P(\text{Type I Error}) = \alpha.
Type II Error
Failing to reject the null hypothesis when it is actually false, with probability P(\text{Type II Error}) = \beta.
Statistical Model
A representation of a process that generates data; often includes assumptions about the data.
Sample Space
The set of all possible outcomes of a random variable.
Random Variable
A variable that can take on different values, each with a certain probability.
Discrete Random Variable
A random variable that can take on a countable number of values.
Continuous Random Variable
A random variable that can take any value in a continuum within a given range, described by a probability density function.
Probability Distribution
A function that describes the likelihood of obtaining the possible values that a random variable can take.
Bayes’ Theorem
A mathematical formula for determining conditional probabilities: P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B)}.
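A worked instance of the formula with made-up numbers, the classic diagnostic-test example (1% base rate, 99% sensitivity, 95% specificity):

```python
# P(disease | positive test) via Bayes' theorem
p_d = 0.01                 # prior: base rate of the disease
p_pos_d = 0.99             # sensitivity: P(positive | disease)
p_pos_h = 0.05             # false-positive rate: P(positive | healthy)

p_pos = p_pos_d * p_d + p_pos_h * (1 - p_d)   # total probability of a positive
print(p_pos_d * p_d / p_pos)                  # ~0.167: most positives are false
```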
Central Tendency
A statistical measure that identifies a single value as representative of an entire distribution.
Dispersion
The measure of how far a set of numbers is spread out from their average value.
Quantile
A cut point that divides a distribution into intervals of equal probability (e.g., quartiles divide it into four parts, percentiles into one hundred).
Sample Mean
The average of a subset of the population: \bar{X} = \frac{1}{n}\sum_{i=1}^n X_i.
Population Mean
The average of all possible observations from the entire population: \mu = E[X].
Standard Deviation
A measure of the amount of variation or dispersion of a set of values, calculated as the square root of the variance: \sigma = \sqrt{var(X)}.
Variance
The square of the standard deviation.
Confidence Level
The percentage of times that a confidence interval would include the population parameter if a study were to be repeated.
Bootstrap Method
A resampling method used to estimate the distribution of a statistic.
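A minimal nonparametric bootstrap sketch, here estimating the standard error of a sample median; the data-generating process and replication count are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(5)
data = rng.exponential(scale=2.0, size=300)

B = 5_000
idx = rng.integers(0, data.size, size=(B, data.size))  # resample with replacement
boot_medians = np.median(data[idx], axis=1)
print(boot_medians.std(ddof=1))    # bootstrap standard error of the median
```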
Maximum Likelihood Estimation (MLE)
A method of estimating parameters of a statistical model by maximizing the likelihood function, L(\theta \mid x) = P(x \mid \theta).
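A small sketch of MLE for an exponential rate parameter over a grid, where the numeric maximizer can be checked against the closed-form answer 1/\bar{x}:

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.exponential(scale=2.0, size=1_000)      # true rate lambda = 0.5

lam = np.linspace(0.01, 2.0, 10_000)
# exponential log-likelihood: n*log(lambda) - lambda * sum(x)
loglik = x.size * np.log(lam) - lam * x.sum()
print(lam[np.argmax(loglik)])   # numeric MLE
print(1 / x.mean())             # closed-form MLE; the two should agree
```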
Bayesian Statistics
A statistical paradigm that uses Bayes' theorem to update the probability of a hypothesis as more evidence becomes available.
Normal Distribution
A symmetric probability distribution characterized by its bell-shaped curve, with probability density function: f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}.
Logistic Regression
A statistical model used to predict binary outcomes by modeling P(Y = 1 \mid x) = \frac{1}{1 + e^{-x'\beta}}.
Multivariate Regression
A regression analysis involving two or more independent variables (also called multiple regression; strictly, "multivariate" refers to models with multiple dependent variables).
Panel Data
Data that involves observations over multiple time periods for the same subjects.
Cross-Sectional Data
Data collected at a single point in time across multiple subjects.
Time Series Data
Data collected sequentially over time on the same subject.
Causal Inference
The process of drawing a conclusion about a causal connection based on the conditions of the occurrence of an effect.
Treatment Effect
The impact of a treatment on an outcome in a statistical study.
Confounding Variable
A variable that influences both the independent variable and the dependent variable.
Instrumental Variable
A variable used in regression analysis that is correlated with the endogenous explanatory variable but uncorrelated with the error term.
Endogeneity
A situation in which an explanatory variable is correlated with the error term.
Exogeneity
A situation in which an explanatory variable is uncorrelated with the error term.
Statistical Inference
The process of using data analysis to deduce properties of an underlying probability distribution.
F-Test
A statistical test used to determine if there are significant differences between the means of two or more groups, often using the F-statistic: F = \frac{\text{variance between groups}}{\text{variance within groups}}.
T-Test
A statistical test used to compare the means of two groups, often using the t-statistic: t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}.
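A hand-rolled pooled two-sample t-statistic matching the formula above; in practice one would call a library routine such as scipy.stats.ttest_ind:

```python
import numpy as np

rng = np.random.default_rng(7)
a = rng.normal(0.0, 1.0, size=40)
b = rng.normal(0.5, 1.0, size=50)

n1, n2 = a.size, b.size
sp = np.sqrt(((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1))
             / (n1 + n2 - 2))                     # pooled standard deviation
t = (a.mean() - b.mean()) / (sp * np.sqrt(1 / n1 + 1 / n2))
print(t)   # refer to a t distribution with n1 + n2 - 2 degrees of freedom
```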
Chi-Squared Test
A statistical test used to determine if there is a significant association between two categorical variables: \chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}.
ANOVA (Analysis of Variance)
A statistical method used to compare the means of three or more groups.
Regression Discontinuity Design
A quasi-experimental design in which treatment is assigned based on whether a running variable crosses a cutoff; comparing units just above and below the cutoff emulates random assignment.
Difference in Differences
A statistical technique used to estimate causal relationships by comparing the differences in outcomes over time between a treatment group and a control group.
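A 2×2 difference-in-differences sketch on simulated group means, with a built-in treatment effect of 2.0 and a common pre-to-post trend of 1.0 shared by both groups:

```python
import numpy as np

rng = np.random.default_rng(8)
n, effect, trend = 5_000, 2.0, 1.0

treat_pre = rng.normal(3.0, 1.0, n)
treat_post = rng.normal(3.0 + trend + effect, 1.0, n)
ctrl_pre = rng.normal(5.0, 1.0, n)
ctrl_post = rng.normal(5.0 + trend, 1.0, n)

did = ((treat_post.mean() - treat_pre.mean())
       - (ctrl_post.mean() - ctrl_pre.mean()))
print(did)   # ~2.0: the common trend differences out
```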
Propensity Score Matching
A statistical technique used to control for confounding by matching treated and untreated subjects with similar characteristics.
Hazard Function
A function that describes the instantaneous risk of an event occurring at a given time, defined as h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}.
Survival Analysis
A branch of statistics for analyzing the expected duration until one or more events happen.
Risks and Uncertainty
Assessment of uncertainty that affects potential outcomes of a decision.
Random Sampling
A technique used to select individuals from a population in such a way that every individual has an equal chance of being selected.
Sampling Bias
A bias that occurs when the sample is not representative of the population from which it was drawn.
How does the Central Limit Theorem (CLT) relate to statistical inference?
The CLT provides the theoretical basis for constructing confidence intervals and performing hypothesis tests for population parameters, particularly means, as it ensures that the sample mean's distribution approximates a normal distribution for large sample sizes.
What is the relationship between Endogeneity and Instrumental Variables?
Instrumental Variables are a method specifically designed to address Endogeneity in regression models by using an instrument that is correlated with the endogenous explanatory variable but uncorrelated with the error term, allowing for consistent estimation of causal effects.
How does the Frisch-Waugh-Lovell (FWL) Theorem relate to Ordinary Least Squares (OLS) regression?
The FWL theorem shows that OLS coefficients in a multiple regression can be understood as the result of a two-step process: first, residualizing the dependent and independent variables with respect to other regressors, and then regressing these residuals. This highlights the partial effect of each predictor in OLS.
What is the foundational concept behind the Law of Iterated Expectations (LIE)?
The LIE is fundamentally based on conditional expectation, articulating that the expected value of a random variable can be obtained by taking the expectation of its conditional expectation given another random variable: E[X] = E[E[X \mid Y]].
How is Bayes' Theorem integral to Bayesian Statistics?
Bayes' Theorem serves as the core principle of Bayesian Statistics, providing the mathematical framework for updating the probability of a hypothesis (prior distribution) based on new observed data (likelihood) to obtain a revised probability (posterior distribution).
Projection Matrix (P)
A symmetric and idempotent matrix that transforms a vector into its orthogonal projection onto a subspace. For a matrix X with linearly independent columns, the projection matrix onto the column space of X is given by P = X(X'X)^{-1}X'.
Orthogonality Principle (in OLS)
In Ordinary Least Squares (OLS) regression, the residuals are orthogonal to the columns of the design matrix X of the predictors: X'e = 0 in the sample, and E[x_i e_i] = 0 in the population.
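A numpy sketch verifying this card and the projection-matrix card above on random data: P = X(X'X)^{-1}X' is symmetric and idempotent, and the residuals satisfy X'e = 0:

```python
import numpy as np

rng = np.random.default_rng(9)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
y = rng.standard_normal(n)

P = X @ np.linalg.solve(X.T @ X, X.T)   # projection (hat) matrix
print(np.allclose(P, P.T))              # symmetric
print(np.allclose(P @ P, P))            # idempotent: P^2 = P

e = y - P @ y                           # OLS residuals
print(np.allclose(X.T @ e, 0))          # orthogonality: X'e = 0
```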