Flashcards covering vocabulary, formulas, and concepts from the lecture notes on empirical research, regression analysis, and causal inference.
Descriptive Questions
Questions that ask "What is?" or "How much?" with the goal of summarizing and describing data, such as the annual earnings of MBA graduates.
Selection Bias
A source of endogeneity where some other factor (X) causes both the treatment (D) and the outcome (Y), making it difficult to isolate the causal effect.
Y1i and Y0i
Represent the potential outcomes for individual i if they are treated (1) or not treated (0) respectively, within the Potential Outcomes Framework.
Average Treatment Effect (ATE)
The formal causal measure expressed as E[Y1i−Y0i].
Average Treatment Effect on the Treated (ATT)
The causal effect on those individuals who actually received the treatment, expressed as E[Y1i−Y0i∣Di=1].
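To make the two expectations concrete, here is a minimal sketch using made-up potential outcomes for five individuals. The numbers are purely illustrative, and in real data only one of Y1i or Y0i is ever observed per individual, so this calculation is a thought experiment rather than an estimator.

```python
# Hypothetical potential outcomes (illustrative numbers only).
# Each tuple is (Y1i, Y0i, Di); real data reveal only one of Y1i / Y0i.
units = [
    (10, 6, 1),
    (8, 5, 1),
    (7, 7, 0),
    (9, 4, 0),
    (6, 3, 1),
]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# ATE: average of Y1i - Y0i over everyone.
ate = mean(y1 - y0 for y1, y0, d in units)

# ATT: the same average, restricted to treated individuals (Di = 1).
att = mean(y1 - y0 for y1, y0, d in units if d == 1)

# Naive comparison of observed group means: treated Y1i versus untreated Y0i.
# Without random assignment this need not equal either the ATE or the ATT.
naive = (mean(y1 for y1, y0, d in units if d == 1)
         - mean(y0 for y1, y0, d in units if d == 0))

print(ate, att, naive)
```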
Random Assignment
A technique used in Randomized Controlled Trials (RCTs) that makes the treatment (D) independent of all characteristics, eliminating selection bias.
Correspondence Test
A research design used by Bertrand and Mullainathan (2004) in which fictitious resumes were sent in response to real job ads to test for labor market discrimination.
Linear Probability Model (LPM)
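An OLS regression where the dependent variable is a dummy variable; each coefficient is interpreted as the change in the probability that the outcome equals 1 for a one-unit change in the regressor (multiply by 100 to express it in percentage points).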
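A small sketch of why the coefficient reads as a probability difference: with a single dummy regressor, the OLS slope equals the gap in outcome shares between the two groups. The data below are hypothetical (not taken from any study), and the `ols_slope` helper is an illustrative name.

```python
def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

# Hypothetical data: d = 1 for the treated group, y = 1 if the outcome occurred.
d = [1] * 10 + [0] * 10
y = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0] + [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

lpm_coefficient = ols_slope(d, y)

# Share of y = 1 in each group.
p_treated = sum(b for a, b in zip(d, y) if a == 1) / d.count(1)
p_control = sum(b for a, b in zip(d, y) if a == 0) / d.count(0)

# The LPM coefficient matches the difference in shares (a probability gap).
print(lpm_coefficient, p_treated - p_control)
```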
Balancing Test
A check for randomization validity showing there are no statistically significant differences in observable characteristics (Xi) between treatment and control groups.
Omitted Variable Bias (OVB) Formula
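The mathematical relationship β1_OLS − β1 = γ × π1, which expresses the bias as the product of the omitted variable's effect on the outcome (γ) and its association with the included regressor (π1, the slope from regressing the omitted variable on that regressor).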
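The formula can be checked numerically. The sketch below assumes the usual setup, which the card does not spell out: the true model is Y = β0 + β1·X + γ·W, W is the omitted variable, and π1 is the slope from regressing W on X. All numbers are made up, and the outcome is generated without noise so the identity holds exactly.

```python
def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

# Hypothetical data: x is the included regressor, w the omitted variable.
x = [1, 2, 3, 4, 5]
w = [2, 1, 4, 3, 6]

# Assumed true coefficients (illustrative values).
beta1, gamma = 2.0, 3.0
y = [1 + beta1 * a + gamma * b for a, b in zip(x, w)]

beta1_ols = ols_slope(x, y)   # "short" regression that omits w
pi1 = ols_slope(x, w)         # regression of the omitted variable on x

bias = beta1_ols - beta1
print(bias, gamma * pi1)      # the two sides of the OVB formula
```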
Conditional Independence Assumption (CIA)
States that, after controlling for a set of observable characteristics (X), the treatment assignment (D) is independent of the potential outcomes: (Y0i,Y1i)⊥Di∣Xi.
Bad Control
A variable that is itself a potential outcome of the treatment (post-treatment characteristic), which can bias results if included in the regression.
Winsorizing
A remedy for outliers that replaces values below the 1st percentile and above the 99th percentile with the value of those specific percentiles.
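As a rough illustration, the following sketch winsorizes a list at the 1st and 99th percentiles using only the Python standard library. The `winsorize` helper is a hypothetical name, and percentile conventions differ across software packages, so cutoffs from other tools may differ slightly.

```python
from statistics import quantiles

def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip values to the given lower and upper percentiles."""
    # quantiles(..., n=100) returns the 99 cut points for percentiles 1..99.
    cuts = quantiles(values, n=100, method="inclusive")
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, lo), hi) for v in values]

# Hypothetical sample of 100 values with one extreme value in each tail.
data = [-500] + list(range(2, 100)) + [1000]
clipped = winsorize(data)

# The two outliers are pulled in to the percentile cutoffs;
# every other value is left unchanged.
print(clipped[0], clipped[-1])
```

Unlike trimming, this keeps every observation in the sample and only changes the extreme values.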
Panel Data
Data consisting of multiple entities (n) observed at two or more time periods (T), such as company-year observations.
Construct
An abstract idea that is not directly observable (e.g., "intelligence" or "audit quality") and must be operationalized into measurable variables.
Exogeneity Assumption
The OLS requirement that independent variables (Xit) are uncorrelated with unobservable errors (ϵit), denoted as E[ϵit∣Xit]=0.
Moderating Variable
A factor that weakens or strengthens the relation between a dependent and independent variable, making the effect of X on Y conditional on Z.
Libby Boxes
A predictive validity framework used to map the causal relation between conceptual constructs and their operational proxies through five distinct links.
Cluster-Robust Standard Errors
Standard errors used in panel data to account for errors that are correlated across observations within the same group, such as firm-level or industry-level clusters.
Cumulative Abnormal Return (CAR)
The sum of abnormal returns over an event window in an event study; example: CAR[−1,+1] = AR_−1 + AR_0 + AR_+1.
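The summation is simple enough to sketch directly. The abnormal returns below are hypothetical values keyed by event day (day 0 is the event date), and `car` is an illustrative helper name; how the abnormal returns themselves are estimated is outside this sketch.

```python
# Hypothetical abnormal returns by event day (0 = event date).
abnormal_returns = {-2: 0.001, -1: 0.004, 0: 0.021, 1: -0.003, 2: 0.002}

def car(ar_by_day, start, end):
    """Sum abnormal returns over the event window [start, end], inclusive."""
    return sum(ar_by_day[day] for day in range(start, end + 1))

# CAR[-1,+1] = AR_-1 + AR_0 + AR_+1
car_window = car(abnormal_returns, -1, 1)
print(car_window)
```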
Endogeneity
Occurs when an explanatory variable (X) is correlated with the error term (ϵ), leading to biased and inconsistent OLS estimates.
Internal Validity
The extent to which a study accurately captures a causal effect of X on Y while eliminating alternative hypotheses.
Difference-in-Differences (DiD)
A design comparing the change in Y for a treatment group against the change in Y for a control group, before and after a treatment.
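The comparison of changes can be sketched with four cell means. The observations below are hypothetical, and the group and period labels are illustrative names.

```python
# Hypothetical observations: (group, period, outcome Y).
observations = [
    ("treated", "pre", 10.0), ("treated", "pre", 12.0),
    ("treated", "post", 16.0), ("treated", "post", 18.0),
    ("control", "pre", 8.0), ("control", "pre", 10.0),
    ("control", "post", 11.0), ("control", "post", 13.0),
]

def cell_mean(group, period):
    """Average outcome for one group in one period."""
    ys = [y for g, p, y in observations if g == group and p == period]
    return sum(ys) / len(ys)

# Change over time within each group.
change_treated = cell_mean("treated", "post") - cell_mean("treated", "pre")
change_control = cell_mean("control", "post") - cell_mean("control", "pre")

# DiD estimate: the treated group's change minus the control group's change.
did = change_treated - change_control
print(change_treated, change_control, did)
```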
Instrumental Variables (IV)
A method to address endogeneity using an instrument (Z) that is relevant (correlated with X) and exogenous (uncorrelated with the error term).
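For a single instrument and a single endogenous regressor, the IV estimate can be written as cov(Z, Y) / cov(Z, X). The sketch below uses made-up data in which the error u is correlated with X but, by construction, uncorrelated with Z in the sample; the coefficient values and variable names are assumptions for illustration only.

```python
def cov_sum(a, b):
    """Sum of cross-products of deviations from the mean (unscaled covariance)."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b))

# Hypothetical data. u is unobserved and drives both x and y (endogeneity).
z = [0, 0, 1, 1]                      # instrument
u = [1, -1, 1, -1]                    # error term, uncorrelated with z here
x = [1 + 2 * zi + ui for zi, ui in zip(z, u)]        # relevance: z shifts x
true_beta = 1.5
y = [3 + true_beta * xi + 2 * ui for xi, ui in zip(x, u)]

# OLS slope is biased because x is correlated with u.
beta_ols = cov_sum(x, y) / cov_sum(x, x)

# IV estimate uses only the variation in x that is induced by z.
beta_iv = cov_sum(z, y) / cov_sum(z, x)

print(beta_ols, beta_iv)
```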
Exclusion Restriction
The non-testable assumption in IV models that the instrument (Z) has no direct effect on the outcome (Y) except through the endogenous variable (X).
Parallel Trends Assumption
The key identifying assumption for DiD stating that, without treatment, the treated and control groups would have followed the same trend over time.
Fixed Effects (FE)
Controls in panel regressions for all time-invariant entity-specific characteristics (firm fixed effects) or common time-specific shocks (time fixed effects).
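One common way to implement firm fixed effects is the within transformation: subtract each firm's own mean from its observations before running OLS. The sketch below uses a made-up two-firm panel with no noise, where each firm has a different time-invariant intercept; names and numbers are illustrative assumptions.

```python
def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def demean(values):
    m = sum(values) / len(values)
    return [v - m for v in values]

# Hypothetical panel: firm -> (time-invariant effect, x over three periods).
true_beta = 2.0
panel = {"A": (10.0, [1, 2, 3]), "B": (-5.0, [4, 5, 6])}

x_all, y_all, x_within, y_within = [], [], [], []
for alpha, xs in panel.values():
    ys = [alpha + true_beta * x for x in xs]
    x_all += xs
    y_all += ys
    # Within transformation: remove each firm's own mean.
    x_within += demean(xs)
    y_within += demean(ys)

pooled = ols_slope(x_all, y_all)        # ignores firm effects, so it is biased
within = ols_slope(x_within, y_within)  # firm effects are differenced out
print(pooled, within)
```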
Regression Discontinuity Design (RDD)
A quasi-experimental design used when treatment is assigned based on a sharp threshold, comparing observations just above and just below that cutoff.