Chapter 16 – Econometric Tools for Causal Inference: Random Assignment, Natural Experiments, and Panel Fixed/Random Effects
Random Assignment Experiments (RAEs)
- Concept & Purpose
- Employ an explicit experimental design based on random assignment to isolate the causal effect of a treatment.
- Gold-standard approach in economics for inferring causality because, under ideal conditions, randomization makes treatment status uncorrelated with all other factors.
- Canonical Two-Group Design
- Step 1 – Recruit a sample of subjects.
- Step 2 – Randomly allocate subjects into two groups:
- Treatment group: receives the intervention.
- Control group: receives a placebo / business-as-usual condition.
- Expected Result: The only systematic difference between the groups is the treatment itself.
- Baseline Specification (Differences Estimator)
- $OUTCOME_i = \beta_0 + \beta_1 TREATMENT_i + \varepsilon_i$
- $OUTCOME_i$ : measured result for individual $i$ (e.g., earnings, test score).
- $TREATMENT_i$ : indicator (1 = treated, 0 = control).
- $\beta_1$ : average treatment effect (ATE); its OLS estimate is the differences estimator, i.e. the gap between the treatment-group and control-group mean outcomes.
- $\varepsilon_i$ : composite of all remaining unexplained factors.
- Augmented Specification (Equation 16.2)
- If important covariates are observable, include them to improve precision:
$$OUTCOME_i = \beta_0 + \beta_1 TREATMENT_i + \beta_2 X_{1i} + \cdots + \beta_{k+1} X_{ki} + \varepsilon_i$$
- Controls ($X_{ji}$) soak up residual variance and reduce the standard error of the estimate of $\beta_1$.
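The baseline and augmented specifications above can be sketched on simulated data. This is only an illustration under stated assumptions: the sample size, the true effect of 2.0, the single covariate `x1`, and the helper `ols` are all hypothetical and not taken from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
true_effect = 2.0  # hypothetical ATE, chosen for illustration only

treatment = rng.integers(0, 2, size=n)  # random assignment: 1 = treated, 0 = control
x1 = rng.normal(size=n)                 # observable covariate, independent of treatment
outcome = 1.0 + true_effect * treatment + 1.5 * x1 + rng.normal(size=n)


def ols(y, *regressors):
    """OLS with an intercept; returns coefficients and conventional standard errors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se


beta_base, se_base = ols(outcome, treatment)     # baseline specification
beta_aug, se_aug = ols(outcome, treatment, x1)   # augmented specification

# With a single dummy regressor, the OLS slope equals the difference in group means.
diff_in_means = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()

print("baseline  beta_1:", beta_base[1], "SE:", se_base[1])
print("augmented beta_1:", beta_aug[1], "SE:", se_aug[1])
print("difference in means:", diff_in_means)
```

Both specifications target the same $\beta_1$ because the covariate is unrelated to treatment by construction; in this sketch the covariate only tightens the standard error.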
- Practical Limitations in Economics
- Non-random samples – volunteers or convenience samples may limit external validity.
- Unobservable heterogeneity – randomization balances in expectation, but with small samples imbalance is possible.
- Hawthorne Effect – subjects alter behaviour simply because they know they are studied.
- Impossible or unethical experiments – e.g., assigning unemployment shocks, changing taxation status.
Natural (Quasi-) Experiments
- Definition & Rationale
- Treatment and control status arise from an exogenous event (law change, acquisition, catastrophe) rather than researcher assignment.
- Aim: Exploit real-world shocks that mimic randomization.
- Key Methodological Twist
- Analysts compare changes (pre- vs post-) rather than levels because initial group characteristics may differ.
- Difference-in-Differences (DiD) Framework
- Dependent variable becomes the individual-specific change:
$$\Delta OUTCOME_i = \beta_0 + \beta_1 TREATMENT_i + \beta_2 X_{1i} + \cdots + \varepsilon_i$$
- $\beta_1$ : difference-in-differences estimator; the difference between the treatment group's change and the control group's change.
- Identification assumption (parallel trends): in the absence of treatment, average outcomes in the treatment and control groups would have followed the same trend.
- Illustrative Example – ARCO–Thrifty Acquisition (1997)
- Event: ARCO buys Thrifty Oil; fear of reduced competition.
- Groups
- Treatment: ARCO stations within 1 mile of a former Thrifty.
- Control: ARCO stations with no nearby Thrifty competitor.
- Data: Station-level prices before and after acquisition.
- Finding: The treatment group's post-acquisition prices rose relative to controls, implying an anti-competitive effect.
- Fig 16.1 (visual): Prices lower for treatment pre-merger, higher post-merger; underscores need for DiD rather than simple means comparison.
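The DiD logic above can be illustrated with a small simulation. All numbers here are hypothetical and are not the ARCO–Thrifty data: the sketch assumes treated units start at a lower level, both groups share a common trend, and the true treatment effect is 0.05.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
true_effect = 0.05   # hypothetical treatment effect
common_trend = 0.10  # same change for both groups absent treatment (parallel trends)

treated = rng.integers(0, 2, size=n)

# Groups differ in levels before the event: treated units start 0.03 lower.
pre = 1.20 - 0.03 * treated + rng.normal(scale=0.02, size=n)
post = pre + common_trend + true_effect * treated + rng.normal(scale=0.02, size=n)

# Regress the unit-specific change on the treatment dummy.
delta = post - pre
X = np.column_stack([np.ones(n), treated])
beta, *_ = np.linalg.lstsq(X, delta, rcond=None)

# The same number computed by hand: change for treated minus change for controls.
did_by_hand = delta[treated == 1].mean() - delta[treated == 0].mean()

# A naive comparison of post-period levels mixes the effect with the initial gap.
naive_post_gap = post[treated == 1].mean() - post[treated == 0].mean()

print("DiD (regression):", beta[1])
print("DiD (by hand):   ", did_by_hand)
print("naive post gap:  ", naive_post_gap)
```

Because the groups differ in levels before the event, the naive post-period gap understates the effect in this setup, while differencing removes the initial gap.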
Panel (Longitudinal) Data
- Definition
- Repeated observations on the same cross-sectional units across two or more periods.
- Combines advantages of time-series (dynamic patterns) and cross-sectional (individual variation) data.
- Major Public Panel Surveys Mentioned
- 1979 National Longitudinal Survey of Youth (NLSY‐79).
- Panel Study of Income Dynamics (PSID).
- British Household Panel Survey.
- Canadian National Public Health Survey.
- Advantages
- Enables questions unanswerable with pure cross-sections (e.g., within-person change).
- Expands effective sample size: $N \times T$ observations rather than $N$.
- Four Variable Types in Panels
- Between-entity-only: differ across units, constant over time (e.g., race, long-run geography).
- Time-only: vary over time, identical across entities (e.g., nationwide inflation rate in year $t$).
- Both vary: differ across units and time (e.g., income, employment status).
- Trend variables: deterministic functions of time (e.g., $t$, $t^2$).
Fixed Effects (FE) Model
- Core Specification
$$Y_{it} = \beta_0 + \beta_1 X_{it} + EF_i + TF_t + \varepsilon_{it}$$
- $EF_i$ : $N-1$ entity dummy variables (state, individual, firm) capturing time-invariant heterogeneity.
- $TF_t$ : $T-1$ time dummies capturing shocks common to all units during period $t$ (macro conditions, policy environment).
- Motivation
- If the underlying model without fixed effects is
$$Y_{it} = \beta_0 + \beta_1 X_{it} + v_{it}, \quad v_{it} = a_i + z_t + \varepsilon_{it}$$
and $a_i$ (unit traits) or $z_t$ (time shocks) is correlated with $X_{it}$, then Classical Assumption III (zero correlation between explanatory variables and error) is violated.
- FE sweeps out $a_i$ and $z_t$ via dummy variables, so the remaining error is the idiosyncratic, iid-style $\varepsilon_{it}$ rather than the composite $v_{it}$.
- Interpretation & Benefits
- Within-entity estimator: identifies effect of $X_{it}$ from deviations around each unit’s own mean.
- Corrects bias from omitted variables that are constant over time within an entity (absorbed by $EF_i$) or that affect all entities equally in a given period (absorbed by $TF_t$).
- Cost: Consumes degrees of freedom; cannot estimate coefficients on time-invariant regressors.
- Example – Death Penalty vs Murder Rate
- Simple cross-section (1990):
$$MRDRTE_i = \alpha_0 + \alpha_1 EXEC_i + u_i$$
- Suggested positive relationship, likely spurious (Fig 16.2).
- FE panel using 1990 & 1993 data flips the sign: executions are associated with lower murder rates once unobserved, time-constant state factors are controlled for (Fig 16.3).
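A simulated panel can show how an omitted entity effect produces a misleading pooled estimate and how fixed effects remove it. This is a sketch under stated assumptions, not the murder-rate data: the panel dimensions, the true slope of -1.0, and the strength of the correlation between $a_i$ and $X_{it}$ are hypothetical, and only entity effects (no time effects) are modeled.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 50, 6
true_beta = -1.0  # hypothetical slope

entity = np.repeat(np.arange(N), T)      # entity index for each of the N*T rows
a = rng.normal(scale=2.0, size=N)        # unobserved, time-invariant entity effect a_i
x = a[entity] + rng.normal(size=N * T)   # X_it is correlated with a_i
y = 3.0 + true_beta * x + 2.0 * a[entity] + rng.normal(size=N * T)


def slope(xv, yv):
    """Slope from a simple OLS regression of yv on xv with an intercept."""
    X = np.column_stack([np.ones(len(yv)), xv])
    return np.linalg.lstsq(X, yv, rcond=None)[0][1]


def demean(v):
    """Subtract each entity's own time average (the within transformation)."""
    means = np.bincount(entity, weights=v) / np.bincount(entity)
    return v - means[entity]


# Pooled OLS ignores a_i, so the slope absorbs the omitted-variable bias.
pooled = slope(x, y)

# Within estimator: OLS on entity-demeaned data (no intercept needed).
x_w, y_w = demean(x), demean(y)
within = (x_w @ y_w) / (x_w @ x_w)

# Dummy-variable version: intercept plus N-1 entity dummies.
dummies = (entity[:, None] == np.arange(1, N)[None, :]).astype(float)
X_lsdv = np.column_stack([np.ones(N * T), x, dummies])
lsdv = np.linalg.lstsq(X_lsdv, y, rcond=None)[0][1]

print("pooled OLS slope:", pooled)
print("within slope:    ", within)
print("LSDV slope:      ", lsdv)
```

The within and dummy-variable slopes coincide, which is why FE is described both as "entity dummies" and as "deviations from each unit's own mean". In this setup the pooled slope has the wrong sign, mirroring the sign flip in the example above.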
Random Effects (RE) Model & FE vs RE Choice
- RE Assumption
- Entity intercepts are random draws from a common distribution centred at a grand mean; treated as part of the error component.
- Advantages over FE
- More degrees of freedom—doesn’t estimate $N-1$ dummies.
- Can obtain coefficients for time-invariant variables (e.g., gender, region).
- Crucial Requirement
- The unobserved entity effect must be uncorrelated with all explanatory variables ($Cov(a_i, X_{it}) = 0$). Violation yields inconsistent RE estimates.
- Decision Rule
- Theoretical judgment: suspect correlation ⇒ use FE.
- Empirical diagnostic: Hausman Test
- Compares FE and RE coefficient vectors.
- Null: No systematic difference (supports RE).
- Reject ⇒ prefer FE.
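For reference, the comparison the Hausman test makes is usually written in the following standard textbook form. The notes do not state the formula, so the notation here is an assumption: $\hat{\beta}_{FE}$ and $\hat{\beta}_{RE}$ are the two estimated coefficient vectors on the $k$ time-varying regressors.

$$H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})' \left[ \widehat{Var}(\hat{\beta}_{FE}) - \widehat{Var}(\hat{\beta}_{RE}) \right]^{-1} (\hat{\beta}_{FE} - \hat{\beta}_{RE}) \sim \chi^2_k \quad \text{under the null}$$

A large $H$ means the two sets of estimates differ by more than sampling noise would explain, which is evidence that the entity effect is correlated with the regressors and that FE is the safer choice.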
Practical / Ethical / Philosophical Considerations
- RAEs vs Natural Experiments illustrate the trade-off between internal and external validity.
- FE/RE emphasise handling unobserved heterogeneity—central theme in econometrics and policy evaluation.
- Hawthorne Effect underlines measurement challenges: observation itself can change behaviour.
- Impossible experiments remind economists to exploit ingenuity (DiD, instrumental variables, regression discontinuity) for causal inference while honouring ethical constraints.
- Differences Estimator: $\beta_1 = E[OUTCOME \mid TREAT = 1] - E[OUTCOME \mid TREAT = 0]$ under perfect randomization.
- Difference-in-Differences Estimator:
$$\beta_1 = (\Delta OUTCOME_{treat}) - (\Delta OUTCOME_{control})$$
- Fixed Effects Transformation (entity de-mean):
$$Y_{it} - \bar{Y}_i = \beta_1 (X_{it} - \bar{X}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)$$
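The entity de-meaned form follows in three steps. As a sketch, assume the model contains only an entity effect $a_i$ (no time effect $z_t$): write the model, average it over time within each entity, and subtract.

$$\begin{aligned}
Y_{it} &= \beta_0 + \beta_1 X_{it} + a_i + \varepsilon_{it} \\
\bar{Y}_i &= \beta_0 + \beta_1 \bar{X}_i + a_i + \bar{\varepsilon}_i \\
Y_{it} - \bar{Y}_i &= \beta_1 (X_{it} - \bar{X}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)
\end{aligned}$$

Both $\beta_0$ and $a_i$ are constant over time, so they cancel in the subtraction. The same cancellation removes any time-invariant regressor, which is why FE cannot estimate its coefficient.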
Study Tips & Connections
- Link back to earlier chapters on classical assumptions (endogeneity) and dummy variables (intercept shifts).
- Compare RAEs/Quasi-experiments with Instrumental Variables: both aim to recover unconfounded variation.
- When reading empirical papers, always ask: What is the identification strategy? Is the identifying assumption plausible? Would FE or RE handle omitted unobservables?
- Practitioners frequently combine tools (e.g., DiD with FE, two-way FE models) to strengthen causal claims.