Chapter 16 – Econometric Tools for Causal Inference: Random Assignment, Natural Experiments, and Panel Fixed/Random Effects

Random Assignment Experiments (RAEs)

Concept & Purpose
- Employ an explicit experimental design based on random assignment to isolate the causal effect of a treatment.
- Gold-standard approach in economics for inferring causality because, under ideal conditions, randomization ensures treatment status is uncorrelated with all other factors.
Canonical Two-Group Design
- Step 1 – Recruit a sample of subjects.
- Step 2 – Randomly allocate subjects into two groups:
  - Treatment group: receives the intervention.
  - Control group: receives a placebo / business-as-usual condition.
- Expected Result: The only systematic difference between the groups is the treatment itself.
Baseline Specification (Differences Estimator)
- $OUTCOME_i = \beta_0 + \beta_1\,TREATMENT_i + \varepsilon_i$
- $OUTCOME_i$ : measured result for individual $i$ (e.g., earnings, test score).
- $TREATMENT_i$ : indicator (1 = treated, 0 = control).
- $\beta_1$ : differences estimator; average treatment effect (ATE).
- $\varepsilon_i$ : composite of all remaining unexplained factors.
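With a single treatment dummy, the OLS estimate of $\beta_1$ is numerically the same as the difference between the treatment-group and control-group means. The sketch below illustrates this on simulated data; the sample size, intercept, noise level, and `true_effect` are hypothetical values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

n = 1000
true_effect = 2.0  # hypothetical treatment effect (an assumption, not from the text)

# Random assignment: each subject lands in treatment (1) or control (0)
treatment = rng.integers(0, 2, size=n)
outcome = 10.0 + true_effect * treatment + rng.normal(0.0, 1.0, size=n)

# Differences estimator computed directly as a difference in group means
diff_in_means = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()

# The same quantity from OLS of OUTCOME on a constant and TREATMENT
X = np.column_stack([np.ones(n), treatment])
beta_hat, *_ = np.linalg.lstsq(X, outcome, rcond=None)

print("difference in means:", diff_in_means)
print("OLS slope on TREATMENT:", beta_hat[1])
```

The two printed numbers agree up to floating-point error, and both should lie close to the simulated effect because assignment is random.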
Augmented Specification (Equation 16.2)
- If important covariates are observable, include them to improve precision:
 $OUTCOME_i = \beta_0 + \beta_1\,TREATMENT_i + \beta_2 X_{1i} + \cdots + \beta_{k+1} X_{ki} + \varepsilon_i$
- Controls ($X_{ji}$) soak up residual variance and so reduce the standard error of the estimate of $\beta_1$.
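The precision gain from adding covariates can be seen in a small simulation. This is a minimal sketch under stated assumptions: one hypothetical covariate `x1` that strongly affects the outcome, conventional (homoskedastic) OLS standard errors, and made-up coefficient values.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and conventional (homoskedastic) standard errors."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])  # residual variance
    se = np.sqrt(np.diag(sigma2 * xtx_inv))
    return beta, se

rng = np.random.default_rng(1)
n = 500
treatment = rng.integers(0, 2, size=n)   # randomly assigned dummy
x1 = rng.normal(size=n)                  # hypothetical observed covariate
outcome = 1.0 + 0.5 * treatment + 2.0 * x1 + rng.normal(size=n)

const = np.ones(n)
b_short, se_short = ols(np.column_stack([const, treatment]), outcome)
b_long, se_long = ols(np.column_stack([const, treatment, x1]), outcome)

print("SE of treatment coefficient without covariate:", se_short[1])
print("SE of treatment coefficient with covariate:   ", se_long[1])
```

Because `x1` is independent of the randomly assigned treatment, both regressions target the same effect; the longer regression simply leaves less unexplained variance in the error term.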
Practical Limitations in Economics
1. Non-random samples – volunteers or convenience samples may limit external validity.
2. Unobservable heterogeneity – randomization balances in expectation, but with small samples imbalance is possible.
3. Hawthorne Effect – subjects alter behaviour simply because they know they are studied.
4. Impossible or unethical experiments – e.g., assigning unemployment shocks, changing taxation status.

Natural (Quasi-) Experiments

Definition & Rationale
- Treatment and control status arise from an exogenous event (law change, acquisition, catastrophe) rather than researcher assignment.
- Aim: Exploit real-world shocks that mimic randomization.
Key Methodological Twist
- Analysts compare changes (pre- vs post-) rather than levels because initial group characteristics may differ.
Difference-in-Differences (DiD) Framework
- Dependent variable becomes the individual-specific change:
 $\Delta OUTCOME_i = \beta_0 + \beta_1\,TREATMENT_i + \beta_2 X_{1i} + \cdots + \varepsilon_i$
- $\beta_1$ : difference-in-differences estimator; difference between the treatment group’s change and control group’s change.
- Identification assumption: parallel trends. In the absence of treatment, the average change in the outcome would have been the same in both groups.
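The logic of comparing changes rather than levels can be sketched with simulated data. Everything numeric here is an assumption made for illustration: treated units start 3 units lower, every unit shares a common trend of +2, and the hypothetical `true_effect` is 1.5.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
true_effect = 1.5  # hypothetical treatment effect

treatment = rng.integers(0, 2, size=n)

# Groups differ in pre-period levels: treated units start 3 units lower
pre = 20.0 - 3.0 * treatment + rng.normal(size=n)

# Parallel trends: everyone moves by +2; treated units also get the effect
post = pre + 2.0 + true_effect * treatment + rng.normal(size=n)

# Difference-in-differences: treated change minus control change
delta = post - pre
did = delta[treatment == 1].mean() - delta[treatment == 0].mean()

# Naive post-period comparison mixes the level gap with the effect
naive = post[treatment == 1].mean() - post[treatment == 0].mean()

print("DiD estimate:", did)
print("naive post-period difference:", naive)
```

In this setup the naive comparison even has the wrong sign, because the pre-existing gap in levels outweighs the treatment effect, while the DiD estimate removes that gap by differencing.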
Illustrative Example – ARCO–Thrifty Acquisition (1997)
- Event: ARCO buys Thrifty Oil; fear of reduced competition.
- Groups:
  - Treatment: ARCO stations within 1 mile of a former Thrifty.
  - Control: ARCO stations with no nearby Thrifty competitor.
- Data: Station-level prices before and after acquisition.
- Finding: Treatment group’s post-acquisition prices rose relative to controls, implying anti-competitive effect.
- Fig 16.1 (visual): Prices lower for treatment pre-merger, higher post-merger; underscores need for DiD rather than simple means comparison.

Panel (Longitudinal) Data

Definition
- Repeated observations on the same cross-sectional units across two or more periods.
- Combines advantages of time-series (dynamic patterns) and cross-sectional (individual variation) data.
Major Public Panel Surveys Mentioned
- 1979 National Longitudinal Survey of Youth (NLSY-79).
- Panel Study of Income Dynamics (PSID).
- British Household Panel Survey.
- Canadian National Population Health Survey (NPHS).
Advantages
1. Enables questions unanswerable with pure cross-sections (e.g., within-person change).
2. Expands the effective sample size: $N \times T$ observations rather than $N$.
Four Variable Types in Panels
1. Between-entity-only: differ across units, constant over time (e.g., race, long-run geography).
2. Time-only: vary over time, identical across entities (e.g., nationwide inflation rate in year $t$).
3. Both vary: differ across units and time (e.g., income, employment status).
4. Trend variables: deterministic functions of time (e.g., $t$, $t^2$).

Fixed Effects (FE) Model

Core Specification: $Y_{it} = \beta_0 + \beta_1 X_{it} + EF_i + TF_t + \varepsilon_{it}$
- $EF_i$ : $N-1$ entity dummy variables (state, individual, firm) capturing time-invariant heterogeneity.
- $TF_t$ : $T-1$ time dummies capturing shocks common to all units during period $t$ (macro conditions, policy environment).
Motivation
- If the underlying model without fixed effects is
 $Y_{it} = \beta_0 + \beta_1 X_{it} + v_{it},\; v_{it} = a_i + z_t + \varepsilon_{it}$
 and $a_i$ (unit traits) or $z_t$ (time shocks) correlate with $X_{it}$, then Classical Assumption III (zero correlation between explanatory variables and error) is violated.
- FE sweeps out $a_i$ and $z_t$ via dummy variables, so that only the idiosyncratic component $\varepsilon_{it}$ remains in the error term.
Interpretation & Benefits
- Within-entity estimator: identifies effect of $X_{it}$ from deviations around each unit’s own mean.
- Corrects bias from omitted variables that are constant over time within an entity (absorbed by the entity effects) or that affect all entities equally in a given period (absorbed by the time effects).
- Cost: Consumes degrees of freedom; cannot estimate coefficients on time-invariant regressors.
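The dummy-variable and within-entity views of FE give the same slope, which the following sketch checks numerically. It is a simplified illustration, not the chapter's own example: the data are simulated, only entity effects are included (time dummies are left out for brevity), and the panel dimensions and coefficients are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 50, 4
beta1 = 1.0  # hypothetical true slope

entity = np.repeat(np.arange(N), T)            # entity index for each row
a = rng.normal(0.0, 2.0, size=N)               # unobserved entity effect a_i
x = 0.8 * a[entity] + rng.normal(size=N * T)   # X_it is correlated with a_i
y = 2.0 + beta1 * x + a[entity] + rng.normal(size=N * T)

# Pooled OLS ignores a_i, so the slope absorbs the omitted-variable bias
Xp = np.column_stack([np.ones(N * T), x])
b_pooled = np.linalg.lstsq(Xp, y, rcond=None)[0][1]

# Dummy-variable FE: constant, X, and N-1 entity dummies
dummies = (entity[:, None] == np.arange(1, N)[None, :]).astype(float)
Xd = np.column_stack([np.ones(N * T), x, dummies])
b_lsdv = np.linalg.lstsq(Xd, y, rcond=None)[0][1]

def demean(v):
    """Subtract each entity's own time average from its observations."""
    means = np.bincount(entity, weights=v) / np.bincount(entity)
    return v - means[entity]

# Within estimator: regress de-meaned Y on de-meaned X (no constant needed)
xw, yw = demean(x), demean(y)
b_within = (xw @ yw) / (xw @ xw)

print("pooled OLS:", b_pooled)
print("dummy-variable FE:", b_lsdv)
print("within estimator:", b_within)
```

The last two estimates coincide up to floating-point error and sit near the simulated slope, while pooled OLS is pulled upward by the correlation between `x` and the entity effect.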
Example – Death Penalty vs Murder Rate
- Simple cross-section (1990):
 $MRDRTE_i = \alpha_0 + \alpha_1 EXEC_i + u_i$
- Suggested positive relationship—likely spurious (Fig 16.2).
- FE panel using 1990 & 1993 data flips sign: executions associated with lower murder rates when unobserved, constant state factors are controlled (Fig 16.3).

Random Effects (RE) Model & FE vs RE Choice

RE Assumption
- Entity intercepts are random draws from a common distribution centred at a grand mean; treated as part of the error component.
Advantages over FE
1. More degrees of freedom—doesn’t estimate $N-1$ dummies.
2. Can obtain coefficients for time-invariant variables (e.g., gender, region).
Crucial Requirement
- The unobserved entity effect must be uncorrelated with all explanatory variables ($Cov(a_i, X_{it}) = 0$). Violation yields inconsistent RE estimates.
Decision Rule
- Theoretical judgment: suspect correlation ⇒ use FE.
- Empirical diagnostic: Hausman Test
  - Compares FE and RE coefficient vectors.
  - Null: No systematic difference (supports RE).
  - Reject ⇒ prefer FE.
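For reference, the Hausman statistic is usually written as below. This is the standard textbook form rather than a formula reproduced from this chapter; $\hat{\beta}_{FE}$ and $\hat{\beta}_{RE}$ denote the FE and RE estimates of the coefficients on the time-varying regressors, and $k$ is the number of coefficients being compared.

 $H = \big(\hat{\beta}_{FE} - \hat{\beta}_{RE}\big)' \big[\widehat{Var}(\hat{\beta}_{FE}) - \widehat{Var}(\hat{\beta}_{RE})\big]^{-1} \big(\hat{\beta}_{FE} - \hat{\beta}_{RE}\big)$

Under the null that the entity effect is uncorrelated with the regressors, $H$ is asymptotically $\chi^2$ with $k$ degrees of freedom. A large value indicates that the two sets of estimates differ by more than sampling noise would explain, which points toward FE.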

Practical / Ethical / Philosophical Considerations

RAEs vs Natural Experiments illustrate the trade-off between internal and external validity.
FE and RE models centre on handling unobserved heterogeneity, a central theme in econometrics and policy evaluation.
Hawthorne Effect underlines measurement challenges: observation itself can change behaviour.
Impossible experiments remind economists to exploit ingenuity (DiD, instrumental variables, regression discontinuity) for causal inference while honouring ethical constraints.

Key Formulas Recap (LaTeX Notation)

Differences Estimator: $\beta_1 = E[OUTCOME_i \mid TREATMENT_i = 1] - E[OUTCOME_i \mid TREATMENT_i = 0]$ under perfect randomization.
Difference-in-Differences Estimator:
$\beta_1 = \overline{\Delta OUTCOME}_{\text{treat}} - \overline{\Delta OUTCOME}_{\text{control}}$
Fixed Effects Transformation (entity de-mean):
$Y_{it} - \bar{Y}_{i} = \beta_1 \big(X_{it} - \bar{X}_{i}\big) + \big(\varepsilon_{it} - \bar{\varepsilon}_{i}\big)$
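The entity de-meaned equation follows in two steps. The sketch below assumes the entity-effect model from the FE motivation, with the time shock $z_t$ left out for brevity.

Start from the model with an unobserved entity effect $a_i$:
$Y_{it} = \beta_0 + \beta_1 X_{it} + a_i + \varepsilon_{it}$

Average each entity's equation over its $T$ periods; $\beta_0$ and $a_i$ do not vary with $t$, so they pass through unchanged:
$\bar{Y}_{i} = \beta_0 + \beta_1 \bar{X}_{i} + a_i + \bar{\varepsilon}_{i}$

Subtracting the second equation from the first cancels both $\beta_0$ and $a_i$, which gives the de-meaned form. The same subtraction wipes out any regressor that is constant over time, which is why FE cannot estimate coefficients on time-invariant variables.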

Study Tips & Connections

Link back to earlier chapters on classical assumptions (endogeneity) and dummy variables (intercept shifts).
Compare RAEs/Quasi-experiments with Instrumental Variables: both aim to recover unconfounded variation.
When reading empirical papers, always ask: What is the identification strategy? Is the identifying assumption plausible? Would FE or RE handle omitted unobservables?
Practitioners frequently combine tools (e.g., DiD with FE, two-way FE models) to strengthen causal claims.