Chapter 16 – Econometric Tools for Causal Inference: Random Assignment, Natural Experiments, and Panel Fixed/Random Effects

Random Assignment Experiments (RAEs)

  • Concept & Purpose
    • Employ an explicit experimental design based on random assignment to isolate the causal effect of a treatment.
    • Gold‐standard approach in economics for inferring causality because—under ideal conditions—randomization ensures treatment status is uncorrelated with all other factors.
  • Canonical Two-Group Design
    • Step 1 – Recruit a sample of subjects.
    • Step 2 – Randomly allocate subjects into two groups:
      • Treatment group: receives the intervention.
      • Control group: receives a placebo / business-as-usual condition.
    • Expected Result: The only systematic difference between the groups is the treatment itself.
  • Baseline Specification (Differences Estimator)
    • $OUTCOME_i = \beta_0 + \beta_1\,TREATMENT_i + \varepsilon_i$
    • $OUTCOME_i$ : measured result for individual $i$ (e.g., earnings, test score).
    • $TREATMENT_i$ : indicator (1 = treated, 0 = control).
    • $\beta_1$ : differences estimator; average treatment effect (ATE).
    • $\varepsilon_i$ : composite of all remaining unexplained factors.
  • Augmented Specification (Equation 16.2)
    • If important covariates are observable, include them to improve precision:
      $$OUTCOME_i = \beta_0 + \beta_1\,TREATMENT_i + \beta_2 X_{1i} + \cdots + \beta_k X_{ki} + \varepsilon_i$$
    • Controls ($X_{ji}$) soak up residual variance and reduce the standard error of the estimate of $\beta_1$.
  • Practical Limitations in Economics
    1. Non-random samples – volunteers or convenience samples may limit external validity.
    2. Unobservable heterogeneity – randomization balances in expectation, but with small samples imbalance is possible.
    3. Hawthorne Effect – subjects alter behaviour simply because they know they are studied.
    4. Impossible or unethical experiments – e.g., assigning unemployment shocks, changing taxation status.
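
The baseline and augmented specifications above can be illustrated with a small simulation. This is only a sketch on made-up data: the sample size, the true effect of 2.0, the covariate name `ability`, and the helper `ols` are all assumptions introduced for illustration, not part of the chapter. It shows that the OLS slope on the treatment dummy equals the simple difference in group means, and that adding a relevant covariate shrinks the standard error.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
true_effect = 2.0  # assumed treatment effect for this illustration

# Hypothetical covariate and a random 50/50 assignment to treatment.
ability = rng.normal(size=n)
treatment = rng.permutation(np.repeat([0, 1], n // 2))
outcome = 1.0 + true_effect * treatment + 3.0 * ability + rng.normal(size=n)


def ols(y, *regressors):
    """Return OLS coefficients and conventional standard errors (intercept first)."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se


# Differences estimator computed two ways: group means and baseline regression.
diff_in_means = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()
beta_base, se_base = ols(outcome, treatment)

# Augmented specification: same regression plus the observed covariate.
beta_aug, se_aug = ols(outcome, treatment, ability)

print(f"difference in means : {diff_in_means:.3f}")
print(f"baseline  beta_1    : {beta_base[1]:.3f} (se {se_base[1]:.3f})")
print(f"augmented beta_1    : {beta_aug[1]:.3f} (se {se_aug[1]:.3f})")
```

Because assignment is random, both specifications target the same effect; the covariate matters here only for precision, not for removing bias.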

Natural (Quasi-) Experiments

  • Definition & Rationale
    • Treatment and control status arise from an exogenous event (law change, acquisition, catastrophe) rather than researcher assignment.
    • Aim: Exploit real-world shocks that mimic randomization.
  • Key Methodological Twist
    • Analysts compare changes (pre- vs post-) rather than levels because initial group characteristics may differ.
  • Difference-in-Differences (DiD) Framework
    • Dependent variable becomes the individual-specific change:
      $$\Delta OUTCOME_i = \beta_0 + \beta_1\,TREATMENT_i + \beta_2 X_{1i} + \cdots + \varepsilon_i$$
    • $\beta_1$ : difference-in-differences estimator; difference between the treatment group’s change and control group’s change.
    • Identification assumption: Parallel Trends—in absence of treatment, average trends would be the same across groups.
  • Illustrative Example – ARCO–Thrifty Acquisition (1997)
    • Event: ARCO buys Thrifty Oil; fear of reduced competition.
    • Groups
      • Treatment: ARCO stations within 1 mile of a former Thrifty.
      • Control: ARCO stations with no nearby Thrifty competitor.
    • Data: Station-level prices before and after acquisition.
    • Finding: Treatment group’s post-acquisition prices rose relative to controls, implying anti-competitive effect.
    • Fig 16.1 (visual): Prices lower for treatment pre-merger, higher post-merger; underscores need for DiD rather than simple means comparison.
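
The difference-in-differences logic above can be sketched on simulated station prices. The numbers below are invented for illustration and are not the ARCO–Thrifty data: the group price levels, the common trend of 0.05, and the true effect of 0.04 are all assumptions. The sketch shows that comparing post-period levels is misleading when the groups start at different levels, while the change-on-treatment regression recovers the effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500  # stations per group (hypothetical)
treated = np.repeat([1, 0], n)

# Assumed setup: treated stations start cheaper, both groups share a trend.
level = np.where(treated == 1, 1.10, 1.20)
common_trend = 0.05
true_effect = 0.04

pre = level + rng.normal(scale=0.02, size=2 * n)
post = level + common_trend + true_effect * treated + rng.normal(scale=0.02, size=2 * n)
delta = post - pre  # station-specific change, the DiD dependent variable

# DiD as a difference of average changes.
did = delta[treated == 1].mean() - delta[treated == 0].mean()

# Same estimate from regressing the change on the treatment dummy.
X = np.column_stack([np.ones(2 * n), treated])
beta, *_ = np.linalg.lstsq(X, delta, rcond=None)

# Naive comparison of post-period levels ignores the initial gap.
naive_post = post[treated == 1].mean() - post[treated == 0].mean()

print(f"DiD (means)       : {did:.4f}")
print(f"DiD (regression)  : {beta[1]:.4f}")
print(f"naive post-period : {naive_post:.4f}")
```

In this setup the naive post-period gap is negative even though the treatment raised prices, which is the situation in which comparing changes rather than levels matters.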

Panel (Longitudinal) Data

  • Definition
    • Repeated observations on the same cross-sectional units across two or more periods.
    • Combines advantages of time-series (dynamic patterns) and cross-sectional (individual variation) data.
  • Major Public Panel Surveys Mentioned
    • 1979 National Longitudinal Survey of Youth (NLSY‐79).
    • Panel Study of Income Dynamics (PSID).
    • British Household Panel Survey.
    • Canadian National Population Health Survey.
  • Advantages
    1. Enables questions unanswerable with pure cross-sections (e.g., within-person change).
    2. Expands effective sample size: $N \times T$ observations rather than $N$.
  • Four Variable Types in Panels
    1. Between-entity-only: differ across units, constant over time (e.g., race, long-run geography).
    2. Time-only: vary over time, identical across entities (e.g., nationwide inflation rate in year $t$).
    3. Both vary: differ across units and time (e.g., income, employment status).
    4. Trend variables: deterministic functions of time (e.g., $t$, $t^2$).

Fixed Effects (FE) Model

  • Core Specification: $Y_{it} = \beta_0 + \beta_1 X_{it} + EF_i + TF_t + \varepsilon_{it}$
    • $EF_i$ : $N-1$ entity dummy variables (state, individual, firm) capturing time-invariant heterogeneity.
    • $TF_t$ : $T-1$ time dummies capturing shocks common to all units during period $t$ (macro conditions, policy environment).
  • Motivation
    • If the underlying model without fixed effects is
      $$Y_{it} = \beta_0 + \beta_1 X_{it} + v_{it}, \quad v_{it} = a_i + z_t + \varepsilon_{it}$$
      where $a_i$ (unit traits) or $z_t$ (time shocks) correlate with $X_{it}$, then Classical Assumption III (zero correlation between the explanatory variables and the error term) is violated.
    • FE sweeps out $a_i$ and $z_t$ via dummy variables, so that only the well-behaved component $\varepsilon_{it}$ of $v_{it}$ remains in the error term.
  • Interpretation & Benefits
    • Within-entity estimator: identifies effect of $X_{it}$ from deviations around each unit’s own mean.
    • Corrects bias from omitted variables that are constant over time within an entity (absorbed by $EF_i$) or that change over time but affect all entities equally (absorbed by $TF_t$).
    • Cost: Consumes degrees of freedom; cannot estimate coefficients on time-invariant regressors.
  • Example – Death Penalty vs Murder Rate
    • Simple cross-section (1990):
      $$MRDRTE_i = \alpha_0 + \alpha_1 EXEC_i + u_i$$
    • Suggested positive relationship—likely spurious (Fig 16.2).
    • FE panel using 1990 & 1993 data flips sign: executions associated with lower murder rates when unobserved, constant state factors are controlled (Fig 16.3).
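
The sign flip described above can be reproduced in a simulation. This sketch uses invented data, not the murder-rate data: the panel dimensions, the true slope of -1.0, and the way the entity effect `a` is made to correlate with `x` are assumptions chosen for illustration, and only entity effects are included (no time effects). It compares pooled OLS with two equivalent fixed-effects estimators: entity dummies and the within (de-meaned) regression.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 50, 6
true_beta = -1.0  # assumed true slope

entity = np.repeat(np.arange(N), T)
a = rng.normal(scale=2.0, size=N)                 # unobserved entity effect a_i
x = a[entity] + rng.normal(size=N * T)            # regressor correlated with a_i
y = 5.0 + true_beta * x + 3.0 * a[entity] + rng.normal(size=N * T)


def fit(y, X):
    """Least-squares coefficients for y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


# Pooled OLS ignores a_i, so the slope picks up omitted-variable bias.
pooled = fit(y, np.column_stack([np.ones(N * T), x]))[1]

# Dummy-variable form: intercept plus N-1 entity dummies.
dummies = (entity[:, None] == np.arange(1, N)[None, :]).astype(float)
lsdv = fit(y, np.column_stack([np.ones(N * T), x, dummies]))[1]


def demean(v):
    """Subtract each entity's own time average (balanced panel)."""
    means = np.bincount(entity, weights=v) / T
    return v - means[entity]


# Within estimator: regress de-meaned y on de-meaned x.
xd, yd = demean(x), demean(y)
within = (xd @ yd) / (xd @ xd)

print(f"pooled OLS slope : {pooled:.3f}")
print(f"entity dummies   : {lsdv:.3f}")
print(f"within estimator : {within:.3f}")
```

The dummy-variable and within estimates coincide, while the pooled slope has the wrong sign because the omitted entity effect is positively correlated with the regressor.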

Random Effects (RE) Model & FE vs RE Choice

  • RE Assumption
    • Entity intercepts are random draws from a common distribution centred at a grand mean; treated as part of the error component.
  • Advantages over FE
    1. More degrees of freedom—doesn’t estimate $N-1$ dummies.
    2. Can obtain coefficients for time-invariant variables (e.g., gender, region).
  • Crucial Requirement
    • The unobserved entity effect must be uncorrelated with all explanatory variables ($Cov(a_i, X_{it}) = 0$). Violation yields inconsistent RE estimates.
  • Decision Rule
    • Theoretical judgment: suspect correlation ⇒ use FE.
    • Empirical diagnostic: Hausman Test
      • Compares FE and RE coefficient vectors.
      • Null: No systematic difference (supports RE).
      • Reject ⇒ prefer FE.
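
For reference, the comparison made by the Hausman test is usually written as the statistic below. The hat notation and the symbols $\hat{\beta}_{FE}$, $\hat{\beta}_{RE}$, and $k$ are assumptions introduced here, not notation from the chapter; the vectors contain only the coefficients that both models estimate (the time-varying regressors).

$$H = \big(\hat{\beta}_{FE} - \hat{\beta}_{RE}\big)' \Big[\widehat{Var}\big(\hat{\beta}_{FE}\big) - \widehat{Var}\big(\hat{\beta}_{RE}\big)\Big]^{-1} \big(\hat{\beta}_{FE} - \hat{\beta}_{RE}\big)$$

Under the null that the entity effect is uncorrelated with the regressors, $H$ is asymptotically $\chi^2$ with $k$ degrees of freedom, where $k$ is the number of coefficients compared. A large value of $H$ means the FE and RE estimates differ by more than sampling noise would explain, which points toward FE.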

Practical / Ethical / Philosophical Considerations

  • RAEs vs Natural Experiments illustrate the trade-off between internal and external validity.
  • FE/RE emphasise handling unobserved heterogeneity—central theme in econometrics and policy evaluation.
  • Hawthorne Effect underlines measurement challenges: observation itself can change behaviour.
  • Impossible experiments remind economists to exploit ingenuity (DiD, instrumental variables, regression discontinuity) for causal inference while honouring ethical constraints.

Key Formulas Recap (LaTeX Notation)

  • Differences Estimator: $\beta_1 = E[OUTCOME_i \mid TREATMENT_i = 1] - E[OUTCOME_i \mid TREATMENT_i = 0]$ under perfect randomization.
  • Difference-in-Differences Estimator:
    $$\beta_1 = \overline{\Delta OUTCOME}_{\text{treat}} - \overline{\Delta OUTCOME}_{\text{control}}$$
  • Fixed Effects Transformation (entity de-mean):
    $$Y_{it} - \bar{Y}_i = \beta_1 \big(X_{it} - \bar{X}_i\big) + \big(\varepsilon_{it} - \bar{\varepsilon}_i\big)$$
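
The entity de-meaning step can be spelled out in two lines. As a simplifying assumption, this sketch keeps only the entity effect $a_i$ and drops the time effects, which matches the form of the transformation in this recap. Start from the model with an entity effect:

$$Y_{it} = \beta_0 + \beta_1 X_{it} + a_i + \varepsilon_{it}$$

Averaging over the $T$ periods for each entity gives

$$\bar{Y}_i = \beta_0 + \beta_1 \bar{X}_i + a_i + \bar{\varepsilon}_i$$

Subtracting the second equation from the first cancels both $\beta_0$ and $a_i$, leaving the de-meaned equation. This is why the within estimator is unaffected by any time-invariant entity trait, and also why coefficients on time-invariant regressors cannot be estimated: they are differenced away along with $a_i$.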

Study Tips & Connections

  • Link back to earlier chapters on classical assumptions (endogeneity) and dummy variables (intercept shifts).
  • Compare RAEs/Quasi-experiments with Instrumental Variables: both aim to recover unconfounded variation.
  • When reading empirical papers, always ask: What is the identification strategy? Is the identifying assumption plausible? Would FE or RE handle omitted unobservables?
  • Practitioners frequently combine tools (e.g., DiD with FE, two-way FE models) to strengthen causal claims.