Causal Inference 3
Causal Inference with Observational Data
Part 3: Difference-in-Differences & Propensity Score Matching
Overview
Focus on causal inference methods: Difference-in-Differences (DiD) and Propensity Score Matching (PSM).
RCTs (Randomized Controlled Trials) are considered the gold standard for establishing causal relationships.
The Challenge of Causal Inference
Limitations of RCTs:
- Too expensive or logistically complex.
- Ethical concerns surrounding withholding treatment from control groups.
- Situations where treatment has already occurred requiring retrospective analysis.
- Lack of control over assignment of treatment to subjects.
Observational Data:
- Utilizes data from normal business operations as an alternative to RCTs.
- Involves untestable assumptions that can complicate analysis.
- The results depend heavily on the plausibility of these assumptions.
- Must maintain transparency about underlying assumptions to ensure credibility.
Useful Techniques for Observational Causal Inference
Difference-in-Differences (DiD): Compare changes over time between treated and control groups.
Propensity Score Matching (PSM): Match treated units with similar untreated units.
Instrumental Variables (IV): Use an external variable to isolate causal variation.
Regression Discontinuity (RDD): Exploit a cutoff rule to identify causal effects.
Method 1: Difference-in-Differences (DiD)
Definition
DiD compares changes over time across a treatment group and a control group.
How Difference-in-Differences Works
Requirements:
Pre- and post-treatment outcome data.
Presence of both a treatment group and a control group.
Parallel Trends Assumption must hold.
Logic of DiD:
The formula to compute DiD:
DiD = (Treated After - Treated Before) - (Control After - Control Before)
The first difference eliminates pre-existing disparities between the groups, while the second difference accounts for common time trends. The remaining difference indicates the treatment effect.
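One way to see why the two differences isolate the effect is a simple additive model. The symbols here are assumed for illustration and do not come from the original notes: \alpha_g is a fixed group level, \lambda_t is a time effect shared by both groups, \tau is the treatment effect, and D_{gt} equals 1 only for the treated group in the post period (t = 1).

```latex
Y_{gt} = \alpha_g + \lambda_t + \tau D_{gt},
\qquad g \in \{\text{treat}, \text{ctrl}\},\; t \in \{0, 1\}

\begin{aligned}
Y_{\text{treat},1} - Y_{\text{treat},0} &= (\lambda_1 - \lambda_0) + \tau \\
Y_{\text{ctrl},1} - Y_{\text{ctrl},0} &= \lambda_1 - \lambda_0 \\
\left(Y_{\text{treat},1} - Y_{\text{treat},0}\right) - \left(Y_{\text{ctrl},1} - Y_{\text{ctrl},0}\right) &= \tau
\end{aligned}
```

Differencing within each group cancels \alpha_g, and differencing across groups cancels \lambda_1 - \lambda_0, leaving only \tau.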
DiD Example: ISO Certification & Output
Research Question: Did obtaining ISO certification result in increased output for firms?
Data Collection:
- Before (T=0) and After (T=1) Output Measurements:
- Treatment Group (ISO certified): Before = 600, After = 800 (Change = +200).
- Control Group (Not certified): Before = 300, After = 400 (Change = +100).
Calculation of Treatment Effect:
DiD = (800 - 600) - (400 - 300) = 200 - 100 = +100.
Conclusion: Treatment effect = +100 units.
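The ISO arithmetic can be reproduced with a minimal Python sketch. The function name `did_estimate` and its parameter names are hypothetical and chosen only for illustration; the input values are the ones given in the example.

```python
def did_estimate(treated_before, treated_after, control_before, control_after):
    """Return the 2x2 difference-in-differences estimate from four group means."""
    treated_change = treated_after - treated_before  # change in the treatment group
    control_change = control_after - control_before  # change in the control group (common trend)
    return treated_change - control_change


# Values from the ISO certification example.
iso_effect = did_estimate(
    treated_before=600, treated_after=800,
    control_before=300, control_after=400,
)
print(iso_effect)  # 100
```

In practice the four inputs would be group averages over many firms rather than single numbers.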
The Key Assumption: Parallel Trends
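Definition: In the absence of treatment, the treatment and control groups would have followed the same trend over time.
Important: Differences in initial levels between the groups are acceptable. What matters is that, absent treatment, both groups would have moved in parallel. This cannot be tested directly because the untreated post-treatment path of the treated group is never observed, but comparing pre-treatment trends helps judge whether the assumption is plausible.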
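Under parallel trends, the control group's change stands in for what the treated group would have done without treatment. The sketch below applies that reading to the ISO example numbers; the function name `counterfactual_treated_after` is hypothetical.

```python
def counterfactual_treated_after(treated_before, control_before, control_after):
    """Project the treated group's untreated outcome, assuming parallel trends."""
    common_trend = control_after - control_before  # change observed in the control group
    return treated_before + common_trend


# ISO example: treated firms start at 600, controls move from 300 to 400.
counterfactual = counterfactual_treated_after(600, 300, 400)
effect = 800 - counterfactual  # observed treated outcome minus projected outcome
print(counterfactual, effect)  # 700 100
```

If the assumption fails, the projected 700 is wrong and so is the estimated effect, which is why the assumption carries the whole method.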
Method 2: Propensity Score Matching (PSM)
Purpose of Matching
Context Example: A loyalty program's impact on customer spending.
- Selection Bias Issue: Customers who choose to join may already spend more or be more loyal, which biases the comparison.
- Naïve Comparison Outcome: Directly comparing members with non-members yields an inflated estimate (e.g., +$100) because of selection bias.
- Post-Matching Outcome: Similar customers yield a more credible estimate (e.g., +$35).
The Propensity Score
Definition: The propensity score is the probability of receiving treatment given observable characteristics (e.g., income, past spending).
Range: From 0 (unlikely to join) to 1 (very likely to join).
Function: Condenses multiple observable variables into a single score for matching.
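In symbols, the definition can be written as follows. The notation is assumed for illustration: D is the treatment indicator (1 = joined), X is the vector of observable characteristics, and the logistic form is one common way to estimate the score, not necessarily the one used for the numbers in these notes.

```latex
e(X) = \Pr(D = 1 \mid X), \qquad 0 \le e(X) \le 1

% One common estimator (an assumption here): logistic regression of D on X
\hat{e}(X) = \frac{1}{1 + \exp\!\left(-(\hat{\beta}_0 + \hat{\beta}^{\top} X)\right)}
```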
Concrete Example:
- Customer A (joined): Income = $5,000, Past Spend = $250, Score = 0.85.
- Customer B (joined): Income = $2,000, Past Spend = $100, Score = 0.30.
Finding Suitable Matches
Approach: For each treated individual, locate a similar untreated individual based on propensity scores.
- Example Matches:
- Customer A (score 0.85) matched with Customer C (not joined): Income = $3,000, Score = 0.80.
- Customer B (score 0.30) matched with Customer D (not joined): Income = $1,800, Score = 0.25.
Importance: Matching ensures fair comparisons between similar treatment profiles.
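The matching step can be sketched as one-to-one nearest-neighbor matching on the propensity score. This is a minimal illustration using the four scores from the example; the function name `match_nearest` is hypothetical, and matching with replacement is an assumption, since the notes do not specify a matching rule.

```python
def match_nearest(treated, controls):
    """Map each treated unit to the control with the closest propensity score.

    Matching is done with replacement: one control may serve several treated units.
    """
    matches = {}
    for t_name, t_score in treated.items():
        # Pick the control whose score has the smallest absolute distance.
        matches[t_name] = min(controls, key=lambda c: abs(controls[c] - t_score))
    return matches


# Propensity scores from the example.
treated = {"A": 0.85, "B": 0.30}
controls = {"C": 0.80, "D": 0.25}

matches = match_nearest(treated, controls)
print(matches)  # {'A': 'C', 'B': 'D'}
```

Real applications usually add a caliper, a maximum allowed score distance, so that treated units with no close control are dropped instead of being poorly matched.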
Matching in Action
Example Outcomes:
- Match 1: A ↔ C: Spending after program = $320 vs $280; Treatment Effect = +$40.
- Match 2: B ↔ D: Spending after program = $150 vs $120; Treatment Effect = +$30.
Average Treatment Effect Calculation:
($40 + $30) / 2 = +$35.
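The averaging step over matched pairs can be sketched in Python as below. The tuple layout and variable names are hypothetical; the spending figures are the ones listed for the two matches.

```python
# Each tuple: (treated customer, matched control, treated spending, control spending).
pairs = [
    ("A", "C", 320, 280),
    ("B", "D", 150, 120),
]

# Within-pair difference in spending after the program.
pair_effects = [treated_spend - control_spend
                for _, _, treated_spend, control_spend in pairs]

# Average over matched pairs.
average_effect = sum(pair_effects) / len(pair_effects)
print(pair_effects, average_effect)  # [40, 30] 35.0
```

Because every pair is built around a treated customer, this average is best read as the effect for customers who joined.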
PSM: Key Takeaways
Propensity Score: The probability of receiving treatment given observable characteristics.
Match Similar Individuals: Pair treated individuals with untreated ones based on similar propensity scores for credible comparisons.
Outcome Analysis: Differences between matched pair outcomes provide an estimate of the treatment effect.
Limitations:
- PSM balances only observable characteristics. Bias from unobservable factors remains.
DiD vs PSM: When to Use Which?
Comparison Table
| Criteria | Difference-in-Differences (DiD) | Propensity Score Matching (PSM) |
|---|---|---|
| Data Needed | Pre & post data for both groups | Cross-sectional or panel data |
| Key Assumption | Parallel trends | Selection on observables |
| Handles | Time-invariant confounders | Observable confounders |
| Cannot Handle | Time-varying confounders | Unobservable confounders |
| Best When | Policy change occurs at a known time | Treatment is self-selected based on observables |
Final Notes
Neither DiD nor PSM is a magical solution. It’s crucial to transparently disclose the fundamental assumptions underlying analyses, and acknowledge potential unobservables that influence results.
Key Takeaways
RCTs: Still the gold standard but may be impractical in various scenarios; observational methods are crucial alternatives.
DiD: Removes time-invariant group differences and common time trends by differencing, relying on the parallel trends assumption to estimate effects.
PSM: Uses matching to make fair comparisons, condensing covariates into a single propensity score.
Underlying Assumptions: Both methods require strong assumptions whose violation undermines causal claims.
Transparency: Essential to articulate assumptions and identify potential unseen influences on causal analyses.