Causal Inference with Observational Data

Part 3: Difference-in-Differences & Propensity Score Matching

Overview
  • Focus on causal inference methods: Difference-in-Differences (DiD) and Propensity Score Matching (PSM).

  • RCTs (Randomized Controlled Trials) are considered the gold standard for establishing causal relationships.

The Challenge of Causal Inference

  • RCT Limitations:
      - Often too expensive or logistically complex.
      - Ethical concerns about withholding treatment from control groups.
      - Treatment may have already occurred, leaving only retrospective analysis.
      - The analyst may have no control over how treatment is assigned to subjects.

  • Observational Data:
      - Utilizes data from normal business operations as an alternative to RCTs.
      - Involves untestable assumptions that can complicate analysis.
      - The results depend heavily on the plausibility of these assumptions.
      - Must maintain transparency about underlying assumptions to ensure credibility.

Useful Techniques for Observational Causal Inference

  1. Difference-in-Differences (DiD): Compare changes over time between treated and control groups.

  2. Propensity Score Matching (PSM): Match treated units with similar untreated units.

  3. Instrumental Variables (IV): Use an external variable to isolate causal variation.

  4. Regression Discontinuity (RDD): Exploit a cutoff rule to identify causal effects.

Method 1: Difference-in-Differences (DiD)

Definition
  • DiD compares changes over time across a treatment group and a control group.

How Difference-in-Differences Works
Requirements:
  1. Pre- and post-treatment outcome data.

  2. Presence of both a treatment group and a control group.

  3. Parallel Trends Assumption must hold.

Logic of DiD:
  • The formula to compute DiD:
    \text{DiD} = (\text{Treatment After} - \text{Treatment Before}) - (\text{Control After} - \text{Control Before})

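  • The first difference (after minus before, within each group) eliminates fixed pre-existing disparities between the groups, while the second difference (treatment change minus control change) removes time trends common to both groups. The remaining difference is the estimated treatment effect, provided the parallel trends assumption holds.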

DiD Example: ISO Certification & Output
  • Research Question: Did obtaining ISO certification result in increased output for firms?

  • Data Collection:
      - Before (T=0) and After (T=1) Output Measurements:
        - Treatment Group (ISO certified): Before = 600, After = 800 (Change = +200).
        - Control Group (Not certified): Before = 300, After = 400 (Change = +100).

  • Calculation of Treatment Effect:
    \text{DiD} = (800 - 600) - (400 - 300) = 200 - 100 = 100

  • Conclusion: Treatment effect = +100 units.
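The ISO calculation above can be reproduced in a few lines. This is a minimal sketch using only the four group means from the example; the function names `did_estimate` and `did_from_gaps` are illustrative, not from the notes. The second function computes the same quantity as the change in the gap between the groups, which is algebraically identical.

```python
def did_estimate(treat_before, treat_after, control_before, control_after):
    """Difference-in-differences from four group-period means."""
    treat_change = treat_after - treat_before        # change in treated group
    control_change = control_after - control_before  # change in control group
    return treat_change - control_change


def did_from_gaps(treat_before, treat_after, control_before, control_after):
    """Same estimate, written as the change in the treated-control gap."""
    gap_before = treat_before - control_before
    gap_after = treat_after - control_after
    return gap_after - gap_before


# ISO certification example: treated 600 -> 800, control 300 -> 400
effect = did_estimate(600, 800, 300, 400)
same_effect = did_from_gaps(600, 800, 300, 400)
print(effect, same_effect)  # 100 100
```

Both orderings give +100 units, which is why the order in which the two differences are taken does not matter.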

The Key Assumption: Parallel Trends
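  • Definition: In the absence of treatment, the treatment and control groups would have followed the same trend over time.

  • Important: Differences in initial levels between the groups are acceptable. What matters is that the treated group's outcome would have moved in parallel with the control group's had treatment not occurred. This counterfactual cannot be observed directly, so it is usually assessed by checking whether the groups' trends were parallel before treatment.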
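A common diagnostic for this assumption is to compare the two groups' period-to-period changes before treatment. The sketch below assumes several pre-treatment observations are available; the series are hypothetical numbers invented for illustration (only the final values, 600 and 300, echo the ISO example), and `max_trend_gap` is an illustrative helper name.

```python
# Hypothetical pre-treatment outcomes, oldest period first (illustrative only)
treated_pre = [560, 580, 600]
control_pre = [260, 280, 300]


def period_changes(series):
    """Change from each period to the next."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def max_trend_gap(treated, control):
    """Largest absolute difference between the groups' period-to-period changes."""
    gaps = [abs(t - c) for t, c in zip(period_changes(treated), period_changes(control))]
    return max(gaps)


gap = max_trend_gap(treated_pre, control_pre)
print(gap)  # 0 means the pre-treatment trends were exactly parallel
```

A small gap supports the assumption but does not prove it: parallel pre-treatment trends do not guarantee that the trends would have stayed parallel after the treatment date.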

Method 2: Propensity Score Matching (PSM)

Purpose of Matching
  • Context Example: A loyalty program's impact on customer spending.
      - Selection Bias Issue: Joining customers may inherently spend more or display higher loyalty, leading to biased comparisons.
      - Naïve Comparison Outcome: Directly comparing members with non-members yields an inflated estimate (e.g., +$100) because of selection bias.
      - Post-Matching Outcome: Comparing members only with similar non-members yields a more credible estimate (e.g., +$35).

The Propensity Score
  • Definition: The propensity score is the estimated probability of receiving treatment given observable characteristics (e.g., income, past spending).

  • Range: From 0 (unlikely to join) to 1 (very likely to join).

  • Function: Condenses multiple observable variables into a single score for matching.

  • Concrete Example:
      - Customer A (joined): Income = $5,000, Past Spend = $250, Score = 0.85.
      - Customer B (joined): Income = $2,000, Past Spend = $100, Score = 0.30.
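The notes do not say how the scores are estimated. One common choice is a logistic model, sketched below as an assumption. The coefficients are hypothetical placeholders (in practice they would be fitted to data on who joined), so the resulting scores are not meant to reproduce the 0.85 and 0.30 shown above. The sketch only shows how several covariates collapse into one number between 0 and 1.

```python
import math

# Hypothetical coefficients for illustration only; a real analysis would
# estimate them, e.g. by logistic regression of "joined" on the covariates.
INTERCEPT = -3.0
COEF_INCOME = 0.0005      # per dollar of income
COEF_PAST_SPEND = 0.004   # per dollar of past spend


def propensity_score(income, past_spend):
    """Logistic transform of a linear index of observable characteristics."""
    linear_index = INTERCEPT + COEF_INCOME * income + COEF_PAST_SPEND * past_spend
    return 1.0 / (1.0 + math.exp(-linear_index))


score_a = propensity_score(income=5000, past_spend=250)  # Customer A's covariates
score_b = propensity_score(income=2000, past_spend=100)  # Customer B's covariates
print(round(score_a, 3), round(score_b, 3))
```

With positive coefficients, the customer with higher income and past spending receives the higher score, mirroring the ordering of A and B in the example.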

Finding Suitable Matches
  • Approach: For each treated individual, locate a similar untreated individual based on propensity scores.
      - Example Matches:
        - Customer A (score 0.85) matched with Customer C (not joined): Income = $3,000, Score = 0.80.
        - Customer B (score 0.30) matched with Customer D (not joined): Income = $1,800, Score = 0.25.

  • Importance: Matching ensures that each treated individual is compared with an untreated individual who had a similar probability of receiving treatment.
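The matching step can be sketched as nearest-neighbor matching on the score, using the four scores from the example. The function name and the optional `caliper` argument (a maximum allowed score distance) are assumptions added for illustration; the notes do not specify a matching algorithm. This version matches with replacement, so one untreated customer could serve as the match for several treated customers.

```python
treated_scores = {"A": 0.85, "B": 0.30}   # joined the program
control_scores = {"C": 0.80, "D": 0.25}   # did not join


def nearest_neighbor_match(treated, controls, caliper=None):
    """Pair each treated unit with the control whose score is closest.

    If a caliper is given, treated units with no control within that
    distance are left unmatched.
    """
    matches = {}
    for unit, score in treated.items():
        best = min(controls, key=lambda c: abs(controls[c] - score))
        if caliper is None or abs(controls[best] - score) <= caliper:
            matches[unit] = best
    return matches


matches = nearest_neighbor_match(treated_scores, control_scores)
print(matches)  # {'A': 'C', 'B': 'D'}
```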

Matching in Action
  • Example Outcomes:
      - Match 1: A ↔ C: Spending after program = $320 vs $280; Treatment Effect = +$40.
      - Match 2: B ↔ D: Spending after program = $150 vs $120; Treatment Effect = +$30.

  • Average Treatment Effect Calculation:
    \text{Average Treatment Effect (Matched)} = \frac{40 + 30}{2} = 35
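The same average can be computed directly from the matched pairs. This is a minimal sketch using only the spending figures given above; `matched_effect` is an illustrative name.

```python
# (treated id, control id, treated spending, control spending)
matched_pairs = [
    ("A", "C", 320, 280),
    ("B", "D", 150, 120),
]


def matched_effect(pairs):
    """Mean of within-pair outcome differences (treated minus control)."""
    differences = [treated - control for _, _, treated, control in pairs]
    return sum(differences) / len(differences)


effect = matched_effect(matched_pairs)
print(effect)  # 35.0
```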

PSM: Key Takeaways
  1. Propensity Score: The estimated probability of receiving treatment given observable characteristics.

  2. Match Similar Individuals: Pair treated individuals with untreated ones based on similar propensity scores for credible comparisons.

  3. Outcome Analysis: Differences between matched pair outcomes provide an estimate of the treatment effect.

  4. Limitations:
       - PSM balances only observable characteristics. Bias from unobservable factors remains unaddressed.
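Even for observables, it is worth checking how well balance was achieved after matching. The sketch below is one simple check, assuming the matched pairs A ↔ C and B ↔ D and the income figures from the example. A difference in raw means is used here for brevity; other balance measures could be substituted.

```python
# (treated income, matched control income) for pairs A-C and B-D
matched_income = [(5000, 3000), (2000, 1800)]


def mean(values):
    return sum(values) / len(values)


treated_mean = mean([treated for treated, _ in matched_income])
control_mean = mean([control for _, control in matched_income])
income_gap = treated_mean - control_mean
print(income_gap)  # 1100.0
```

Here the matched groups still differ by $1,100 in average income, even though their propensity scores are close. Similar scores do not guarantee similar values on every individual covariate.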

DiD vs PSM: When to Use Which?

Comparison Table

Criteria        | Difference-in-Differences (DiD)      | Propensity Score Matching (PSM)
----------------|--------------------------------------|------------------------------------------------
Data Needed     | Pre & post data for both groups      | Cross-sectional or panel data
Key Assumption  | Parallel trends                      | Selection on observables
Handles         | Time-invariant confounders           | Observable confounders
Cannot Handle   | Time-varying confounders             | Unobservable confounders
Best When       | Policy change occurs at a known time | Treatment is self-selected based on observables

Final Notes
  • Neither DiD nor PSM is a magical solution. It’s crucial to transparently disclose the fundamental assumptions underlying analyses, and acknowledge potential unobservables that influence results.

Key Takeaways

  1. RCTs: Still the gold standard but may be impractical in various scenarios; observational methods are crucial alternatives.

  2. DiD: Removes time-invariant confounders and common time trends by differencing over time and across groups; its validity rests on the parallel trends assumption.

  3. PSM: Uses matching to ensure fair comparisons, condensing observable covariates into a single propensity score for analysis.

  4. Underlying Assumptions: Both methods rest on strong, untestable assumptions; if these are violated, the causal claims are undermined.

  5. Transparency: Essential to articulate assumptions and identify potential unseen influences on causal analyses.