Notes on Correlation, Causation, and the Third Variable Problem
Correlation vs Causation
- Headlines often use causal wording (e.g., "leads to"), but the findings behind them usually reflect correlation only.
- Correlation means two variables change together; it does not prove one causes the other.
- Predicting one event from another does not establish causality.
- Establishing causality requires ruling out alternative explanations; a correlation on its own cannot do that.
Third Variable Problem (Confounds)
- A third, unmeasured variable may account for the observed correlation.
- Such variables are called confounds; the issue they raise is known as the third-variable problem.
- When a confound exists, the relation between the two observed variables may be spurious.
Example: Ice Cream vs Violent Crimes
- Observed correlation: r = +0.50 (a positive correlation)
- Interpreting as causation (e.g., "Ice Cream Consumption Leads to Violence") is not justified.
- Possible third variable: weather or season (e.g., hot summer days), which may increase both ice cream sales and crime.
- Key point: correlation does not equal causation; causality requires controlled methods or experimental evidence.
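The example above can be sketched as a small simulation. This is an illustration on made-up data, not real crime or sales figures: the temperature range, the slopes, and the noise levels are all assumptions, chosen so that the raw correlation lands near the +0.50 quoted above. Neither outcome appears in the other's formula, so any correlation between them comes from temperature alone.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def residuals(ys, zs):
    """What is left of ys after a least-squares straight-line fit on zs."""
    n = len(zs)
    mz, my = sum(zs) / n, sum(ys) / n
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for z, y in zip(zs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n_days = 5000

# Hypothetical daily temperature (the confound), in degrees Celsius.
temperature = [rng.uniform(0, 35) for _ in range(n_days)]

# Each outcome depends on temperature plus its own independent noise.
ice_cream = [2.0 * t + rng.gauss(0, 20) for t in temperature]
crime = [1.0 * t + rng.gauss(0, 10) for t in temperature]

raw_r = pearson_r(ice_cream, crime)

# Statistical control: correlate what remains of each variable
# once the part explained by temperature has been removed.
partial_r = pearson_r(residuals(ice_cream, temperature),
                      residuals(crime, temperature))

print(f"raw r = {raw_r:+.2f}")
print(f"r after controlling for temperature = {partial_r:+.2f}")
```

Under these assumptions the raw correlation is clearly positive, while the correlation after removing temperature's contribution is close to zero. That is the pattern a spurious, confound-driven relation would be expected to show.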
Key Takeaways
- Correlation does not imply causation.
- Always consider potential confounds/third variables.
- Use experiments, or statistical controls for measured confounds, to support causal claims.
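Why an experiment helps can be sketched with another small simulation. Everything here is hypothetical: the treatment is assumed to have no effect at all (`TRUE_EFFECT = 0.0`), and a hidden confound raises the outcome. In the observational data the confound also influences who ends up treated; in the experiment a coin flip decides, so treatment cannot depend on the confound.

```python
import random


def mean(values):
    return sum(values) / len(values)


def group_difference(treated_flags, outcomes):
    """Mean outcome of the treated group minus that of the untreated group."""
    treated = [y for flag, y in zip(treated_flags, outcomes) if flag]
    untreated = [y for flag, y in zip(treated_flags, outcomes) if not flag]
    return mean(treated) - mean(untreated)


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 10000
TRUE_EFFECT = 0.0  # assumption: the treatment does nothing to the outcome

# Hidden third variable that raises the outcome.
confound = [rng.gauss(0, 1) for _ in range(n)]


def outcome(treated, z):
    """Outcome driven by the confound and noise, plus the (zero) treatment effect."""
    return TRUE_EFFECT * treated + 2.0 * z + rng.gauss(0, 1)


# Observational data: a high confound value makes treatment more likely.
observed_treatment = [z + rng.gauss(0, 1) > 0 for z in confound]
observed_outcome = [outcome(t, z) for t, z in zip(observed_treatment, confound)]

# Experiment: random assignment, independent of the confound.
random_treatment = [rng.random() < 0.5 for _ in range(n)]
random_outcome = [outcome(t, z) for t, z in zip(random_treatment, confound)]

obs_diff = group_difference(observed_treatment, observed_outcome)
exp_diff = group_difference(random_treatment, random_outcome)

print(f"observational difference in means: {obs_diff:+.2f}")
print(f"randomized difference in means:    {exp_diff:+.2f}")
```

With these assumed settings the observational comparison shows a sizeable gap between groups even though the treatment does nothing, while the randomized comparison stays near zero. Random assignment balances the confound across groups on average, which is why experimental evidence supports causal claims in a way a raw correlation does not.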