Causal Inference: Bias, Confounding, and Interaction

Selection Bias: Occurs when the way subjects are selected produces an apparent association between exposure and disease, even if no real association exists.
Can arise from:
- Nonresponse of potential study participants (if response rates differ based on exposure and disease status).
- Participant losses during follow-up in cohort studies (compare characteristics of those lost vs. not lost).
The choice of study population affects generalizability (external validity). Selection bias, by contrast, affects internal validity and leads to incorrect estimates of odds ratios (ORs) or relative risks (RRs).
Exclusion Bias: Applying different eligibility criteria to cases and controls.
Compensating Bias: Occurs when selection biases in cases and controls are of the same magnitude, leading to an unbiased OR.
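To make the effect of selection on the OR concrete, the sketch below applies selection probabilities to a 2x2 table. The counts and probabilities are hypothetical, chosen only for illustration, and the helper names `odds_ratio` and `apply_selection` are assumptions, not part of these notes.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    return (a * d) / (b * c)


def apply_selection(table, probs):
    """Scale each cell by the probability that such a subject enters the study."""
    return tuple(n * p for n, p in zip(table, probs))


# Hypothetical source population (a, b, c, d).
source = (200, 100, 100, 200)
true_or = odds_ratio(*source)

# Selection depends on exposure among cases only: exposed cases are
# over-represented, so the observed OR is distorted.
biased = apply_selection(source, (0.9, 0.6, 0.5, 0.5))
biased_or = odds_ratio(*biased)

# Compensating bias: the same exposure-related distortion operates in
# cases and controls, so it cancels in the cross-product ratio.
compensated = apply_selection(source, (0.9, 0.6, 0.9, 0.6))
compensated_or = odds_ratio(*compensated)

print(f"true OR        = {true_or:.2f}")
print(f"biased OR      = {biased_or:.2f}")
print(f"compensated OR = {compensated_or:.2f}")
```

In this toy table the OR moves away from its true value only when the exposure-related selection differs between cases and controls.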

Information Bias: Arises from flawed data collection, resulting in misclassification of exposure or disease status.
Types:
- Differential Misclassification: Misclassification rate differs between study groups, leading to spurious associations or masking real ones.
- Nondifferential Misclassification: Inaccuracy is equal across study groups, diluting the RR or OR towards 1.0.
- Surveillance Bias: Better disease ascertainment in a monitored population.
- Recall Bias: Cases recall exposures differently than controls.
- Reporting Bias: Subjects selectively report exposures.
- Wish Bias: Subjects unintentionally distort past exposures to align with their beliefs about disease causation.
Surrogate interviews (for example, with a spouse or relative answering on behalf of a case) may introduce inaccuracies in exposure data.
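The difference between nondifferential and differential misclassification can be sketched with expected counts. The example below assumes exposure is measured with a given sensitivity and specificity. All numbers are hypothetical, and the function names are illustrative only.

```python
def odds_ratio(a, b, c, d):
    """OR with a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    return (a * d) / (b * c)


def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected (observed exposed, observed unexposed) counts after
    imperfect exposure measurement."""
    obs_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    obs_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return obs_exposed, obs_unexposed


# Hypothetical true counts.
cases = (200, 100)      # exposed, unexposed
controls = (100, 200)   # exposed, unexposed
true_or = odds_ratio(*cases, *controls)

# Nondifferential: the same error rates apply to cases and controls.
nd_cases = misclassify(*cases, sensitivity=0.8, specificity=0.9)
nd_controls = misclassify(*controls, sensitivity=0.8, specificity=0.9)
nondifferential_or = odds_ratio(*nd_cases, *nd_controls)

# Differential (e.g. recall bias): cases report exposure more completely
# than controls do.
df_cases = misclassify(*cases, sensitivity=0.95, specificity=0.9)
df_controls = misclassify(*controls, sensitivity=0.7, specificity=0.9)
differential_or = odds_ratio(*df_cases, *df_controls)

print(f"true OR            = {true_or:.2f}")
print(f"nondifferential OR = {nondifferential_or:.2f}")
print(f"differential OR    = {differential_or:.2f}")
```

With these inputs the nondifferential OR falls between 1.0 and the true OR, while the differential OR exceeds the true OR. Differential error can push the estimate in either direction, depending on which group is measured less accurately.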

Confounding: A third factor (X) distorts the observed relationship between exposure (A) and disease (B). X is a confounder when both of the following hold:
- Factor X is a known risk factor for disease B.
- Factor X is associated with exposure A, but is not a result of exposure A.
Confounding can be addressed in the study design (matching) or in the data analysis (stratification, adjustment).
Stratification: Evaluating the association within subgroups (strata) of the confounding variable.
- Calculate the measure of association within each stratum and compare it with the crude (unstratified) estimate; a clear difference between the two suggests confounding.
Confounding is a real phenomenon, not an error; failure to address it leads to biased study conclusions.
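Stratification can be sketched on a small hypothetical data set in which factor X is associated with both exposure and disease. The counts are invented for illustration. The pooled estimate uses the Mantel-Haenszel formula, which is one common adjustment method and is an assumption here rather than something these notes specify.

```python
def odds_ratio(a, b, c, d):
    """OR with a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Pooled OR across strata: sum(a*d/n) / sum(b*c/n)."""
    tables = list(tables)
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Hypothetical counts (a, b, c, d) within each level of confounder X.
strata = {
    "X present": (80, 20, 40, 10),
    "X absent": (10, 40, 20, 80),
}

# Crude table: add the strata together cell by cell.
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)

stratum_ors = {name: odds_ratio(*table) for name, table in strata.items()}
adjusted_or = mantel_haenszel_or(strata.values())

print(f"crude OR = {crude_or:.2f}")
for name, value in stratum_ors.items():
    print(f"OR within '{name}' = {value:.2f}")
print(f"Mantel-Haenszel OR = {adjusted_or:.2f}")
```

In this constructed example the crude OR suggests an association, but the OR within each stratum of X is 1.0, so the crude association is explained entirely by X.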

Interaction (effect modification): When the incidence rate of disease in the presence of two or more risk factors differs from the rate expected from their individual effects.
Positive interaction (synergism): Combined effect is greater than expected.
Negative interaction (antagonism): Combined effect is less than expected.
Models:
- Additive: Combined effect is the sum of individual effects.
- Multiplicative: Combined effect is the product of individual effects.
Assess interaction by comparing observed incidence with that predicted by additive or multiplicative models.
The choice between additive and multiplicative models should ideally be based on biologic knowledge.
Synergistic relationships may have practical policy implications.
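Comparing observed and expected incidence can be sketched as follows. The incidence values (per 1,000) are hypothetical, and the function names are illustrative assumptions.

```python
def expected_additive(i0, ia, ib):
    """Expected joint incidence if excess risks add.

    i0 = incidence with neither factor, ia = factor A only, ib = factor B only.
    """
    return i0 + (ia - i0) + (ib - i0)


def expected_multiplicative(i0, ia, ib):
    """Expected joint incidence if relative risks multiply."""
    return i0 * (ia / i0) * (ib / i0)


def direction(observed, expected, tol=1e-9):
    """Label the departure from a model's expectation."""
    if observed > expected + tol:
        return "positive interaction (synergism)"
    if observed < expected - tol:
        return "negative interaction (antagonism)"
    return "no interaction"


# Hypothetical incidence per 1,000.
i0, ia, ib = 3.0, 9.0, 15.0
observed_both = 30.0

add_exp = expected_additive(i0, ia, ib)
mult_exp = expected_multiplicative(i0, ia, ib)

print(f"additive expectation       = {add_exp:.1f}: "
      f"{direction(observed_both, add_exp)}")
print(f"multiplicative expectation = {mult_exp:.1f}: "
      f"{direction(observed_both, mult_exp)}")
```

The same observed incidence can exceed the additive expectation while falling short of the multiplicative one, which is why the choice of model matters when describing an interaction.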

Odds Ratio (OR): $OR = \frac{\text{Odds of exposure in cases}}{\text{Odds of exposure in controls}}$
Compensating Bias: When bias in selecting cases and controls is of the same magnitude.
Additive Model Calculation: Expected $RR = RR_A + RR_B - 1$.
Multiplicative Model Calculation: Expected $RR = RR_A \times RR_B$.
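The subtraction of 1 in the additive calculation comes from adding excess risks rather than relative risks. A short derivation follows, writing $I_0$ for the incidence with neither factor and $I_A$, $I_B$, $I_{AB}$ for the incidence with A only, B only, and both. This notation is assumed here for illustration.

$$
\begin{aligned}
I_{AB} - I_0 &= (I_A - I_0) + (I_B - I_0) \\
\frac{I_{AB}}{I_0} - 1 &= \left(\frac{I_A}{I_0} - 1\right) + \left(\frac{I_B}{I_0} - 1\right) \\
RR_{AB} &= RR_A + RR_B - 1
\end{aligned}
$$

For example, with hypothetical values $RR_A = 3$ and $RR_B = 5$, the additive model predicts a joint RR of $3 + 5 - 1 = 7$, while the multiplicative model predicts $3 \times 5 = 15$.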