L2 - Measurement Error, Bias & Confounding
Measuring Exposures
- First step in any public-health study: decide how an exposure will be operationalised and captured.
- Questionnaire
- Self-administered vs. interviewer-administered.
- Consider differences in response patterns, data quality, feasibility, scalability.
- Biological sampling
- Blood, urine, hair.
- Radiology / imaging.
- Physical examination (e.g. height, weight).
- Abstraction of previous medical records.
- Environmental sampling (air, water, soil, noise, etc.).
- Biomarkers & genetic markers
- Example: School of Public Health pharmacogenomic screening group (Prof. Paul McKay).
- Cheek-swab panel → tests multiple genetic variants to predict medication response.
- Core research question: “Is there an association? Is there a possible cause-and-effect?”
- To answer, we must judge whether findings are due to bias, confounding, chance, or a true causal relationship.
Measurement Error & Imprecision
- Virtually all measurements include error.
- Mathematical representation:
- \text{Measured score}=\text{True score}\pm \text{Error}
- Error contributes to \sigma (standard deviation) of the measure.
- Two overarching categories of error:
- Bias (systematic error) – consistent deviation in one direction.
- Random error / imprecision – non-systematic fluctuation around the true value (a simulation sketch follows this list).
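A minimal simulation of this distinction (the 70 kg true value, 0.5 kg bias, and 1.2 kg SD below are illustrative assumptions, not figures from the lecture): systematic bias shifts the mean of repeated measurements, while random error only widens their spread.

```python
import numpy as np

rng = np.random.default_rng(42)

true_value = 70.0   # hypothetical true weight in kg (assumed)
bias = 0.5          # systematic error: the scale reads 0.5 kg high (assumed)
random_sd = 1.2     # imprecision (SD) of a single reading (assumed)

# Measured score = True score + systematic bias + random error
measurements = true_value + bias + rng.normal(0.0, random_sd, size=10_000)

print(f"Mean of measurements: {measurements.mean():.2f} (true value {true_value})")
print(f"SD of measurements:   {measurements.std(ddof=1):.2f}")
# The mean is shifted by ~0.5 kg: the bias persists no matter how often the
# measurement is repeated. The spread reflects random error, whose effect on
# the mean shrinks as more readings are averaged.
```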
Bias: Definition & Importance
- Bias = systematic error that leads to incorrect estimate of the exposure–outcome relationship.
- Causes the difference between study result and true population value.
- Repeating the study under identical biased conditions will reproduce the wrong answer.
- Risk-of-bias assessment / critical appraisal = formal process of identifying & minimising these errors.
- Eliminating bias protects against:
- Inappropriate influence of funders.
- Co-interventions & contamination of results.
- Selective reporting of subgroups.
- Baseline imbalances in important factors.
Taxonomy of Bias Covered in the Course
- Selection bias
- Sub-types: ascertainment bias, detection bias.
- Occurs when selected participants are not representative of the target population.
- Example: Choosing a new-treatment group that is inherently healthier → overestimates treatment benefit.
- Recall bias
- Archetypal in case–control studies.
- Participants with an event (e.g. stroke) may recall past exposures more (or less) accurately than controls.
- Not intentional; arises from differential memory (see the sketch after this list).
- Measurement (information) bias
- Systematic errors in how data are collected or instruments are calibrated.
- Publication bias
- Studies with significant or positive findings are more likely to be published.
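A minimal sketch of how recall bias can manufacture an association in a case–control study (all prevalences and recall probabilities below are assumptions chosen for illustration): exposure is genuinely unrelated to case status, but cases report it more completely than controls.

```python
import numpy as np

rng = np.random.default_rng(3)
n_cases, n_controls = 5_000, 5_000

# True exposure prevalence assumed identical (40%) in cases and controls,
# i.e. genuinely no exposure-outcome association.
true_exp_cases = rng.random(n_cases) < 0.40
true_exp_controls = rng.random(n_controls) < 0.40

# Assumed differential recall: exposed cases report the exposure 95% of the
# time, exposed controls only 70% of the time; no one falsely reports exposure.
reported_cases = true_exp_cases & (rng.random(n_cases) < 0.95)
reported_controls = true_exp_controls & (rng.random(n_controls) < 0.70)

def odds(p):
    return p / (1 - p)

true_or = odds(true_exp_cases.mean()) / odds(true_exp_controls.mean())
observed_or = odds(reported_cases.mean()) / odds(reported_controls.mean())
print(f"True OR:     {true_or:.2f}")      # ~1.0 (no real association)
print(f"Observed OR: {observed_or:.2f}")  # >1: differential recall alone creates it
```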
Confounding
- Literal meaning: “mixed together.”
- Formal definition: A non-causal association between exposure (E) and outcome (O) produced by a third variable (C) that is related to both E and O.
- C must not be an intermediate step on the causal pathway.
Coffee–Heart-Disease Example
- Observation: Coffee drinkers appear to have higher heart-disease rates than non-drinkers.
- Hypothesis: “Coffee causes heart disease.”
- Investigation reveals third variable smoking:
- Smokers drink more coffee, and smoking itself causes heart disease.
- Association E\rightarrow O was distorted: estimate contained contribution from both coffee and smoking.
- After restricting to non-smokers or adjusting for smoking → no coffee–heart-disease association, consistent with the wider literature (a simulation sketch follows this example).
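A minimal simulation of the coffee–smoking–heart-disease scenario (the prevalences and risks are illustrative assumptions, not figures from the lecture): heart disease depends only on smoking, yet a crude comparison suggests coffee is harmful until the analysis is split by smoking status.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Confounder C: smoking (assumed 30% prevalence)
smoker = rng.random(n) < 0.30

# Exposure E: coffee drinking, assumed more common among smokers
coffee = rng.random(n) < np.where(smoker, 0.80, 0.40)

# Outcome O: heart disease depends on smoking only, NOT on coffee (assumed)
heart_disease = rng.random(n) < np.where(smoker, 0.12, 0.04)

def risk_ratio(exposed, outcome):
    """Risk of the outcome in exposed vs unexposed participants."""
    return outcome[exposed].mean() / outcome[~exposed].mean()

print(f"Crude RR (coffee vs no coffee): {risk_ratio(coffee, heart_disease):.2f}")
print(f"RR among smokers only:          {risk_ratio(coffee[smoker], heart_disease[smoker]):.2f}")
print(f"RR among non-smokers only:      {risk_ratio(coffee[~smoker], heart_disease[~smoker]):.2f}")
# The crude RR is ~1.5, but within each smoking stratum it is ~1.0:
# the apparent coffee effect was entirely confounded by smoking.
```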
Strategies to Control Confounding
Design Stage (pre-data-collection)
- Randomisation
- Tends to distribute both known and unknown confounders equally across groups (a small sketch follows this subsection).
- Restriction
- Exclude participants with the confounder (e.g. analyse only non-smokers).
- Matching
- Select controls whose values of the confounders match those of the cases.
- Downsides: expensive, time-consuming, and matched variables can no longer be analysed as exposures.
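A minimal sketch of why randomisation works, assuming a hypothetical 30% smoking prevalence: a coin-flip assignment tends to balance the confounder across arms even if it was never measured.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# A confounder (smoking, assumed 30% prevalence) that may not even be measured
smoker = rng.random(n) < 0.30

# Randomisation: each participant assigned to treatment or control by coin flip
treatment_arm = rng.random(n) < 0.5

print(f"Smoking prevalence, treatment arm: {smoker[treatment_arm].mean():.3f}")
print(f"Smoking prevalence, control arm:   {smoker[~treatment_arm].mean():.3f}")
# With a large sample the two prevalences are nearly identical, so the
# confounder (known or unknown) cannot distort the comparison in expectation.
```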
Analysis Stage (post-data-collection)
- Stratification
- Separate analysis within levels of the confounder (e.g. smokers vs. non-smokers).
- Multivariable statistical modelling
- Include multiple covariates in regression to adjust simultaneously.
- Allows exploration of numerous potential risk factors while preserving the “full picture.”
- Example: A study of late antenatal care among Aboriginal infants examined up to nine risk factors; modelling allowed stepwise identification of the dominant contributors without discarding information (a regression sketch follows this list).
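A minimal analysis-stage sketch using statsmodels logistic regression, applied to data re-simulated from the earlier coffee/smoking example (this is an illustration of multivariable adjustment in general, not the antenatal-care model from the lecture):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 50_000
smoker = rng.random(n) < 0.30
coffee = rng.random(n) < np.where(smoker, 0.80, 0.40)
heart_disease = rng.random(n) < np.where(smoker, 0.12, 0.04)

df = pd.DataFrame({"coffee": coffee.astype(int),
                   "smoker": smoker.astype(int),
                   "hd": heart_disease.astype(int)})

# Crude model: coffee appears "associated" with heart disease
crude = smf.logit("hd ~ coffee", data=df).fit(disp=False)
# Adjusted model: including the confounder removes the spurious association
adjusted = smf.logit("hd ~ coffee + smoker", data=df).fit(disp=False)

print("Crude OR for coffee:   ", round(np.exp(crude.params["coffee"]), 2))
print("Adjusted OR for coffee:", round(np.exp(adjusted.params["coffee"]), 2))
# The crude odds ratio is well above 1; after adjusting for smoking it is ~1.0.
```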
Practical & Ethical Implications
- Good study design prevents erroneous public-health recommendations.
- Failure to recognise bias or confounding can:
- Waste resources.
- Lead to ineffective or harmful interventions.
- Undermine trust in research.
- Continuous cycle: Design ↔ Measurement ↔ Analysis ↔ Interpretation must address error at every step.
Key Take-Home Equations & Concepts
- Measurement model: \text{Observed}=\text{True}\pm \varepsilon
- Bias ≠ random error; elimination of one does not automatically remove the other.
- Confounder criteria:
- Associated with the exposure.
- Independent risk factor for the outcome.
- Not an intermediate variable in the causal pathway.
- Control options summary:
- \text{Design stage}\rightarrow \{\text{randomisation, restriction, matching}\}
- \text{Analysis stage}\rightarrow \{\text{stratification, multivariable models}\}
Connections to Previous & Future Lectures
- Builds on foundational principles of epidemiologic study design (cohort, case–control, RCT).
- Will dovetail with upcoming sessions on statistical inference, validity vs. reliability, and causal diagrams (DAGs).
- Lays ethical groundwork for transparent reporting (CONSORT, STROBE) and systematic-review methodology.