L2 - Measurement Error, Bias & Confounding

Measuring Exposures

  • First step in any public-health study: decide how an exposure will be operationalised and captured.
    • Questionnaire
      • Self-administered vs. interviewer-administered.
      • Consider differences in response patterns, data quality, feasibility, scalability.
    • Biological sampling
      • Blood, urine, hair.
    • Radiology / imaging.
    • Physical examination (e.g. height, weight).
    • Abstraction of previous medical records.
    • Environmental sampling (air, water, soil, noise, etc.).
    • Biomarkers & genetic markers
      • Example: School of Public Health pharmacogenomic screening group (Prof. Paul McKay).
      • Cheek-swab panel → tests multiple genetic variants to predict medication response.
  • Core research question: “Is there an association? Is there a possible cause-and-effect?”
    • To answer, we must judge whether findings are due to ▸ bias ▸ confounding ▸ chance ▸ a true causal relationship.

Measurement Error & Imprecision

  • Virtually all measurements include error.
  • Mathematical representation:
    • \text{Measured score}=\text{True score}\pm \text{Error}
    • Random error inflates \sigma (standard deviation) of the measure; systematic error shifts its mean away from the true value.
  • Two overarching categories of error:
    1. Bias (systematic error) – consistent deviation in one direction.
    2. Random error / imprecision – non-systematic fluctuation around the true value.
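
The two categories of error can be illustrated with a small simulation. This is a minimal sketch, not part of the lecture material: the true weight, the two scales, their bias and noise values, and the function name simulate_readings are all hypothetical, chosen only to make the contrast visible.

```python
import random
import statistics


def simulate_readings(true_value, bias, noise_sd, n, seed):
    """Return n readings following: measured = true + bias + random noise."""
    rng = random.Random(seed)
    return [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]


TRUE_WEIGHT_KG = 70.0  # hypothetical true value

# Scale A (hypothetical): correctly calibrated but imprecise, so random error only.
scale_a = simulate_readings(TRUE_WEIGHT_KG, bias=0.0, noise_sd=2.0, n=10_000, seed=1)
# Scale B (hypothetical): precise but reads 1.5 kg high, so mostly systematic error.
scale_b = simulate_readings(TRUE_WEIGHT_KG, bias=1.5, noise_sd=0.2, n=10_000, seed=2)

mean_a, sd_a = statistics.fmean(scale_a), statistics.stdev(scale_a)
mean_b, sd_b = statistics.fmean(scale_b), statistics.stdev(scale_b)

print(f"Scale A: mean={mean_a:.2f} kg, SD={sd_a:.2f} kg")
print(f"Scale B: mean={mean_b:.2f} kg, SD={sd_b:.2f} kg")
```

Averaging many readings pulls Scale A towards the true value, whereas Scale B stays about 1.5 kg too high no matter how many readings are taken: more data reduces imprecision but not bias.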

Bias: Definition & Importance

  • Bias = systematic error that leads to incorrect estimate of the exposure–outcome relationship.
    • Causes the difference between study result and true population value.
    • Repeating the study under identical biased conditions will reproduce the wrong answer.
  • Risk-of-bias assessment / critical appraisal = formal process of identifying these errors and judging how far they could distort the result.
  • Minimising bias protects against:
    • Inappropriate influence of funders.
    • Co-interventions & contamination of results.
    • Selective reporting of subgroups.
    • Baseline imbalances in important factors.

Taxonomy of Bias Covered in the Course

  • Selection bias
    • Sub-types: ascertainment bias, detection bias.
    • Occurs when selected participants are not representative of the target population.
    • Example: Choosing a new-treatment group that is inherently healthier → overestimates treatment benefit.
  • Recall bias
    • Archetypal in case–control studies.
    • Participants with an event (e.g. stroke) may recall past exposures more (or less) accurately than controls.
    • Not intentional; arises from differential memory.
  • Measurement (information) bias
    • Systematic errors in how data are collected or instruments are calibrated.
  • Publication bias
    • Studies with significant or positive findings are more likely to be published.
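
Recall bias from the list above can be made concrete with a worked 2x2 table. The sketch below assumes a hypothetical case–control study in which exposure is truly unrelated to the outcome, and in which cases recall a real past exposure more completely than controls. All counts and recall proportions are invented for illustration, and nobody is assumed to report an exposure they never had.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Exposure odds ratio for a case-control 2x2 table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical study: 200 cases and 200 controls, 80 truly exposed in each group.
N_CASES = N_CONTROLS = 200
TRULY_EXPOSED = 80

# Assumed recall: proportion of truly exposed people who report the exposure.
CASE_RECALL = 0.9
CONTROL_RECALL = 0.6

# With perfect recall the groups are identical, so the odds ratio is 1.
true_or = odds_ratio(TRULY_EXPOSED, N_CASES - TRULY_EXPOSED,
                     TRULY_EXPOSED, N_CONTROLS - TRULY_EXPOSED)

# Differential recall: 72 cases but only 48 controls report the exposure.
reported_cases = TRULY_EXPOSED * CASE_RECALL
reported_controls = TRULY_EXPOSED * CONTROL_RECALL
observed_or = odds_ratio(reported_cases, N_CASES - reported_cases,
                         reported_controls, N_CONTROLS - reported_controls)

print(f"True OR:     {true_or:.2f}")
print(f"Observed OR: {observed_or:.2f}")
```

Under these assumptions a null association appears as an odds ratio of about 1.78, purely because memory differs between cases and controls.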

Confounding

  • Literal meaning: “mixed together.”
  • Formal definition: A non-causal association between exposure (E) and outcome (O) produced by a third variable (C) that is related to both E and O.
    • C must not be an intermediate step on the causal pathway.

Coffee–Heart-Disease Example

  • Observation: Coffee drinkers appear to have higher heart-disease rates than non-drinkers.
  • Hypothesis: “Coffee causes heart disease.”
  • Investigation reveals a third variable, smoking:
    • Smokers drink more coffee and smoking causes heart disease.
    • Association E\rightarrow O was distorted: the crude estimate mixed the effect of coffee with the effect of smoking.
  • After restricting to non-smokers or adjusting for smoking → no coffee–heart-disease association, aligning with wider literature.
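
The coffee example can be reproduced with invented numbers. In the sketch below the counts are hypothetical and chosen so that coffee has no effect within either smoking stratum, smokers drink more coffee, and smokers have a higher risk of heart disease. The function names are illustrative only.

```python
def risk(cases, total):
    """Proportion of a group that develops the outcome."""
    return cases / total


def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed divided by risk in the unexposed."""
    return risk(exposed_cases, exposed_total) / risk(unexposed_cases, unexposed_total)


# Hypothetical cohort.
# stratum -> (coffee cases, coffee total, no-coffee cases, no-coffee total)
strata = {
    "smokers": (160, 800, 40, 200),      # 20% risk with or without coffee
    "non-smokers": (10, 200, 40, 800),   # 5% risk with or without coffee
}

# Within each smoking stratum, coffee makes no difference.
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

# Ignoring smoking: add the strata together column by column.
crude_counts = tuple(sum(column) for column in zip(*strata.values()))
crude_rr = risk_ratio(*crude_counts)

print(f"Crude risk ratio (smoking ignored): {crude_rr:.3f}")
for name, rr in stratum_rr.items():
    print(f"Risk ratio among {name}: {rr:.3f}")
```

With these counts the crude risk ratio is 2.125 while both stratum-specific risk ratios are 1.0, so the whole crude association comes from the uneven distribution of smokers between coffee drinkers and non-drinkers.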

Strategies to Control Confounding

Design Stage (pre-data-collection)

  • Randomisation
    • Balances both known & unknown confounders across groups on average; chance imbalances remain possible, especially in small trials.
  • Restriction
    • Exclude participants with the confounder (e.g. analyse only non-smokers).
  • Matching
    • Select controls with the same values of the confounders as the cases.
    • Downsides: expensive, time-consuming, and matched variables can no longer be analysed as exposures.
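
Why randomisation handles confounders that were never measured can be seen in a short simulation. This is a sketch under assumed values: the cohort size, the 30% smoking prevalence, the 10% prevalence of an "unmeasured_risk" trait, and the helper names are all hypothetical.

```python
import random


def randomise(participants, seed):
    """Shuffle participants and split them into two equal-sized arms."""
    rng = random.Random(seed)
    shuffled = participants[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def proportion(group, key):
    """Share of a group for whom the boolean trait `key` is True."""
    return sum(person[key] for person in group) / len(group)


# Hypothetical cohort: smoking is a known confounder; "unmeasured_risk"
# stands in for a confounder the investigators never recorded.
cohort_rng = random.Random(42)
cohort = [
    {"smoker": cohort_rng.random() < 0.30,
     "unmeasured_risk": cohort_rng.random() < 0.10}
    for _ in range(10_000)
]

treatment, control = randomise(cohort, seed=7)

for trait in ("smoker", "unmeasured_risk"):
    print(f"{trait}: treatment={proportion(treatment, trait):.3f}, "
          f"control={proportion(control, trait):.3f}")
```

The allocation step never looks at either trait, yet both end up in similar proportions in the two arms. Restriction and matching, by contrast, can only act on confounders that have been identified and measured.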

Analysis Stage (post-data-collection)

  • Stratification
    • Separate analysis within levels of the confounder (e.g. smokers vs. non-smokers).
  • Multivariable statistical modelling
    • Include multiple covariates in regression to adjust simultaneously.
    • Allows exploration of numerous potential risk factors while preserving the “full picture.”
    • Example: Late antenatal care among Aboriginal infants investigated with up to 9 risk factors; modelling allowed stepwise identification of the dominant contributors without discarding information.
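
Stratification is often followed by pooling the stratum-specific estimates into a single adjusted value. One standard way to do this is the Mantel-Haenszel pooled odds ratio, sketched below. The lecture does not name a pooling method, so treating Mantel-Haenszel as the estimator is an assumption here, and the 2x2 counts (coffee and heart disease, stratified by smoking) are hypothetical.

```python
def crude_odds_ratio(tables):
    """Odds ratio after collapsing all strata into one 2x2 table."""
    a = sum(t[0] for t in tables)
    b = sum(t[1] for t in tables)
    c = sum(t[2] for t in tables)
    d = sum(t[3] for t in tables)
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio across strata.

    Each table is (a, b, c, d):
      a = exposed with outcome,   b = exposed without outcome,
      c = unexposed with outcome, d = unexposed without outcome.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Hypothetical counts, one 2x2 table per level of the confounder.
tables = [
    (160, 640, 40, 160),  # smokers
    (10, 190, 40, 760),   # non-smokers
]

crude = crude_odds_ratio(tables)
adjusted = mantel_haenszel_or(tables)

print(f"Crude OR:           {crude:.2f}")
print(f"Mantel-Haenszel OR: {adjusted:.2f}")
```

With these counts the crude odds ratio is about 2.36 and the adjusted one is 1.0. Multivariable regression extends the same idea to several covariates at once, where stratifying on every combination would leave too few participants per stratum.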

Practical & Ethical Implications

  • Good study design prevents erroneous public-health recommendations.
  • Failure to recognise bias or confounding can:
    • Waste resources.
    • Lead to ineffective or harmful interventions.
    • Undermine trust in research.
  • Continuous cycle: Design ↔ Measurement ↔ Analysis ↔ Interpretation must address error at every step.

Key Take-Home Equations & Concepts

  • Measurement model: \text{Observed}=\text{True}\pm \varepsilon
  • Bias ≠ random error; elimination of one does not automatically remove the other.
  • Confounder criteria:
    1. Associated with the exposure.
    2. Independent risk factor for the outcome.
    3. Not an intermediate variable in the causal pathway.
  • Control options summary:
    • \text{Design stage}\rightarrow \{\text{randomisation, restriction, matching}\}
    • \text{Analysis stage}\rightarrow \{\text{stratification, multivariable models}\}

Connections to Previous & Future Lectures

  • Builds on foundational principles of epidemiologic study design (cohort, case–control, RCT).
  • Will dovetail with upcoming sessions on statistical inference, validity vs. reliability, and causal diagrams (DAGs).
  • Lays ethical groundwork for transparent reporting (CONSORT, STROBE) and systematic-review methodology.