Case-Control Study 1 – Comprehensive Notes

Learning Objectives

  • Define and explain distinguishing features of a case-control study.
  • Identify epidemiologic questions suited to case-control design.
  • State the purpose of controls and principles of valid control selection.
  • List sources of controls with their strengths and weaknesses.
  • Recognize biases of particular concern (selection bias, recall bias) and give examples.
  • Describe strengths and limitations of case-control studies.

Fundamental Design & Logic

  • Subjects selected on the basis of outcome status.
    • Cases = individuals with disease/outcome of interest.
    • Controls = individuals without disease/outcome.
  • Study looks backward from effect (disease) to presumed cause (past exposure).
  • Core analytic step: compare prior exposure histories of cases vs controls.
  • If the proportion exposed is the same in both groups → no association.
  • If exposure is more common in cases → positive association / risk factor.
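The comparison of exposure histories is usually summarized in a 2x2 table and an exposure odds ratio. The sketch below is illustrative only: the function names and the counts are hypothetical, and the confidence interval uses Woolf's log-based approximation, which is one common choice rather than the only one.

```python
import math

def odds_ratio(a, b, c, d):
    """Exposure odds ratio from a 2x2 table.

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this simple estimate")
    return (a * d) / (b * c)

def woolf_ci(a, b, c, d, z=1.96):
    """Approximate 95% confidence interval on the log-odds scale (Woolf's method)."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

# Hypothetical counts, not taken from any study in these notes.
# 60 of 100 cases exposed, 30 of 100 controls exposed.
print(odds_ratio(60, 30, 40, 70))   # 3.5
print(woolf_ci(60, 30, 40, 70))

# Same proportion exposed in both groups gives an odds ratio of 1 (no association).
print(odds_ratio(50, 50, 50, 50))   # 1.0
```

An odds ratio above 1 corresponds to exposure being more common among cases; a value of 1 corresponds to equal exposure proportions.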

Key Terminology

  • Source population: population that gave rise to the cases; controls must represent its exposure distribution.
  • Incident case: new, first-ever diagnosis within a defined time window.
  • Prevalent case: existing diagnosis, regardless of onset time.

Selection of Cases

  • Begin with a clear, operational case definition to avoid false cases.
  • Ideal: enroll all incident cases in a defined population during a specified period.
  • Not mandatory to capture every case, but every true case should have an equal probability of selection so that enrolled cases represent the full disease spectrum.
  • Eligibility criteria must be explicit; consider external validity.

Incident vs Prevalent Cases

  • Incident cases
    • Better represent entire spectrum.
    • Fresher memory → better exposure recall.
    • May require long accrual time if disease is rare.
  • Prevalent cases
    • Faster accrual.
    • Danger of survival bias (over-representation of long-term, less-severe survivors).

Possible Sources of Cases

  • Hospital or clinic registries.
  • All cases in a geographic area.
  • Population-based registries (e.g., Florida Cancer Data System).
  • Pre-defined cohorts (industrial workforce, insurance plan, profession).
  • Surveillance systems capturing new incident cases.

Published Examples (Cases)

  • Women with first primary breast cancer, n = 4,083 (1966–1989).
  • Swedish Cancer Registry: 1,500 lung/bronchus cancers (age 35–74, 1980–84).
  • All adult in-patient trauma deaths in California, n = 3,074.

Selection of Controls – Principles

  1. Major determinant of validity; controls provide the exposure distribution of the source population.
  2. Should be representative of the source population that produced the cases.
    • Best practice: random sample from that population.
  3. Ideally similar to cases in every way except disease status.
    • If cases are community-based → community controls.
    • If cases are hospital-based → hospital controls.
  4. Ascertain exposure information with equal accuracy in both groups.
  5. May use matching to control confounders; can also enroll multiple controls per case to raise power.
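Principle 2 (a random sample from the source population) can be sketched in code, assuming a complete roster of the source population is available, which is often not true in practice. The function name, the ID scheme, and the roster are hypothetical.

```python
import random

def sample_controls(source_population, case_ids, n_controls, seed=0):
    """Simple random sample of controls from disease-free members of the roster.

    source_population: iterable of person IDs that gave rise to the cases.
    case_ids: IDs already enrolled as cases (excluded from the sampling frame).
    seed: fixed only so that the illustration is reproducible.
    """
    case_ids = set(case_ids)
    eligible = [pid for pid in source_population if pid not in case_ids]
    if n_controls > len(eligible):
        raise ValueError("not enough eligible people in the sampling frame")
    rng = random.Random(seed)
    return rng.sample(eligible, n_controls)  # sampling without replacement

# Hypothetical roster of 100 people, 3 of whom are cases; draw 2 controls per case.
roster = range(1, 101)
cases = [3, 17, 42]
controls = sample_controls(roster, cases, n_controls=2 * len(cases))
print(controls)
```

Because every eligible person has the same chance of selection, the exposure distribution among controls estimates that of the source population, apart from sampling error and non-participation.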

Information Quality

  • Cases often recall exposures better (illness salience) → recall bias.
  • Death of cases can force reliance on proxies (relatives) → info asymmetry.

Sources & Types of Controls

1. Population-Based Controls

  • Drawn directly from the community (random-digit dialing, voter lists, driver licenses).
  • Advantages
    • High likelihood of coming from the same source population.
  • Disadvantages
    • Costly, time-consuming, low participation; may have poorer recall.

2. Hospital Controls

  • Patients in the same hospital with diseases unrelated to exposure under study.
  • Advantages
    • Easy to identify; cheaper; similar recall accuracy (all ill); high participation.
    • Similar referral patterns mitigate selection factors.
  • Disadvantages
    • Illness profile may distort exposure prevalence; hospital catchment areas vary by disease.

3. Special Group Controls (friends, spouses, siblings)

  • Advantages
    • Healthy; cooperative; control for socioeconomic or genetic confounders.
  • Disadvantages
    • If they share the same exposures, association may be underestimated.

Comparative Trade-Offs

  • Need balance among: resemblance to cases, likelihood of participation, unbiased exposure representation.

Matching & Multiple Controls

  • Matching: select controls with same distribution of confounders (e.g., age, sex).
    • Controls confounding by the matched variable (provided the analysis accounts for the matching) but precludes assessing the matched variable’s effect.
  • Multiple controls: enhance statistical power; can mix source types (e.g., population + hospital).
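How much power extra controls buy can be gauged with a commonly cited approximation: with r controls per case, statistical efficiency relative to an unlimited number of controls is roughly r / (r + 1). This is a rule of thumb that assumes a weak association and equal cost per subject; it is not stated in these notes, and the function name is hypothetical.

```python
def relative_efficiency(r):
    """Approximate efficiency of r controls per case versus unlimited controls."""
    if r <= 0:
        raise ValueError("r must be positive")
    return r / (r + 1)

for r in (1, 2, 3, 4, 5, 10):
    print(r, round(relative_efficiency(r), 2))
```

Under this approximation the gain flattens quickly, which is why ratios beyond about four controls per case are rarely worth the added cost.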

Biases of Particular Concern

  • Selection Bias
    • Arises when controls are not representative of source population or when case inclusion depends on exposure.
  • Recall Bias
    • Differential memory of past exposures between cases and controls.
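A small numeric illustration of recall bias, under stated assumptions: true exposure prevalence is identical in cases and controls (so the true odds ratio is 1), nobody falsely reports exposure, and truly exposed cases report their exposure more completely than truly exposed controls. All numbers and names are hypothetical.

```python
def observed_or(n_cases, n_controls, prev_cases, prev_controls,
                sens_cases, sens_controls):
    """Expected observed odds ratio when truly exposed subjects report
    exposure with a group-specific sensitivity (perfect specificity assumed)."""
    a = n_cases * prev_cases * sens_cases            # cases reporting exposure
    c = n_cases - a                                  # cases reporting none
    b = n_controls * prev_controls * sens_controls   # controls reporting exposure
    d = n_controls - b                               # controls reporting none
    return (a * d) / (b * c)

# Equal recall in both groups: the observed odds ratio stays at the true value of 1.
print(observed_or(100, 100, 0.3, 0.3, 0.9, 0.9))

# Cases recall 90% of true exposures, controls only 60%: a spurious association appears.
print(observed_or(100, 100, 0.3, 0.3, 0.9, 0.6))
```

The second result is well above 1 even though exposure has no effect in this setup, which is the signature of differential recall.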

Illustrative Questions & Answers

  • Study: Smoking → Myocardial Infarction (MI).
    • Are hospital patients with respiratory diseases good controls? → No (their admission is related to the exposure, smoking, so smoking prevalence among them is inflated).
    • Good hospital control = patient group admitted for conditions unrelated to smoking (e.g., broken leg).
  • General rule: Hospital controls should be admitted for reasons unrelated to the risk factor.
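The effect of a poor control choice in the smoking and MI example can be shown with made-up smoking prevalences: 50% among MI cases, 30% in the source population, and 60% among patients admitted for respiratory disease. These figures are assumptions for illustration, not data.

```python
def odds(p):
    """Convert a proportion into odds."""
    return p / (1 - p)

def exposure_or(p_cases, p_controls):
    """Odds ratio from the proportion exposed among cases and among controls."""
    return odds(p_cases) / odds(p_controls)

p_cases = 0.50          # hypothetical smoking prevalence among MI cases
p_source = 0.30         # hypothetical prevalence in the source population
p_respiratory = 0.60    # hypothetical prevalence among respiratory patients

# Controls that reflect the source population (e.g., broken-leg admissions).
print(round(exposure_or(p_cases, p_source), 2))

# Respiratory-disease controls: inflated smoking prevalence distorts the estimate.
print(round(exposure_or(p_cases, p_respiratory), 2))
```

With these assumed numbers the respiratory controls not only weaken the association but push the odds ratio below 1, making smoking look protective.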

Published Examples (Controls)

  • Random-digit-dialing controls frequency-matched on age.
  • Hospitalized patients 40–69 with non-malignant, drug-unrelated conditions.
  • White women in surgical/orthopedic services with no hip fracture history (hip fracture study).
  • Four controls per primary pulmonary hypertension patient chosen from same GP’s patient list.

Strengths of Case-Control Studies

  • Efficient for rare diseases or diseases with long latency.
  • Can evaluate multiple exposures for a single outcome.
  • Faster and cheaper than cohort studies.

Limitations

  • Cannot directly estimate incidence or risk; rely on odds ratio as effect measure.
  • Vulnerable to selection and recall bias.
  • Temporal relationship sometimes hard to establish (did exposure precede disease?).
  • Not ideal for rare exposures.

Practical & Ethical Notes

  • Clear eligibility & confidentiality protocols needed when accessing registries/hospital data.
  • Matching must respect ethical constraints (e.g., not withholding treatment).

Reminders & Further Study

  • Review Gordis, Chapter 7 (pp. 157–171) & Chapter 12 (pp. 245–253).
  • Complete Exercise 6 and RP3 as scheduled.
  • Next lecture: Case-Control Study 2.