Case-Control Study 1 – Comprehensive Notes

Learning Objectives

Define and explain distinguishing features of a case-control study.
Identify epidemiologic questions suited to case-control design.
State the purpose of controls and principles of valid control selection.
List sources of controls with their strengths and weaknesses.
Recognize biases of particular concern (selection bias, recall bias) and give examples.
Describe strengths and limitations of case-control studies.

Fundamental Design & Logic

Subjects selected on the basis of outcome status.
- Cases = individuals with disease/outcome of interest.
- Controls = individuals without disease/outcome.
Study looks backward from effect (disease) to presumed cause (past exposure).
Core analytic step: compare prior exposure histories of cases vs controls.
If the proportion exposed is the same in both groups → no association.
If exposure is more common in cases → positive association (possible risk factor).
If exposure is less common in cases → negative association (possible protective factor).
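The comparison above can be sketched with a 2x2 table. This is a minimal illustration only: the function name `exposure_comparison` and all counts are hypothetical, not taken from any study in these notes.

```python
def exposure_comparison(a, b, c, d):
    """Compare exposure between cases and controls in a 2x2 table.

    a = exposed cases,   b = exposed controls
    c = unexposed cases, d = unexposed controls
    """
    prop_cases = a / (a + c)        # proportion of cases exposed
    prop_controls = b / (b + d)     # proportion of controls exposed
    odds_ratio = (a * d) / (b * c)  # exposure odds ratio (cross-product ratio)
    return prop_cases, prop_controls, odds_ratio


# Hypothetical counts, for illustration only.
p_cases, p_controls, or_estimate = exposure_comparison(a=60, b=30, c=40, d=70)
print(f"Exposed among cases:    {p_cases:.2f}")
print(f"Exposed among controls: {p_controls:.2f}")
print(f"Odds ratio:             {or_estimate:.2f}")
```

With these made-up counts, 60% of cases and 30% of controls are exposed, giving an odds ratio of 3.5. Equal proportions in both groups would give an odds ratio of 1 (no association).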

Key Terminology

Source population: population that gave rise to the cases; controls must represent its exposure distribution.
Incident case: new, first-ever diagnosis within a defined time window.
Prevalent case: existing diagnosis, regardless of onset time.

Selection of Cases

Begin with a clear, operational case definition to avoid false cases.
Ideal: enroll all incident cases in a defined population during a specified period.
Not mandatory to capture every case, but every true case should have an equal probability of selection so that enrolled cases represent the full disease spectrum.
Eligibility criteria must be explicit; consider external validity.

Incident vs Prevalent Cases

Incident cases
- Better represent entire spectrum.
- Fresher memory → better exposure recall.
- May require long accrual time if disease is rare.
Prevalent cases
- Faster accrual.
- Danger of survival bias (over-representation of long-term, less-severe survivors).
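A rough numerical sketch of why prevalent cases can misrepresent exposure. It assumes a steady state in which the number of prevalent cases is proportional to incidence × mean disease duration; the function name and all numbers are hypothetical.

```python
def exposed_share_among_prevalent(incident_share_exposed,
                                  mean_duration_exposed,
                                  mean_duration_unexposed):
    """Share of prevalent cases who are exposed, assuming a steady state
    where prevalent cases are proportional to incidence x mean duration."""
    exposed_pool = incident_share_exposed * mean_duration_exposed
    unexposed_pool = (1 - incident_share_exposed) * mean_duration_unexposed
    return exposed_pool / (exposed_pool + unexposed_pool)


# Hypothetical scenario: half of all new cases are exposed, but exposed
# cases survive 1 year on average versus 4 years for unexposed cases.
share = exposed_share_among_prevalent(0.5, 1.0, 4.0)
print(f"Exposed among incident cases:  {0.5:.0%}")
print(f"Exposed among prevalent cases: {share:.0%}")
```

In this scenario only 20% of prevalent cases are exposed even though 50% of incident cases are, so a study of prevalent cases would understate the exposure among cases.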

Possible Sources of Cases

Hospital or clinic registries.
All cases in a geographic area.
Population-based registries (e.g., Florida Cancer Data System).
Pre-defined cohorts (industrial workforce, insurance plan, profession).
Surveillance systems capturing new incident cases.

Published Examples (Cases)

Women with first primary breast cancer, n = 4,083 (1966–1989).
Swedish Cancer Registry: 1,500 lung/bronchus cancers (age 35–74, 1980–84).
All adult in-patient trauma deaths in California, n = 3,074.

Selection of Controls – Principles

Major determinant of validity; controls provide the exposure distribution of the source population.
Should be representative of the source population that produced the cases.
- Best practice: random sample from that population.
Ideally similar to cases in every way except disease status.
- If cases are community-based → community controls.
- If cases are hospital-based → hospital controls.
Ascertain exposure information with equal accuracy in both groups.
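May use matching to control confounders; multiple controls per case raise power, and more than one control group (e.g., hospital + population) helps check whether results depend on the control source.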

Information Quality

Cases often recall exposures better (illness salience) → recall bias.
Death of cases can force reliance on proxies (relatives) → information asymmetry between cases and controls.

Sources & Types of Controls

1. Population-Based Controls

Drawn directly from the community (random-digit dialing, voter lists, driver licenses).
Advantages
- High likelihood of coming from the same source population.
Disadvantages
- Costly, time-consuming, low participation; may have poorer recall.

2. Hospital Controls

Patients in the same hospital with diseases unrelated to exposure under study.
Advantages
- Easy to identify; cheaper; similar recall accuracy (all ill); high participation.
- Similar referral patterns mitigate selection factors.
Disadvantages
- Illness profile may distort exposure prevalence; hospital catchment areas vary by disease.

3. Special Group Controls (friends, spouses, siblings)

Advantages
- Healthy; cooperative; control for socioeconomic or genetic confounders.
Disadvantages
- If they share the same exposures, association may be underestimated.

Comparative Trade-Offs

Need balance among: resemblance to cases, likelihood of participation, unbiased exposure representation.

Matching & Multiple Controls

Matching: select controls with same distribution of confounders (e.g., age, sex).
- Controls confounding by the matched variables, but the association between a matched variable and disease can no longer be assessed.
Multiple controls per case: enhance statistical power; can mix source types (e.g., population + hospital).
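How much power extra controls buy can be sketched with a commonly cited approximation: with r controls per case, statistical efficiency relative to an unlimited number of controls is about r / (r + 1). This approximation is not from these notes; it is usually stated for associations near the null, and the function name below is hypothetical.

```python
def relative_efficiency(r):
    """Approximate efficiency of r controls per case, relative to having
    an unlimited number of controls (rule of thumb: r / (r + 1))."""
    if r <= 0:
        raise ValueError("r must be positive")
    return r / (r + 1)


for r in (1, 2, 3, 4, 5, 10):
    print(f"{r:>2} controls per case: {relative_efficiency(r):.0%}")
```

Under this approximation, going from 1 to 4 controls per case raises efficiency from 50% to 80%, while going from 4 to 5 adds only about 3 percentage points, which is why ratios above roughly 4:1 are seldom worth the cost.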

Biases of Particular Concern

Selection Bias
- Arises when controls are not representative of source population or when case inclusion depends on exposure.
Recall Bias
- Differential memory of past exposures between cases and controls.
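A small worked sketch of recall bias. It assumes exposure is truly equally common in cases and controls (true odds ratio = 1), that cases report 90% of true exposures while controls report only 60%, and that nobody falsely reports an exposure. All counts, recall fractions, and function names are hypothetical.

```python
def reported_counts(exposed, unexposed, recall):
    """Expected reported counts when only a fraction `recall` of truly
    exposed subjects report their exposure (no false-positive reports)."""
    reported_exposed = exposed * recall
    reported_unexposed = unexposed + exposed * (1 - recall)
    return reported_exposed, reported_unexposed


def odds_ratio(a, b, c, d):
    """a, c = exposed/unexposed cases; b, d = exposed/unexposed controls."""
    return (a * d) / (b * c)


# True exposure is identical in both groups: 80 exposed, 120 unexposed.
true_or = odds_ratio(a=80, b=80, c=120, d=120)

# Cases recall better than controls (hypothetical recall fractions).
a_obs, c_obs = reported_counts(80, 120, recall=0.9)  # cases
b_obs, d_obs = reported_counts(80, 120, recall=0.6)  # controls
observed_or = odds_ratio(a_obs, b_obs, c_obs, d_obs)

print(f"True odds ratio:     {true_or:.2f}")
print(f"Observed odds ratio: {observed_or:.2f}")
```

Here the observed odds ratio is about 1.78 although the true value is 1: differential recall alone creates a spurious positive association.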

Illustrative Questions & Answers

Study: Smoking → Myocardial Infarction (MI).
- Are hospital patients with respiratory diseases good controls? → No (their admission is related to the exposure, smoking, so smoking prevalence among controls is inflated and the association is biased toward the null).
- Good hospital control = patient group admitted for conditions unrelated to smoking (e.g., broken leg).
General rule: Hospital controls should be admitted for reasons unrelated to the risk factor.
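The smoking → MI example can be put into numbers. This sketch assumes hypothetical smoking prevalences: 50% among MI cases, 30% in the source population (and in well-chosen controls, e.g., broken-leg admissions), and 45% among respiratory-ward patients. The function names are illustrative.

```python
def odds(p):
    """Convert a proportion into odds."""
    return p / (1 - p)


def odds_ratio_from_prevalence(p_cases, p_controls):
    """Exposure odds ratio from exposure prevalence in cases and controls."""
    return odds(p_cases) / odds(p_controls)


# Hypothetical smoking prevalences, for illustration only.
p_mi_cases = 0.50     # MI cases
p_unrelated = 0.30    # controls admitted for conditions unrelated to smoking
p_respiratory = 0.45  # controls admitted with respiratory disease

or_good = odds_ratio_from_prevalence(p_mi_cases, p_unrelated)
or_biased = odds_ratio_from_prevalence(p_mi_cases, p_respiratory)

print(f"OR with unrelated-condition controls: {or_good:.2f}")
print(f"OR with respiratory controls:         {or_biased:.2f}")
```

With these assumed values the odds ratio falls from about 2.33 to about 1.22 when respiratory patients serve as controls, because their inflated smoking prevalence makes cases and controls look more alike than they are.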

Published Examples (Controls)

Random-digit-dialing controls frequency-matched on age.
Hospitalized patients 40–69 with non-malignant, drug-unrelated conditions.
White women in surgical/orthopedic services with no hip fracture history (hip fracture study).
Four controls per primary pulmonary hypertension patient chosen from same GP’s patient list.

Strengths of Case-Control Studies

Efficient for rare diseases or diseases with long latency.
Can evaluate multiple exposures for a single outcome.
Faster and cheaper than cohort studies.

Limitations

Cannot directly estimate incidence or risk (the ratio of cases to controls is set by the investigator); rely on odds ratio as effect measure.
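Why risk cannot be read off a case-control table, while the odds ratio can, is sketched below. The example assumes a hypothetical source population (10,000 exposed with 200 cases; 10,000 unexposed with 100 cases), enrolls all cases, and samples a fraction of non-cases as controls without regard to exposure. The function name and numbers are illustrative.

```python
def case_control_sample(exposed_cases, unexposed_cases,
                        exposed_noncases, unexposed_noncases,
                        control_fraction):
    """Enroll all cases and sample a fraction of non-cases as controls,
    independently of exposure. Returns the odds ratio and the 'risk'
    among the exposed that a naive reading of the table would give."""
    a, c = exposed_cases, unexposed_cases
    b = exposed_noncases * control_fraction    # expected exposed controls
    d = unexposed_noncases * control_fraction  # expected unexposed controls
    odds_ratio = (a * d) / (b * c)
    apparent_risk_exposed = a / (a + b)        # not a real risk estimate
    return odds_ratio, apparent_risk_exposed


# Hypothetical source population: true risks are 2% (exposed) and 1% (unexposed).
population = dict(exposed_cases=200, unexposed_cases=100,
                  exposed_noncases=9800, unexposed_noncases=9900)

for fraction in (0.01, 0.10, 1.00):
    or_est, naive_risk = case_control_sample(**population,
                                             control_fraction=fraction)
    print(f"control fraction {fraction:>4.0%}: "
          f"OR = {or_est:.2f}, apparent risk in exposed = {naive_risk:.3f}")
```

The odds ratio stays at about 2.02 for every sampling fraction, while the apparent risk among the exposed changes with how many controls are enrolled. Because the disease is rare in this example, the odds ratio is also close to the true risk ratio of 2.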
Vulnerable to selection and recall bias.
Temporal relationship sometimes hard to establish (did exposure precede disease?).
Inefficient for rare exposures (few cases or controls will have been exposed).

Practical & Ethical Notes

Clear eligibility & confidentiality protocols needed when accessing registries/hospital data.
Matching must respect ethical constraints (e.g., not withholding treatment).

Reminders & Further Study

Review Gordis, Chapter 7 (pp. 157–171) & Chapter 12 (pp. 245–253).
Complete Exercise 6 and RP3 as scheduled.
Next lecture: Case-Control Study 2.