Principles of Epidemiology: Identification of Bias

Introduction

Epidemiologic studies compare disease frequency in two or more groups based on a characteristic or exposure.
- Exposed group: Individuals with the characteristic/exposure.
- Unexposed group: Individuals without the characteristic/exposure.

Measures of Disease Frequency

Absolute Comparison
- Risk or rate differences.
Relative Comparison
- Risk ratio, rate ratio, odds ratio.
The specific measure depends on the study design, data type, and goal of comparison.
Absolute measures describe the public health impact of an exposure.
Relative measures describe the strength of the association between exposure and disease.
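
The measures listed above can be computed from a 2x2 table. The following is a minimal sketch; the function name and the cohort counts are hypothetical and chosen only for illustration.

```python
def measures_of_association(a, b, c, d):
    """Compute absolute and relative measures from a 2x2 table.

    a = exposed with disease,   b = exposed without disease
    c = unexposed with disease, d = unexposed without disease
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "risk_difference": risk_exposed - risk_unexposed,  # absolute comparison
        "risk_ratio": risk_exposed / risk_unexposed,       # relative comparison
        "odds_ratio": (a * d) / (b * c),                   # relative comparison
    }


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop disease.
result = measures_of_association(a=30, b=70, c=10, d=90)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

In this made-up cohort the risk difference is about 0.20 and the risk ratio about 3.0, while the odds ratio is somewhat larger because the disease is not rare.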

Validity Assessment

After calculating the measure of association, assess the internal validity of the observed result.
A study is valid only when these three alternative explanations have been eliminated:
- Bias: A systematic error in the study design or conduct that leads to an erroneous association between exposure and disease.
- Confounding: The mixing of effects between the exposure, the disease, and a third variable (confounder). Confounding distorts the relationship between an exposure and disease.
- Random Error: The probability that the observed result is due to chance.

Internal Validity

If systematic bias, confounding, and random error are ruled out, the association is considered true (internal validity).
Only after establishing internal validity can the investigator assess whether the exposure caused the outcome.
Internal validity must be established before generalizing study results to broader populations.

Overview of Bias

Bias is a systematic error that results in an incorrect or invalid estimate of the measure of association.
Bias can be introduced at any stage of a study (design, data collection, analysis).
When evaluating bias, investigators must:
- Identify the source.
- Estimate its magnitude or strength.
- Assess its direction.

Key Facts About Bias

Bias is an alternative explanation for an association.
Bias does not necessarily result from investigator prejudice.
Bias can pull an association:
- Toward the null (underestimate).
- Away from the null (overestimate).
The amount of bias can be small, moderate, or large.
Bias can be avoided or minimized through careful study design and conduct.

Selection Bias

Selection bias is an error due to systematic differences in characteristics between study participants and non-participants.
It arises from the procedures used to select study subjects, so that the association observed among participants differs from the one that would be observed among eligible individuals who were not included.
Occurs in a case-control study if selection of cases and controls is based on differing criteria related to exposure status.

Types of Selection Bias

Control selection bias: Selection of an inappropriate control group in case-control studies.
Self-selection bias: Refusal, nonresponse, or agreement to participate that is related to the exposure and disease.
Loss to follow-up: Loss to follow-up that is related to the exposure and disease.
Healthy worker effect: Selection of the general population as a comparison group in an occupational cohort study.
Differential surveillance, diagnosis, or referral: Differential surveillance, diagnosis, or referral of study subjects according to their exposure and disease status.

Control Selection Bias Example

Case-control study evaluating the role of Pap smears in cervical cancer prevention.
- Cases: Newly diagnosed cervical cancer patients from hospital records.
- Controls: Women from the same neighborhood as cases, selected if they were at home.
- 250 cases and 250 controls selected.
Exposure: Pap smear within 1 year of diagnosis (cases) or index date (controls).
Initial finding: 40% of cases and 40% of controls had a recent Pap smear. Odds ratio (OR) = 1 (no association).
Selection bias: Only controls at home during recruitment were included.

Control Selection Bias Explained

Women at home are less likely to be employed and less likely to have regular medical check-ups, including Pap smears.
The actual data:
- 40% of cases had a recent Pap smear.
- 60% of controls had a recent Pap smear.
True odds ratio (OR) = 0.44 (56% reduced risk of cervical cancer among women with recent Pap smears).
Control selection bias pulled the initial result toward the null, obscuring a true association.
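
The two odds ratios in this example can be checked with a few lines of arithmetic. This sketch converts the stated percentages into counts out of 250 cases and 250 controls; the function and variable names are illustrative.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio for a case-control 2x2 table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# 40% of 250 cases had a recent Pap smear: 100 exposed, 150 unexposed.
# At-home controls: 40% exposed (100 of 250).
biased_or = odds_ratio(100, 150, 100, 150)

# Controls reflecting the base population: 60% exposed (150 of 250).
true_or = odds_ratio(100, 150, 150, 100)

print(f"OR with at-home controls: {biased_or:.2f}")
print(f"OR with representative controls: {true_or:.2f}")
```

The first odds ratio is 1.00 and the second rounds to 0.44, matching the figures above.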

Avoiding Control Selection Bias

Adhere to the purpose of the control group: sample the exposure distribution in the base population that gives rise to the cases.
Use identical selection criteria for cases and controls to ensure they come from the same source population.

Self-Selection Bias

Case-control studies with low/moderate participation rates are susceptible to self-selection bias.
Arises from:
- Refusal or non-response by participants that is related to both the exposure and disease.
- Agreement to participate that is related to both the exposure and disease.
A low/moderate participation rate doesn't necessarily result in selection bias if the reasons for participation are similar for exposed and unexposed cases and controls.
If subjects in a particular exposure-disease category (e.g., exposed cases) are more or less likely to participate, the observed measure of association will be biased.
To limit self-selection bias, aim for high participation rates (80% or more) among both cases and controls.
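
The effect of participation on the odds ratio can be sketched as follows. The source-population counts and participation probabilities are hypothetical, chosen only to contrast participation that depends on disease status alone with participation that depends on exposure and disease jointly.

```python
def odds_ratio(table):
    """Odds ratio from a dict of 2x2 cell counts."""
    return (table["exposed_cases"] * table["unexposed_controls"]) / (
        table["unexposed_cases"] * table["exposed_controls"]
    )


def participants(source, participation):
    """Scale each exposure-disease cell by its participation probability."""
    return {cell: count * participation[cell] for cell, count in source.items()}


# Hypothetical source population (illustrative counts, not from any study).
source = {
    "exposed_cases": 200,
    "unexposed_cases": 300,
    "exposed_controls": 100,
    "unexposed_controls": 400,
}

# Participation differs by disease status only (cases 90%, controls 60%).
by_disease_only = {
    "exposed_cases": 0.9,
    "unexposed_cases": 0.9,
    "exposed_controls": 0.6,
    "unexposed_controls": 0.6,
}

# Exposed cases are more willing to participate than every other category.
by_exposure_and_disease = {
    "exposed_cases": 0.9,
    "unexposed_cases": 0.6,
    "exposed_controls": 0.6,
    "unexposed_controls": 0.6,
}

true_or = odds_ratio(source)
or_disease_only = odds_ratio(participants(source, by_disease_only))
or_joint = odds_ratio(participants(source, by_exposure_and_disease))

print(f"Source population OR: {true_or:.2f}")
print(f"Participation by disease only: {or_disease_only:.2f}")
print(f"Participation by exposure and disease: {or_joint:.2f}")
```

Under these assumptions the odds ratio is unchanged when participation depends only on disease status, and inflated when exposed cases are over-represented.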

Differential Surveillance, Diagnosis, or Referral

Selection bias in case-control studies due to differential surveillance related to the exposure.
Example: Case-control study on the risk of venous thromboembolism (VT) among oral contraceptive users.
- Cases: Women aged 20-44 hospitalized for VT.
- Controls: Women aged 20-44 hospitalized for an acute illness or elective surgery at the same hospitals.
- 72% of cases reported using oral contraceptives, compared to only 20% of controls.
- Investigators calculated a 10.2-fold increased risk of thromboembolism among current oral contraceptive users.
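
As a rough check, an odds ratio can be computed directly from the exposure proportions in cases and controls. The helper below is illustrative. It uses the rounded percentages quoted above, so it gives roughly 10.3 rather than exactly the reported 10.2; the small gap presumably reflects rounding in those percentages.

```python
def odds_ratio_from_proportions(p_cases, p_controls):
    """Odds ratio from the proportion exposed among cases and among controls."""
    odds_cases = p_cases / (1 - p_cases)
    odds_controls = p_controls / (1 - p_controls)
    return odds_cases / odds_controls


# 72% of cases and 20% of controls reported oral contraceptive use.
vt_or = odds_ratio_from_proportions(0.72, 0.20)
print(f"Approximate OR from rounded percentages: {vt_or:.1f}")
```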

Differential Surveillance, Diagnosis, or Referral Explained

The high relative risk might be due to bias in hospital admission criteria.
Prior reports linked oral contraceptives and VT.
Healthcare providers were more likely to hospitalize women with thromboembolism symptoms who were taking oral contraceptives.
This tendency to hospitalize based on exposure status (selection bias) led to a stronger observed relationship than truly existed.

Selection Bias in Cohort Studies: Loss to Follow-Up

Loss to follow-up occurs when subjects can no longer be located or do not want to participate.
Reduces the study's power to detect true associations due to smaller sample size.
Can bias the study results if related to both exposure and outcome.
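Two types:
- Non-differential loss to follow-up: Losses occur to a similar extent in every exposure and outcome category.
- Differential loss to follow-up: Losses are more (or less) common in a particular exposure-outcome category.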
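
The contrast between losses that are unrelated to exposure and outcome and losses that are related to both can be sketched with a hypothetical cohort. All counts and loss fractions below are invented for illustration.

```python
def risk_ratio(cohort):
    """Risk ratio (exposed vs. unexposed) from a dict of cell counts."""
    risk_exposed = cohort["exposed_diseased"] / (
        cohort["exposed_diseased"] + cohort["exposed_healthy"]
    )
    risk_unexposed = cohort["unexposed_diseased"] / (
        cohort["unexposed_diseased"] + cohort["unexposed_healthy"]
    )
    return risk_exposed / risk_unexposed


def apply_loss(cohort, loss):
    """Remove the given fraction of subjects from each exposure-outcome cell."""
    return {cell: count * (1 - loss[cell]) for cell, count in cohort.items()}


# Hypothetical complete cohort: risk 10% in exposed, 5% in unexposed.
cohort = {
    "exposed_diseased": 100,
    "exposed_healthy": 900,
    "unexposed_diseased": 50,
    "unexposed_healthy": 950,
}

# The same 20% loss in every cell.
equal_loss = {cell: 0.2 for cell in cohort}

# Exposed subjects who develop disease are lost more often (40% vs. 20%).
unequal_loss = dict(equal_loss, exposed_diseased=0.4)

full_rr = risk_ratio(cohort)
equal_rr = risk_ratio(apply_loss(cohort, equal_loss))
unequal_rr = risk_ratio(apply_loss(cohort, unequal_loss))

print(f"Complete follow-up RR: {full_rr:.2f}")
print(f"Equal loss in all cells RR: {equal_rr:.2f}")
print(f"Extra loss among exposed diseased RR: {unequal_rr:.2f}")
```

With these numbers, equal losses shrink the sample but leave the risk ratio at 2.0, whereas the extra loss among exposed subjects with disease pulls it toward the null.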

Observation Bias

Observation or information bias is a flaw in measuring exposure or outcome data, resulting in different data quality between comparison groups.
Arises from a systematic difference in how exposure or outcome is measured between compared groups.
Can occur in case-control studies if different techniques are used for interviewing cases and controls.
Can occur in cohort studies if different procedures are used to obtain information on exposed and unexposed.

Key Features of Observation Bias

Occurs after subjects have entered the study.
Pertains to how data are collected.
Often results in incorrect classification of participants as either exposed/unexposed or diseased/non-diseased.
Like selection bias, it can create bias toward or away from the null.
The direction depends on whether the error in measuring exposure depends on disease status, or the error in measuring disease depends on exposure status.

Recall Bias

Recall bias occurs when there is a differential level of accuracy in the information provided by compared groups.
Occurs in case-control studies if cases are more (or less) likely than controls to recall and report prior exposures.
Occurs in cohort studies if exposed subjects are more (or less) likely than unexposed subjects to remember and report subsequent diseases.
Differences in reporting are due to subjects' failure to report information rather than fabrication.

Differential Recall

Differential recall can bias the true measure of association toward or away from the null.
Direction depends on which subjects (cases versus controls or exposed versus unexposed subjects) have less accurate recall.
Typically described in the context of a traditional case-control study with non-diseased controls.
Example: Hypothetical case-control study of birth defects (cases: malformed infants, controls: healthy infants).
Exposure data is collected at postpartum interviews.

Recall Bias Example

Case mothers have been reviewing every illness, medication, and alcoholic beverage consumed during pregnancy in search of a possible reason for their child's defect.
Control mothers of healthy infants have spent less time reviewing their prenatal activities.
Case mothers therefore report their exposures more completely than control mothers, and the observed odds ratio is biased upwards.
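
The upward bias can be illustrated with a small sketch. It assumes, hypothetically, that exposure is equally common among case and control mothers (true OR = 1) and that control mothers report only 70% of their true exposures; all numbers are invented.

```python
def reported_odds_ratio(n_cases, n_controls, exposed_cases, exposed_controls,
                        recall_cases, recall_controls):
    """Odds ratio based on reported exposure when only a fraction is recalled."""
    reported_cases = exposed_cases * recall_cases
    reported_controls = exposed_controls * recall_controls
    return (reported_cases * (n_controls - reported_controls)) / (
        (n_cases - reported_cases) * reported_controls
    )


# Hypothetical study: 500 case mothers and 500 control mothers,
# 100 truly exposed in each group, so the true OR is 1.
complete_recall_or = reported_odds_ratio(500, 500, 100, 100, 1.0, 1.0)

# Case mothers recall every exposure; control mothers recall 70% of theirs.
differential_recall_or = reported_odds_ratio(500, 500, 100, 100, 1.0, 0.7)

print(f"OR with complete recall in both groups: {complete_recall_or:.2f}")
print(f"OR with under-reporting by controls: {differential_recall_or:.2f}")
```

Under these assumptions the reported odds ratio rises from 1.00 to about 1.54 even though the exposure has no effect.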

Interviewer Bias

Systematic difference in soliciting, recording, or interpreting information in studies using in-person or telephone interviews.
Occurs in case-control studies when interviewers are aware of a subject's disease status and question cases and controls differently about their exposures.
Occurs in cohort and experimental studies when interviewers are aware of a subject's exposure/treatment status and query exposed/unexposed subjects differently about diseases.

Avoiding Interviewer Bias

Mask interviewers to the subject's disease (case-control) or exposure status (cohort/experimental).
Design standardized questionnaires with closed-ended, easy-to-understand questions and appropriate response options.
Understandable questions asked in exactly the same manner reduce interviewer bias.

Misclassification

Misclassification is a measurement error and the most common form of bias in epidemiologic research.
An error in exposure or disease classification.
- Exposed individual classified as unexposed (or vice versa).
- Diseased individual classified as non-diseased (or vice versa).
In case-control and retrospective cohort studies, relevant exposures may have occurred years before data collection, making accurate recall difficult.
Exposure misclassification also occurs when broad exposure definitions are used.
Disease misclassification can likewise occur as a result of a broad disease definition.
Two types:
- Differential.
- Non-differential.
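
The two types can be contrasted with a short sketch that misclassifies exposure in a case-control table using a sensitivity and specificity for exposure measurement. The counts, sensitivities, and specificities are hypothetical, and the direction of the differential result depends entirely on the values chosen.

```python
def misclassify(exposed, unexposed, sensitivity, specificity):
    """Return (observed exposed, observed unexposed) after measurement error."""
    observed_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    observed_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return observed_exposed, observed_unexposed


def odds_ratio(cases, controls):
    """Odds ratio from (exposed, unexposed) pairs for cases and controls."""
    return (cases[0] * controls[1]) / (cases[1] * controls[0])


# Hypothetical true classification.
true_cases = (200, 300)     # exposed, unexposed
true_controls = (100, 400)  # exposed, unexposed
true_or = odds_ratio(true_cases, true_controls)

# Non-differential: the same error rates apply to cases and controls.
nondiff_or = odds_ratio(
    misclassify(*true_cases, sensitivity=0.8, specificity=0.9),
    misclassify(*true_controls, sensitivity=0.8, specificity=0.9),
)

# Differential: exposure is detected better among cases than among controls.
diff_or = odds_ratio(
    misclassify(*true_cases, sensitivity=0.95, specificity=0.9),
    misclassify(*true_controls, sensitivity=0.7, specificity=0.9),
)

print(f"True OR: {true_or:.2f}")
print(f"Non-differential misclassification OR: {nondiff_or:.2f}")
print(f"Differential misclassification OR: {diff_or:.2f}")
```

In this example the non-differential error moves the odds ratio toward the null, while the differential error moves it away from the null; with other error rates, differential misclassification could bias the result in either direction.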

Summary

Epidemiologic studies compare disease frequency between exposed and unexposed groups. Measures of disease frequency include absolute comparisons (risk or rate differences) and relative comparisons (risk ratios, rate ratios, odds ratios), depending on study design and goals. Internal validity is assessed by ruling out bias, confounding, and random error, so that the observed association can be considered true.

Bias, a systematic error, affects study results and can occur at any stage. It can lead to incorrect estimates of association strength and originates from various sources, including participant selection and measurement differences. Types of bias include selection bias (e.g., control selection bias, self-selection bias) and observation bias (e.g., recall bias, interviewer bias).

To minimize bias, carefully design studies, ensure high participation rates, and standardize data collection methods. Misclassification errors also pose a challenge, involving incorrect categorization of exposure or disease status, which can be differential or non-differential. Establishing internal validity is essential before generalizing findings to broader populations.