Error and Bias Notes

A study might suggest an exposure is associated with a disease, but it's crucial to determine if the association is real or due to error.

Random Error: Error in measurement typically caused by factors that vary from one measurement to another, due to chance or 'noise'.
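Systematic Error (Bias): Error in the design, implementation, or analysis of a study that shifts results in a consistent direction. Unlike random error, it does not shrink as the sample size grows.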

Example of random error: Estimating the mean weight of the UCSD freshman class from only 6 subjects may be inaccurate because of random variation.
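The effect of sample size on random error can be illustrated with a small simulation. This is a minimal sketch: the population (5000 students, mean 65 kg, SD 12 kg) is invented for illustration, and `sample_mean_spread` is a hypothetical helper, not part of the notes.

```python
import random
import statistics

def sample_mean_spread(population, n, repeats=2000, seed=0):
    """Standard deviation of the sample mean over repeated random samples of size n."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

# Hypothetical population of weights in kg (assumed values, not real data).
rng = random.Random(42)
population = [rng.gauss(65, 12) for _ in range(5000)]

spread_n6 = sample_mean_spread(population, 6)
spread_n100 = sample_mean_spread(population, 100)

print(f"Spread of the sample mean with n=6:   {spread_n6:.2f} kg")
print(f"Spread of the sample mean with n=100: {spread_n100:.2f} kg")
```

With only 6 subjects the estimated mean varies far more from sample to sample than with 100 subjects, even though neither estimate is biased.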

Sources of random error:
Measurement Error: Variability or inaccuracy in measuring variables or outcomes. For example, using different or inconsistently calibrated devices to measure blood pressure.
Sampling Error: The chance difference between a sample estimate and the true population value. It arises from natural variability within the population and shrinks as the sample size increases.
Data Entry Error: Mistakes made during data entry, such as mistyping values, which introduce variability.
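Observer Variability: Inconsistent, unsystematic differences in how observers interpret or record information, leading to random error. (Systematic differences between observers are observer bias, a form of information bias.)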
Timing Variability: Natural fluctuations in disease occurrence or biological measurements over time.
Biological Variation: Inherent differences within individuals or populations that can introduce random error when measuring biological markers.

Types of systematic error:
Selection Bias: Systematic difference between the characteristics of selected study participants and the target population.
- Example: A smoking prevalence study recruits participants from a community with a higher smoking rate, overestimating the prevalence in the general population.
Information Bias: Systematic difference in how data is collected, recorded, or reported, leading to misclassification of exposure or outcome variables.
- Example: Individuals with a certain disease are more likely to recall exposure to a specific risk factor, introducing bias.
Confounding: The relationship between exposure and outcome is distorted by a third variable associated with both.
- Example: In a study of coffee consumption and heart disease, age could be a confounder.
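Confounding by age can be illustrated by comparing a crude odds ratio with age-stratified ones. This is a sketch with invented counts, chosen so that older people both drink more coffee and have more heart disease. The Mantel-Haenszel summary is one standard way to adjust, used here as an illustration rather than as the method the notes prescribe.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a = exposed with disease, b = exposed without,
    c = unexposed with disease, d = unexposed without."""
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary odds ratio across strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# Hypothetical counts (coffee = exposure, heart disease = outcome), by age stratum.
strata = {
    "older":   (32, 48, 8, 12),   # 80% drink coffee, 40% have disease
    "younger": (2, 18, 8, 72),    # 20% drink coffee, 10% have disease
}

crude_table = tuple(sum(col) for col in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)
stratum_ors = {name: odds_ratio(*t) for name, t in strata.items()}
adjusted_or = mantel_haenszel_or(strata.values())

print(f"Crude OR: {crude_or:.2f}")
print(f"Stratum ORs: {stratum_ors}")
print(f"Mantel-Haenszel OR: {adjusted_or:.2f}")
```

In these made-up data the crude odds ratio is well above 1, yet within each age group coffee has no association with heart disease. The apparent effect comes entirely from age.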

Selection Bias Question: Is this study biased because of who is participating?
Information Bias Question: Is this study biased because of the validity of the information collected?

Self-Selection (Volunteer) Bias: Occurs when individuals self-select to participate, resulting in a non-representative sample.

Selection Bias in Case-Control Studies: Arises when there is a systematic difference in how cases and controls are selected. This leads to a biased representation of the target population and distorts the estimated association between exposure and outcome.

Healthy Volunteer Bias: Arises when healthier or more conscientious participants are more likely to enroll, which can lead to overestimating intervention effectiveness.

Loss to Follow-up (Attrition) Bias: Occurs when participants drop out and their characteristics differ from those who remain, which distorts the results.
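A short calculation shows how differential dropout can distort a risk ratio. The cohort sizes, event counts, and number lost are invented for illustration, and the sketch assumes dropouts are simply excluded from the analysis.

```python
def risk_ratio(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk in the exposed divided by risk in the unexposed."""
    return (exposed_events / exposed_total) / (unexposed_events / unexposed_total)

# Hypothetical cohort with complete follow-up: 200/1000 exposed and
# 100/1000 unexposed develop the outcome.
full_rr = risk_ratio(200, 1000, 100, 1000)

# Assumption: 100 exposed participants who would have developed the outcome
# drop out before it is observed, and nobody is lost from the unexposed group.
lost = 100
observed_rr = risk_ratio(200 - lost, 1000 - lost, 100, 1000)

print(f"Risk ratio with complete follow-up: {full_rr:.2f}")
print(f"Risk ratio after differential dropout: {observed_rr:.2f}")
```

Because the people who left differ from those who stayed, the observed risk ratio falls from 2 to about 1.1. Dropout unrelated to exposure and outcome would reduce precision but would not shift the estimate in this way.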

Information bias occurs after recruitment: systematic differences in data collection can lead to misclassification of exposure or disease.
- Non-differential: Degree of misclassification is similar between groups.
- Differential: Degree of misclassification differs between groups.

Non-differential Misclassification: The percentage of errors is about equal in the groups being compared.
If there is a real association, non-differential misclassification makes the groups appear more similar, leading to an underestimation of the association (bias toward the null).

Example: A case-control study used diagnostic codes to estimate the association between diabetes and risk of coronary heart disease (CHD).
Diabetes was under-reported by about 50% in both cases and controls.
True OR = (40/60)/(10/90) = 6
Observed OR = (20/80)/(5/95) = 4.75, which is biased toward the null.
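The two odds ratios above can be reproduced with a few lines of code. This sketch assumes the figures are counts of diabetic and non-diabetic subjects among 100 cases and 100 controls, and that under-reported diabetics are counted as non-diabetic. `under_report` is a hypothetical helper.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (exposed_cases / unexposed_cases) / (exposed_controls / unexposed_controls)

def under_report(exposed, unexposed, fraction_missed):
    """Move a fraction of truly exposed subjects into the unexposed column."""
    missed = exposed * fraction_missed
    return exposed - missed, unexposed + missed

# Assumed true counts: (diabetic, non-diabetic).
true_cases = (40, 60)
true_controls = (10, 90)

true_or = odds_ratio(*true_cases, *true_controls)

# The same 50% under-reporting in both groups (non-differential).
observed_cases = under_report(*true_cases, 0.5)        # (20, 80)
observed_controls = under_report(*true_controls, 0.5)  # (5, 95)
observed_or = odds_ratio(*observed_cases, *observed_controls)

print(f"True OR: {true_or:.2f}")          # True OR: 6.00
print(f"Observed OR: {observed_or:.2f}")  # Observed OR: 4.75
```

Changing `fraction_missed` while keeping it equal in both groups moves the observed odds ratio closer to or further from the true value, but it stays between 1 and 6 in this example.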

Sources of non-differential misclassification:
Equally inaccurate memory of exposures in both groups.
- Example: Difficulty accurately remembering exercise frequency, duration, and intensity.
Recording and coding errors in records and databases.
- Example: Errors made when assigning ICD-11 diagnostic codes.
Using surrogate measures of exposure.
- Example: Prescriptions for anti-hypertensive medications.
Non-specific or broad definitions of exposure or outcome.
- Example: Studying effects of environmental tobacco smoke exposure.

Sources of differential misclassification:
Recall Bias: Individuals with certain characteristics may have differential recall of exposure or outcome information.
- Example: People with lung cancer may recall their smoking history more completely than healthy controls do.
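When recall differs between cases and controls, the odds ratio can move in either direction. This sketch uses hypothetical counts (40 exposed and 60 unexposed cases, 10 exposed and 90 unexposed controls) and assumed recall sensitivities. `observed_counts` is an illustrative helper.

```python
def observed_counts(exposed, unexposed, sensitivity):
    """Counts after a fraction (1 - sensitivity) of truly exposed subjects
    fail to report their exposure and are recorded as unexposed."""
    reported = exposed * sensitivity
    return reported, unexposed + (exposed - reported)

def odds_ratio(cases, controls):
    """Cross-product odds ratio from (exposed, unexposed) pairs."""
    (a, b), (c, d) = cases, controls
    return (a * d) / (b * c)

# Hypothetical true counts: (exposed, unexposed).
true_cases, true_controls = (40, 60), (10, 90)
true_or = odds_ratio(true_cases, true_controls)

# Cases recall exposure fully, controls recall only half of it.
or_controls_forget = odds_ratio(
    observed_counts(*true_cases, 1.0), observed_counts(*true_controls, 0.5)
)

# The reverse: cases under-report, controls recall fully.
or_cases_forget = odds_ratio(
    observed_counts(*true_cases, 0.5), observed_counts(*true_controls, 1.0)
)

print(f"True OR: {true_or:.2f}")
print(f"Controls under-report: {or_controls_forget:.2f}")
print(f"Cases under-report: {or_cases_forget:.2f}")
```

Unlike the non-differential case, the bias here is not guaranteed to be toward the null: better recall among cases inflates the association, and worse recall among cases shrinks it.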

Surrogate Bias: Proxy respondents may vary in accuracy depending on their relationship to the participant.

Interviewer/Observer Bias: Different observers may have different levels of skill or subjectivity.
- Example: If more experienced surgeons assess outcomes in one group than in the other, outcomes may be classified differently between the groups.

Reporting/Measurement Bias: Systematic differences in the accuracy or quality of reported or recorded information.
May occur because subjects are reluctant to report an exposure due to attitudes, beliefs, and perceptions.

Surveillance/Diagnostic Bias: Diagnostic criteria or methods differ between groups.
One group receives more thorough diagnostic procedures.