Error and Bias Notes
Learning Objectives
Define random and systematic error.
Identify different types of biases.
Describe how bias can affect study results.
Understand ways to reduce bias in study design, implementation, and analysis.
Is the Association Real or an Error?
A study might suggest an exposure is associated with a disease, but it's crucial to determine if the association is real or due to error.
Types of Error
Random Error: Error in measurement typically caused by factors that vary from one measurement to another, due to chance or 'noise'.
Systematic Error (Bias): Error in the design, implementation, or analysis of a study.
Random Error Explained
Random error is variability or inaccuracy in measurement due to chance.
Example of Random Error
Estimating the mean weight of the UCSD freshman class by enrolling only 6 subjects may not be accurate, because so small a sample can differ from the class average purely by chance.
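The effect of sample size on random error can be illustrated with a small simulation. This is a minimal sketch, not part of the original notes: the class size of 5,000 and the weight distribution (mean 68 kg, SD 12 kg) are invented for illustration, and sample_mean_spread is a hypothetical helper name.

```python
import random
import statistics

def sample_mean_spread(population, sample_size, n_repeats, rng):
    """Draw repeated random samples and return the SD of their means."""
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_repeats)
    ]
    return statistics.stdev(means)

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical class of 5,000 students; the weights (kg) are simulated, not real data.
population = [rng.gauss(68, 12) for _ in range(5000)]

spread_small = sample_mean_spread(population, 6, 1000, rng)
spread_large = sample_mean_spread(population, 600, 1000, rng)

print(f"SD of sample means, n=6:   {spread_small:.2f} kg")
print(f"SD of sample means, n=600: {spread_large:.2f} kg")
```

Under these assumptions the means of 6-person samples scatter several kilograms around the class mean, while the means of 600-person samples scatter far less. Random error shrinks as the sample grows; systematic error does not.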
Examples of Random Error
Measurement Error: Variability or inaccuracy in measuring variables or outcomes. For example, repeated blood pressure readings on the same person can differ from one measurement to the next.
Sampling Error: Arises when a sample differs from the larger population by chance. It occurs because of natural variability within a population and shrinks as sample size grows.
Data Entry Error: Mistakes made during data entry, such as mistyping values, which introduce variability.
Observer Variability: Unsystematic differences in how observers interpret or record information, leading to random error. (Systematic differences between observers are observer bias, a form of information bias.)
Timing Variability: Natural fluctuations in disease occurrence or biological measurements over time.
Biological Variation: Inherent differences within individuals or populations that can introduce random error when measuring biological markers.
Broad Categories of Systematic Error or Bias
Selection
Information
Confounding (to be discussed in the next lecture)
Examples of Systematic Error or Bias
Selection Bias: Systematic difference between the characteristics of selected study participants and the target population.
Example: A smoking prevalence study recruits participants from a community with a higher smoking rate, overestimating the prevalence in the general population.
Information Bias: Systematic difference in how data is collected, recorded, or reported, leading to misclassification of exposure or outcome variables.
Example: Individuals with a certain disease are more likely to recall exposure to a specific risk factor, introducing bias.
Confounding: The relationship between exposure and outcome is distorted by a third variable associated with both.
Example: In a study of coffee consumption and heart disease, age could be a confounder if it is associated with both coffee consumption and the risk of heart disease.
Selection vs. Information Bias
Selection Bias Question: Is this study biased because of who is participating?
Information Bias Question: Is this study biased because of the validity of the information collected?
Selection Bias Details
Occurs during the recruitment stage and/or retention process.
Individuals have different probabilities of being included in the study sample.
Difficult to correct for during analysis.
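Unequal inclusion probabilities can be made concrete with a short calculation. The sketch below is illustrative only: the 20% smoking prevalence and the inclusion probabilities are invented, and observed_prevalence is a hypothetical helper name.

```python
def observed_prevalence(true_prevalence, p_include_with, p_include_without):
    """Expected prevalence in the sample when the chance of inclusion
    depends on whether a person has the characteristic."""
    included_with = true_prevalence * p_include_with
    included_without = (1 - true_prevalence) * p_include_without
    return included_with / (included_with + included_without)

# Hypothetical target population in which 20% of people smoke.
equal = observed_prevalence(0.20, 0.5, 0.5)    # same inclusion chance for everyone
unequal = observed_prevalence(0.20, 0.6, 0.3)  # smokers twice as likely to be included

print(f"Equal inclusion:   {equal:.3f}")
print(f"Unequal inclusion: {unequal:.3f}")
```

With equal inclusion probabilities the sample reproduces the true 20% prevalence. When smokers are twice as likely to be included, the expected sample prevalence rises to about one third, and enrolling more people does not remove the distortion.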
Selection Bias – Self-Selection/Non-Responder Bias
Occurs when individuals self-select to participate, resulting in a non-representative sample.
Selection Bias – Case or Control Selection Bias
Arises when there is a systematic difference in how cases and controls are selected.
Leads to a biased representation of the target population.
Distorts the estimation of association between exposure and outcome.
Selection Bias – Healthy User/Worker
Arises when healthier or more conscientious participants are more likely to enroll, overestimating intervention effectiveness.
Selection Bias – Differential Loss to Follow Up
Occurs when the participants who drop out differ in their characteristics from those who remain.
Distorts the estimated association when the loss is related to both the exposure and the outcome.
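A small worked calculation shows why the pattern of dropout matters more than its amount. This is a sketch with invented cohort counts; risk_ratio is a hypothetical helper, and the dropout scenarios are assumptions chosen for illustration.

```python
def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)

# Hypothetical cohort with no true association: 100 cases per 1,000 in each group.
rr_full = risk_ratio(100, 1000, 100, 1000)

# Even loss: 10% of every subgroup drops out, regardless of exposure or outcome.
rr_even_loss = risk_ratio(90, 900, 90, 900)

# Differential loss: half of the exposed participants who develop disease
# are lost before their outcome is recorded; nobody else drops out.
rr_differential_loss = risk_ratio(50, 950, 100, 1000)

print(round(rr_full, 2), round(rr_even_loss, 2), round(rr_differential_loss, 2))
```

In this sketch, even loss leaves the risk ratio at 1, while differential loss pulls it down to roughly 0.53 and creates an apparent protective effect that does not exist.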
Information Bias Details
Systematic differences in data collection can lead to misclassification of exposure or disease.
Non-differential: Degree of misclassification is similar between groups.
Differential: Degree of misclassification differs between groups.
Occurs after recruitment.
Information Bias – Non-Differential Misclassification
The percentage of errors is about equal in the groups being compared.
If there is a real association, non-differential misclassification makes the groups appear more similar, leading to an underestimation of the association (bias toward the null).
Information Bias – Non-Differential Misclassification Example
A case-control study used diagnostic codes to estimate the association between diabetes and risk of coronary heart disease (CHD).
Diabetes was under-reported by about 50%.
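True (no misclassification): OR = (40/60)/(10/90) = 6
Observed (diabetes under-reported by 50% in both cases and controls): OR = (20/80)/(5/95) ≈ 4.75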
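The two odds ratios can be reproduced step by step. The sketch below assumes the counts are per 100 cases and per 100 controls (40 vs. 60 cases and 10 vs. 90 controls with and without diabetes), which is an interpretation of the formulas rather than something the notes state. The helper names odds_ratio and under_report are hypothetical.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (exposed_cases / unexposed_cases) / (exposed_controls / unexposed_controls)

def under_report(exposed, unexposed, fraction_missed):
    """Move a fraction of truly exposed people into the unexposed column."""
    missed = exposed * fraction_missed
    return exposed - missed, unexposed + missed

# Assumed true counts: diabetes vs. no diabetes among CHD cases and controls.
true_or = odds_ratio(40, 60, 10, 90)

# Non-differential: the same 50% of diabetes goes unrecorded in both groups.
cases = under_report(40, 60, 0.5)     # (20.0, 80.0)
controls = under_report(10, 90, 0.5)  # (5.0, 95.0)
observed_or = odds_ratio(cases[0], cases[1], controls[0], controls[1])

print(round(true_or, 2), round(observed_or, 2))  # 6.0 4.75
```

The observed odds ratio moves from 6 toward 1, which is the bias toward the null described above.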
What Causes Non-Differential Misclassifications?
Equally inaccurate memory of exposures in both groups.
Example: Difficulty accurately remembering exercise frequency, duration, and intensity.
Recording and coding errors in records and databases.
Example: Assigning or entering the wrong diagnostic code when using a classification system such as ICD-11.
Using surrogate measures of exposure.
Example: Using prescriptions for anti-hypertensive medications as a surrogate for hypertension.
Non-specific or broad definitions of exposure or outcome.
Example: Studying the effects of environmental tobacco smoke exposure without specifying its source, intensity, or duration.
Information Bias – Differential Misclassification
Recall Bias: Individuals with certain characteristics may have differential recall of exposure or outcome information.
Example: In a case-control study, people with lung cancer may recall their smoking history more completely than healthy controls do.
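Unlike non-differential misclassification, differential recall can create an association where none exists. The following sketch uses invented counts and a hypothetical odds_ratio helper. It assumes that cases report their exposure completely while controls forget half of theirs.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (exposed_cases / unexposed_cases) / (exposed_controls / unexposed_controls)

# Hypothetical truth: 20 of 100 cases and 20 of 100 controls were exposed,
# so there is no real association.
true_or = odds_ratio(20, 80, 20, 80)

# Assumed recall pattern: cases report all 20 exposures,
# controls report only 10 of their 20.
recall_biased_or = odds_ratio(20, 80, 10, 90)

print(round(true_or, 2), round(recall_biased_or, 2))  # 1.0 2.25
```

Here the bias runs away from the null. With other recall patterns, differential misclassification can push the estimate in either direction.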
Surrogate Bias: Proxy respondents may vary in accuracy depending on their relationship to the participant.
Interviewer/Observer Bias: Different observers may have different levels of skill or subjectivity.
Example: More experienced surgeons may assess outcomes differently from less experienced ones, resulting in differential misclassification of outcomes.
Reporting/Measurement Bias: Systematic differences in the accuracy or quality of reported or recorded information.
May occur because subjects are reluctant to report an exposure due to attitudes, beliefs, and perceptions.
Surveillance/Diagnostic Bias: Diagnostic criteria or methods differ between groups.
One group receives more thorough diagnostic procedures.