Error and Bias Notes

Learning Objectives

  • Define random and systematic error.

  • Identify different types of biases.

  • Describe how bias can affect study results.

  • Understand ways to reduce bias in study design, implementation, and analysis.

Is the Association Real or an Error?

  • A study might suggest an exposure is associated with a disease, but it's crucial to determine if the association is real or due to error.

Types of Error

  • Random Error: Error in measurement typically caused by factors that vary from one measurement to another, due to chance or 'noise'.

  • Systematic Error (Bias): Error introduced by the design, implementation, or analysis of a study that shifts results consistently in one direction, away from the true value.
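
The difference between the two types of error can be illustrated with a small simulation. This is only a sketch: the true weight, the noise level, and the miscalibration offset are made-up values, not figures from the lecture. Averaging many readings cancels most of the random error, but the systematic offset stays in the average.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_WEIGHT_KG = 70.0  # hypothetical true value
NOISE_SD_KG = 2.0      # hypothetical random scatter of a single reading
OFFSET_KG = 1.5        # hypothetical miscalibration (systematic error)

def simulate_readings(n, offset=0.0):
    """Return n readings: true value + fixed offset + random noise."""
    return [TRUE_WEIGHT_KG + offset + random.gauss(0.0, NOISE_SD_KG)
            for _ in range(n)]

# Random error only: the mean of many readings lands close to the true value.
unbiased_mean = statistics.mean(simulate_readings(10_000))

# Random error plus a fixed offset: the mean stays shifted by about the offset.
biased_mean = statistics.mean(simulate_readings(10_000, offset=OFFSET_KG))

print(f"mean with random error only:   {unbiased_mean:.2f} kg")
print(f"mean with a systematic offset: {biased_mean:.2f} kg")
```

Taking more readings helps with the first situation but not the second, which is why bias has to be addressed in design rather than by increasing sample size.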

Random Error Explained

  • Random error is variability or inaccuracy in measurement due to chance.

Example of Random Error

  • Estimating the mean weight of the UCSD freshman class from only 6 subjects may be inaccurate because of random variation; a different set of 6 subjects would likely give a different mean.
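
A minimal sketch of why 6 subjects is too few, using a simulated population. The population size, mean, and standard deviation below are assumptions chosen for illustration, not data about any real class. The code draws many random samples and measures how much the sample mean moves from one sample to the next.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population of 5,000 weights in kg (mean and SD are made up).
population = [random.gauss(68.0, 12.0) for _ in range(5000)]

def spread_of_sample_means(sample_size, repeats=2000):
    """Standard deviation of the sample mean across repeated random samples."""
    means = [statistics.mean(random.sample(population, sample_size))
             for _ in range(repeats)]
    return statistics.stdev(means)

spread_small = spread_of_sample_means(6)
spread_large = spread_of_sample_means(600)

print(f"spread of the mean with n=6:   {spread_small:.2f} kg")
print(f"spread of the mean with n=600: {spread_large:.2f} kg")
```

The spread for n=6 is several times larger than for n=600, so any single estimate from 6 subjects can land far from the population mean purely by chance.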

Examples of Random Error

  • Measurement Error: Variability or inaccuracy in measuring variables or outcomes. For example, blood pressure readings that vary from one device or one reading to the next.

  • Sampling Error: Arises when a sample, by chance, does not mirror the larger population. It reflects natural variability within a population.

  • Data Entry Error: Mistakes made during data entry, such as mistyping values, which introduce variability.

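  • Observer Variability: Unsystematic differences in how observers interpret or record information, which introduce random error. (Systematic differences between observers are observer bias, a form of information bias.)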

  • Timing Variability: Natural fluctuations in disease occurrence or biological measurements over time.

  • Biological Variation: Inherent differences within individuals or populations that can introduce random error when measuring biological markers.

Broad Categories of Systematic Error or Bias

  • Selection

  • Information

  • Confounding (to be discussed in the next lecture)

Examples of Systematic Error or Bias

  • Selection Bias: Systematic difference between the characteristics of selected study participants and the target population.

    • Example: A smoking prevalence study recruits participants from a community with a higher smoking rate than the general population, so it overestimates prevalence in the general population.

  • Information Bias: Systematic difference in how data is collected, recorded, or reported, leading to misclassification of exposure or outcome variables.

    • Example: Individuals with a certain disease are more likely to recall exposure to a specific risk factor, introducing bias.

  • Confounding: The relationship between exposure and outcome is distorted by a third variable associated with both.

    • Example: In a study of coffee consumption and heart disease, age could be a confounder if it is associated with both coffee consumption and heart disease risk.
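
The coffee example can be sketched with invented numbers. In the hypothetical counts below, coffee has no effect on heart disease within either age group (the odds ratio is 1 in each stratum), but older people both drink more coffee and have higher risk, so pooling the strata produces a spurious association. The helper name `odds_ratio` and all counts are assumptions for illustration.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_noncases, unexposed_noncases):
    """Odds ratio from a 2x2 table: (a/b)/(c/d), computed as (a*d)/(b*c)."""
    return (exposed_cases * unexposed_noncases) / (unexposed_cases * exposed_noncases)

# Hypothetical counts per age group, in the argument order of odds_ratio().
strata = {
    "younger": (10, 40, 190, 760),   # few coffee drinkers, 5% risk in both groups
    "older": (160, 40, 640, 160),    # many coffee drinkers, 20% risk in both groups
}

# Within each age group, coffee drinkers and non-drinkers have the same odds.
stratum_ors = {name: odds_ratio(*counts) for name, counts in strata.items()}

# Ignoring age: add the tables cell by cell, then compute the crude odds ratio.
pooled = [sum(cells) for cells in zip(*strata.values())]
crude_or = odds_ratio(*pooled)

for name, value in stratum_ors.items():
    print(f"OR among {name} participants: {value:.2f}")
print(f"crude OR ignoring age: {crude_or:.2f}")
```

The crude odds ratio is well above 1 even though both stratum-specific odds ratios are exactly 1, which is the distortion the definition describes.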

Selection vs. Information Bias

  • Selection Bias Question: Is this study biased because of who is participating?

  • Information Bias Question: Is this study biased because of the validity of the information collected?

Selection Bias Details

  • Occurs during the recruitment stage and/or retention process.

  • Individuals have different probabilities of being included in the study sample.

  • Difficult to correct for during analysis.

Selection Bias - Self-Selection/Non-Responder Bias

  • Occurs when individuals self-select to participate, resulting in a non-representative sample.
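
As a rough sketch of how self-selection shifts an estimate, suppose people with a trait (say, current smoking) respond to a survey at a different rate than people without it. The function name and all rates below are hypothetical.

```python
def observed_prevalence(true_prevalence, response_rate_with, response_rate_without):
    """Expected prevalence among responders when response depends on the trait."""
    responders_with = true_prevalence * response_rate_with
    responders_without = (1 - true_prevalence) * response_rate_without
    return responders_with / (responders_with + responders_without)

# Hypothetical: 20% of the population smokes.
equal_response = observed_prevalence(0.20, 0.50, 0.50)    # everyone equally likely to respond
unequal_response = observed_prevalence(0.20, 0.30, 0.60)  # smokers half as likely to respond

print(f"equal response rates:   {equal_response:.3f}")
print(f"unequal response rates: {unequal_response:.3f}")
```

With equal response rates the sample reproduces the true 20%; when smokers respond half as often, the expected estimate drops to about 11%.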

Selection Bias - Case or Control Selection Bias

  • Arises when there is a systematic difference in how cases and controls are selected.

  • Leads to a biased representation of the target population.

  • Distorts the estimation of association between exposure and outcome.
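
A minimal sketch of this distortion, using invented counts and selection probabilities. The source population below has no association (OR = 1), but exposed cases are assumed to be twice as likely to be enrolled as unexposed cases, for example because they are referred more often. The function names are hypothetical.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Exposure odds in cases divided by exposure odds in controls."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

def apply_selection(table, selection_probs):
    """Expected number enrolled from each cell, given per-cell selection probabilities."""
    return tuple(count * prob for count, prob in zip(table, selection_probs))

# Hypothetical source population, in the argument order of odds_ratio().
source_population = (100, 100, 1000, 1000)

# Hypothetical selection probabilities: exposed cases are over-selected.
selection_probs = (0.8, 0.4, 0.1, 0.1)

true_or = odds_ratio(*source_population)
observed_or = odds_ratio(*apply_selection(source_population, selection_probs))

print(f"OR in the source population: {true_or:.2f}")
print(f"OR among those selected:     {observed_or:.2f}")
```

The study would report an odds ratio of about 2 where none exists. If selection probabilities depended only on case status, or only on exposure, they would cancel out of the odds ratio and no bias would appear in this sketch.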

Selection Bias – Healthy User/Worker

  • Arises when healthier or more conscientious participants are more likely to enroll, which can lead to overestimating intervention effectiveness.

Selection Bias – Differential Loss to Follow Up

  • Occurs when participants who drop out differ systematically from those who remain.

  • Distorts the results when the loss is related to both exposure and outcome.
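
A sketch of the effect in a hypothetical cohort. All risks and retention rates are invented: the exposed group has twice the risk of disease, but exposed participants who become ill are assumed to drop out more often before the outcome is recorded.

```python
def observed_risk(n_at_start, true_risk, retention_if_diseased, retention_if_healthy):
    """Expected risk among participants who remain under follow-up."""
    diseased_seen = n_at_start * true_risk * retention_if_diseased
    healthy_seen = n_at_start * (1 - true_risk) * retention_if_healthy
    return diseased_seen / (diseased_seen + healthy_seen)

# Hypothetical true risks: 20% in the exposed, 10% in the unexposed.
true_rr = 0.20 / 0.10

# Exposed participants who develop disease are retained only 50% of the time;
# every other subgroup is retained 90% of the time.
risk_exposed = observed_risk(1000, 0.20, retention_if_diseased=0.5, retention_if_healthy=0.9)
risk_unexposed = observed_risk(1000, 0.10, retention_if_diseased=0.9, retention_if_healthy=0.9)
observed_rr = risk_exposed / risk_unexposed

print(f"true risk ratio:     {true_rr:.2f}")
print(f"observed risk ratio: {observed_rr:.2f}")
```

In the unexposed group, loss is unrelated to outcome, so the observed risk stays at 10%. In the exposed group, the missing participants are disproportionately the ill ones, which pulls the observed risk ratio from 2 down to roughly 1.2.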

Information Bias Details

  • Systematic differences in data collection can lead to misclassification of exposure or disease.

    • Non-differential: Degree of misclassification is similar between groups.

    • Differential: Degree of misclassification differs between groups.

  • Occurs after recruitment.

Information Bias – Non-Differential Misclassification

  • The percentage of errors is about equal in the groups being compared.

  • If there is a real association, non-differential misclassification makes the groups appear more similar, leading to an underestimation of the association (bias toward the null).

Information Bias – Non-Differential Misclassification Example

  • A case-control study used diagnostic codes to estimate the association between diabetes and risk of coronary heart disease (CHD).

  • Diabetes was under-reported by about 50% in both cases and controls.

  • True OR = (40/60)/(10/90) = 6

  • Observed OR with under-reporting = (20/80)/(5/95) = 4.75 (biased toward the null)
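
The two odds ratios above can be reproduced with a short sketch. It reads the first calculation as the true counts (40 of 100 cases and 10 of 100 controls with diabetes) and the second as what the diagnostic codes show when half of the diabetes diagnoses are missed in both groups. The function names are hypothetical.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Exposure odds in cases divided by exposure odds in controls."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

def underreport_exposure(exposed, unexposed, sensitivity):
    """Move the exposed people who are missed into the 'unexposed' count."""
    detected = exposed * sensitivity
    missed = exposed - detected
    return detected, unexposed + missed

true_cases = (40, 60)     # CHD cases: with diabetes, without diabetes
true_controls = (10, 90)  # controls: with diabetes, without diabetes

# Same 50% detection in both groups, so the misclassification is non-differential.
observed_cases = underreport_exposure(*true_cases, sensitivity=0.5)
observed_controls = underreport_exposure(*true_controls, sensitivity=0.5)

true_or = odds_ratio(*true_cases, *true_controls)
observed_or = odds_ratio(*observed_cases, *observed_controls)

print(f"true OR:     {true_or:.2f}")
print(f"observed OR: {observed_or:.2f}")
```

The observed value is closer to 1 than the true value, which is the bias toward the null described earlier.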

What Causes Non-Differential Misclassifications?

  • Equally inaccurate memory of exposures in both groups.

    • Example: Difficulty accurately remembering exercise frequency, duration, and intensity.

  • Recording and coding errors in records and databases.

    • Example: Coding errors when diagnoses are recorded with the ICD-11 system.

  • Using surrogate measures of exposure.

    • Example: Using prescriptions for anti-hypertensive medications as a proxy for hypertension.

  • Non-specific or broad definitions of exposure or outcome.

    • Example: Studying effects of environmental tobacco smoke exposure.

Information Bias – Differential Misclassification

  • Recall Bias: Individuals with certain characteristics may have differential recall of exposure or outcome information.

    • Example: In a case-control study, people with lung cancer may recall their smoking history more completely than healthy controls do.
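
Recall bias can be sketched with invented numbers. Below, cases and controls are assumed to have identical true exposure (OR = 1), but cases report 90% of their true exposures and controls only 60%. The function names and all values are hypothetical.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Exposure odds in cases divided by exposure odds in controls."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

def recalled_counts(exposed, unexposed, recall_rate):
    """Exposed people who fail to recall the exposure are counted as unexposed."""
    reported = exposed * recall_rate
    return reported, unexposed + (exposed - reported)

true_cases = (30, 70)     # truly exposed, truly unexposed
true_controls = (30, 70)  # same exposure distribution, so the true OR is 1

# Recall differs between the groups, so the misclassification is differential.
reported_cases = recalled_counts(*true_cases, recall_rate=0.9)
reported_controls = recalled_counts(*true_controls, recall_rate=0.6)

true_or = odds_ratio(*true_cases, *true_controls)
observed_or = odds_ratio(*reported_cases, *reported_controls)

print(f"true OR:     {true_or:.2f}")
print(f"observed OR: {observed_or:.2f}")
```

Here the bias moves the estimate away from the null and creates an association that is not there. Unlike the non-differential case, differential misclassification can bias results in either direction, depending on which group misreports more.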

Information Bias – Differential Misclassification

  • Surrogate Bias: Proxy respondents may vary in accuracy depending on their relationship to the participant.

Information Bias – Differential Misclassification

  • Interviewer/Observer Bias: Different observers may have different levels of skill or subjectivity.

    • Example: More experienced surgeons may assess outcomes differently from less experienced ones, resulting in differential misclassification of outcomes.

Information Bias – Differential Misclassification

  • Reporting/Measurement Bias: Systematic differences in the accuracy or quality of reported or recorded information.

  • May occur because subjects are reluctant to report an exposure due to attitudes, beliefs, and perceptions.

Information Bias – Differential Misclassification

  • Surveillance/Diagnostic Bias: Diagnostic criteria or methods differ between groups.

  • For example, one group receives more thorough diagnostic procedures, so more of its disease is detected.