
Reliability and Validity in Measurement

  • Introduction to Evidence in Practice

    • Focus on measuring constructs (e.g., pain, disability, strength).
    • Importance of choosing the right measure for each construct.
  • Reliability

    • Definition: The extent to which a measurement is free from error.
    • A reliable measure provides consistent results across repetitions.
    • Example: Measuring height multiple times with a tape measure yielding the same result indicates reliability.
    • Unreliable measures lead to inconsistent diagnoses or assessments (e.g., the same patient being classified as having a condition on one assessment and not having it on another).
    • Types of reliability (see the sketch after this section):
      • Intrarater Reliability: The same rater measures the same patient multiple times; consistent scores across repetitions indicate reliability.
      • Interrater Reliability: Different raters measure the same patient; consistent scores across raters indicate reliability.
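
A minimal sketch of how interrater reliability could be quantified for a continuous measure, using the ICC(2,1) formulation of Shrout and Fleiss (two-way random effects, absolute agreement, single measurement). The range-of-motion values, the two-rater layout, and the icc_2_1 helper are hypothetical, chosen only for illustration.

```python
import numpy as np

def icc_2_1(ratings: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    ratings has shape (n_subjects, k_raters); for intrarater reliability the
    second axis would instead hold repeated sessions by the same rater.
    """
    n, k = ratings.shape
    grand_mean = ratings.mean()

    # Two-way ANOVA sums of squares.
    ss_rows = k * np.sum((ratings.mean(axis=1) - grand_mean) ** 2)  # subjects
    ss_cols = n * np.sum((ratings.mean(axis=0) - grand_mean) ** 2)  # raters
    ss_total = np.sum((ratings - grand_mean) ** 2)
    ss_error = ss_total - ss_rows - ss_cols

    # Mean squares.
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Shrout & Fleiss (1979) formula for ICC(2,1).
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical data: knee flexion range of motion (degrees) for five patients,
# each measured once by two different raters.
rom = np.array([
    [130.0, 128.0],
    [115.0, 117.0],
    [142.0, 140.0],
    [ 98.0, 101.0],
    [125.0, 124.0],
])
print(f"Interrater ICC(2,1): {icc_2_1(rom):.2f}")  # values near 1 indicate high agreement
```
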
  • Validity

    • Definition: The extent to which a score truly reflects the construct it is measuring.
    • Validity is straightforward for directly observable measures (e.g., height) but harder to establish for latent constructs (e.g., pain, disability).
    • Example: Roland-Morris Disability Questionnaire assesses back pain-related disability; validity depends on whether questions accurately represent disability.
    • Types of validity:
      • Criterion Validity: Scores are compared against a "gold standard" measure of the same construct (e.g., a clinical test for ACL rupture checked against arthroscopic visualization).
        • Gold standards may not exist for many constructs, necessitating reference standards or hypothesis testing instead.
      • Construct Validity (Hypothesis Testing): Hypotheses about how the measure should relate to other measures are specified before data collection and then tested; results consistent with the hypotheses support validity (see the sketch after this section).
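
A minimal sketch of the hypothesis-testing route to construct validity: a relationship is specified before the data are examined, then checked against the observed result. The Roland-Morris and pain-rating numbers and the r >= 0.50 threshold below are invented purely for illustration.

```python
import numpy as np

# Prespecified hypothesis (fixed before data collection): Roland-Morris
# disability scores should correlate at least moderately (r >= 0.50) with
# pain intensity on a 0-10 numeric rating scale.
THRESHOLD = 0.50

# Hypothetical paired scores for ten patients.
rmdq = np.array([4, 10, 7, 15, 2, 20, 12, 9, 17, 5], dtype=float)
nprs = np.array([3,  6, 5,  8, 2,  9,  7, 5,  8, 3], dtype=float)

r = np.corrcoef(rmdq, nprs)[0, 1]  # Pearson correlation
supported = r >= THRESHOLD

print(f"Observed correlation: r = {r:.2f}")
print("Hypothesis supported" if supported else "Hypothesis not supported")
```
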
  • Statistics in Reliability and Validity

    • Reliability and validity assessments often involve agreement between scores:
      • For reliability: the same measure taken twice.
      • For validity: different measures compared.
    • Types of statistical tests (see the sketch after this section):
      • Dichotomous measures: kappa, sensitivity/specificity.
      • Continuous measures: intraclass correlation coefficients, correlations, limits of agreement, R².
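
The statistics named above can be sketched in a few lines. The 2x2 counts, the paired grip-strength values, and the helper names (kappa_sens_spec, limits_of_agreement) are hypothetical, and a real analysis would more likely lean on an established statistics package.

```python
import numpy as np

def kappa_sens_spec(tp: int, fp: int, fn: int, tn: int):
    """Agreement statistics for dichotomous scores laid out as a 2x2 table.

    Rows: index test (positive, negative); columns: reference standard.
    Kappa also summarizes rater-vs-rater agreement in reliability studies.
    """
    total = tp + fp + fn + tn
    observed = (tp + tn) / total
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return kappa, sensitivity, specificity

def limits_of_agreement(a: np.ndarray, b: np.ndarray):
    """Bland-Altman 95% limits of agreement for paired continuous scores."""
    diff = a - b
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)
    return bias - half_width, bias + half_width

# Hypothetical 2x2 table: a clinical test for ACL rupture vs. arthroscopy.
k, sens, spec = kappa_sens_spec(tp=18, fp=4, fn=2, tn=26)
print(f"kappa = {k:.2f}, sensitivity = {sens:.2f}, specificity = {spec:.2f}")

# Hypothetical paired grip-strength scores (kg) from two dynamometers.
a = np.array([32.0, 41.5, 27.8, 36.2, 44.0])
b = np.array([31.0, 42.3, 28.5, 35.0, 45.1])
low, high = limits_of_agreement(a, b)
print(f"95% limits of agreement: {low:.2f} to {high:.2f} kg")
```
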
  • Key Points

    • Reliability and validity fall on a spectrum rather than being binary; a measure can be more or less reliable or valid.
    • No measure is perfectly reliable or valid; all measures have inherent variability.
    • Practical considerations impact measure choice:
      • Time to administer.
      • Patient's comprehension of the instructions.
      • Data storage and utilization.
    • Measurement is complex; scrutinize research for evidence of the reliability and validity of its measures.
    • Takeaway: Be cautious in interpreting data without knowledge of a measure's reliability and validity.