Reliability refers to the degree to which repeated measurements of the same phenomenon produce consistent results. It is characterized by:
Consistency
Repeatability
Dependability
Reproducibility
Measurement error takes two forms:
Systematic Errors (Predictable, consistent)
Examples: Improper use of landmarks, improperly calibrated tools
Random Errors (Unpredictable, inconsistent)
Examples: Examiner fatigue, environmental disruptions
Error can arise from any of three sources:
Examiner (e.g., improper technique)
Examined (e.g., fatigue, cooperation)
Examination (e.g., poorly calibrated equipment)
Test-retest Reliability: Consistency of a test over time
Internal Consistency: Correlation among items within a test (Cronbach's alpha)
Intra-rater Reliability: Stability of one rater's measurements over time
Inter-rater Reliability: Agreement between two or more raters
Common statistical measures of reliability (a computational sketch follows this list):
Intraclass Correlation Coefficient (ICC): Ideal for continuous data; reflects both the degree of correlation and the agreement between measurements
>0.90 = Excellent reliability
0.75–0.90 = Good reliability
<0.75 = Poor to moderate reliability
Cronbach's alpha: Measures internal consistency
Kappa statistic: Measures chance-corrected agreement for categorical data
Bland-Altman plots: Visualize agreement between two measurement methods
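The formulas behind these statistics are compact enough to compute directly. Below is a minimal Python sketch, using only NumPy and hypothetical data, of each measure named above: a two-way random-effects ICC (the ICC(2,1) form; which form applies depends on the study design), Cronbach's alpha, Cohen's kappa for two raters, and Bland-Altman limits of agreement.

```python
import numpy as np

def icc_2_1(X):
    """ICC(2,1): two-way random effects, absolute agreement, single measures.
    X is an (n subjects x k raters) matrix of continuous scores."""
    n, k = X.shape
    grand = X.mean()
    ss_rows = k * ((X.mean(axis=1) - grand) ** 2).sum()    # between subjects
    ss_cols = n * ((X.mean(axis=0) - grand) ** 2).sum()    # between raters
    ss_err = ((X - grand) ** 2).sum() - ss_rows - ss_cols  # residual
    msr, msc = ss_rows / (n - 1), ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def cronbach_alpha(items):
    """Cronbach's alpha for an (n respondents x k items) score matrix."""
    k = items.shape[1]
    return k / (k - 1) * (1 - items.var(axis=0, ddof=1).sum()
                          / items.sum(axis=1).var(ddof=1))

def cohen_kappa(r1, r2):
    """Cohen's kappa: chance-corrected agreement between two raters'
    categorical labels."""
    po = np.mean(r1 == r2)                        # observed agreement
    pe = sum(np.mean(r1 == c) * np.mean(r2 == c)  # agreement expected by chance
             for c in np.union1d(r1, r2))
    return (po - pe) / (1 - pe)

def bland_altman_limits(a, b):
    """Bias and 95% limits of agreement between two measurement methods."""
    d = a - b
    bias, sd = d.mean(), d.std(ddof=1)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# Hypothetical data: 5 subjects rated by 3 examiners
scores = np.array([[9.0, 10.0, 9.5],
                   [6.0,  6.5, 6.0],
                   [8.0,  8.5, 8.0],
                   [4.0,  5.0, 4.5],
                   [7.0,  7.5, 7.0]])
print(f"ICC(2,1) = {icc_2_1(scores):.2f}")
print(f"alpha    = {cronbach_alpha(scores):.2f}")
```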
Validity refers to whether a measurement accurately reflects what it is intended to measure.
Face Validity: A simple judgment of whether the test appears to measure what it claims to measure
Content Validity: Extent to which a test covers all aspects of the concept
Construct Validity: Measures an abstract concept like intelligence or personality
Criterion-related Validity: Compares a test to an established gold standard
Concurrent Validity: Test scores correlate with a criterion measured at the same time
Predictive Validity: Test scores predict future outcomes
Responsiveness: Ability to detect meaningful changes over time
Correlation Coefficient (r): Quantifies the strength of association between a test and a criterion measure
r ≥ 0.35 is generally considered acceptable
Often converted to r² (the coefficient of determination) for easier interpretation: the proportion of variability in one measure explained by the other
Effect Size (ES): Measures the magnitude of difference between groups or of change over time (Cohen's benchmarks below; a computational sketch follows this list)
Large Effect = 0.8
Moderate Effect = 0.5
Small Effect = 0.2
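Both statistics reduce to a few lines of code. A minimal Python sketch with hypothetical baseline and follow-up scores, computing Pearson's r, its r² conversion, and Cohen's d (the effect-size formula behind the 0.2 / 0.5 / 0.8 benchmarks above):

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two score vectors."""
    return np.corrcoef(x, y)[0, 1]

def cohens_d(group1, group2):
    """Cohen's d: mean difference scaled by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_sd = np.sqrt(((n1 - 1) * np.var(group1, ddof=1)
                         + (n2 - 1) * np.var(group2, ddof=1)) / (n1 + n2 - 2))
    return (np.mean(group1) - np.mean(group2)) / pooled_sd

# Hypothetical test scores for the same 7 subjects at two time points
baseline  = np.array([12., 15., 14., 10., 18., 16., 13.])
follow_up = np.array([14., 18., 15., 13., 21., 19., 15.])

r = pearson_r(baseline, follow_up)
print(f"r = {r:.2f}, r^2 = {r**2:.2f} (share of variability explained)")
print(f"Cohen's d = {cohens_d(follow_up, baseline):.2f}")
```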
Reliability is necessary for validity, but reliability alone does not guarantee validity.
An instrument can be reliable (consistent) but still invalid (not accurate).
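A toy simulation makes the distinction concrete: a hypothetical scale that is precise but miscalibrated (always reading about 2 kg heavy) yields tightly clustered readings, so it is reliable, yet every reading is wrong, so it is not valid.

```python
import numpy as np

rng = np.random.default_rng(0)
true_weight = 70.0  # kg, hypothetical true value

# A miscalibrated but precise scale: tiny random error, constant +2 kg bias
readings = true_weight + 2.0 + rng.normal(0.0, 0.1, size=10)

print(f"spread (SD) = {readings.std(ddof=1):.2f} kg -> reliable (consistent)")
print(f"mean error  = {readings.mean() - true_weight:+.2f} kg -> invalid (biased)")
```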
To strengthen both reliability and validity:
Use operational definitions for clarity and replication
Conduct proper training for examiners
Regularly inspect and calibrate equipment
Utilize blinded assessments to reduce bias
Separate observation from inference during evaluations
Example findings from a measurement study:
High ICC (0.97), indicating excellent reliability
Strong associations between baseline and follow-up scores, supporting both concurrent and predictive validity
Reliability ensures consistency; validity ensures accuracy.
Employ structured methodologies to enhance both.
ICC is highly recommended for assessing the reliability of continuous data.
Responsiveness is critical for detecting real changes in clinical outcomes.