Measurement: Instrument Validity & Reliability

PUBH 475


Learning Objectives

  • Differentiate between validity and reliability
  • Explain 4 ways to establish instrument validity
  • Explain 4 ways to establish instrument reliability

PSYCHOMETRICS: VALIDITY & RELIABILITY


Validity and Reliability

Validity

  • Definition:
    • Refers to:
      • Research design (internal validity)
      • Ability to generalize based on sampling (external validity)
      • Accuracy of an instrument in measuring what it is supposed to measure (the focus of this lecture)

Reliability

  • Definition:
    • Refers to the consistency or stability of an instrument in measuring whatever it measures.

I. Validity

What is it?

  • Definition:
    • Degree to which an instrument measures what it is supposed to measure
  • Focus:
    • Accuracy
  • Methodologies:
    • Process involves qualitative and quantitative methodologies

REVIEW

  • Concept: an abstract idea attached to a theory
  • Construct: a concept specified as a measurable variable
  • Operationalization: the process of defining how the construct will actually be measured

1. Construct Validity

What is it?

  • Definition:
    • Extent to which an instrument or test measures the construct it is supposed to measure
  • Relevance:
    • Often considered the most important form of validity, because it tests the instrument against its theoretical framework
  • Testing:
    • Tested rigorously for convergent evidence (high correlation with related constructs) and discriminant evidence (low correlation with unrelated constructs); see the sketch below
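
A minimal sketch of convergent and discriminant evidence, using Python's built-in statistics module (3.10+). All scores and scale names are hypothetical: a new anxiety scale should correlate strongly with a related construct and weakly with an unrelated one.

    from statistics import correlation  # Pearson's r; Python 3.10+

    # Hypothetical data: 8 respondents' scores on a new anxiety scale,
    # a related construct (worry), and an unrelated trait.
    new_scale = [10, 14, 8, 17, 12, 6, 15, 11]
    related = [11, 15, 9, 16, 13, 7, 14, 10]      # convergent: expect high r
    unrelated = [64, 70, 67, 62, 71, 66, 68, 69]  # discriminant: expect r near 0

    print(f"convergent r = {correlation(new_scale, related):.2f}")
    print(f"discriminant r = {correlation(new_scale, unrelated):.2f}")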

2. Criterion Validity

What is it?

  • Definition:
    • Extent to which a new instrument or test correlates with an established test (standard or “criterion”) examining a similar theoretical construct
  • Testing:
    • Tested rigorously for concurrent validity (agreement with the criterion measured at the same time) and predictive validity (ability to predict the criterion measured later); see the sketch below
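
A minimal sketch of concurrent and predictive criterion validity, again with invented numbers: concurrent validity correlates new-test scores with an established criterion from the same session, while predictive validity correlates them with an outcome measured later.

    from statistics import correlation  # Pearson's r; Python 3.10+

    new_test = [12, 18, 9, 22, 15, 7, 20, 14]    # hypothetical new instrument
    criterion = [14, 20, 10, 25, 16, 8, 21, 13]  # established test, same session
    outcome = [30, 44, 25, 50, 38, 22, 47, 33]   # outcome measured months later

    print(f"concurrent r = {correlation(new_test, criterion):.2f}")  # expect high
    print(f"predictive r = {correlation(new_test, outcome):.2f}")    # expect high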

3. Content Validity

What is it?

  • Definition:
    • Extent to which an instrument or test adequately samples or captures relevant material of the construct
  • Testing:
    • Tested rigorously to see whether items are redundant or unnecessary, and whether any relevant content is missing
    • Utilizes quantitative and qualitative methods to establish (e.g., expert review of items)
  • Example:
    • Developing a “stress” measurement instrument: items should sample all relevant domains of stress, not just one

4. Face Validity

What is it?

  • Definition:
    • Extent to which an instrument or test appears, on the surface, to measure what it is supposed to measure and is easy for study participants to understand and complete correctly
  • Assessment:
    • Determined by both experts and intended users
  • Example:
    • A depression measurement instrument whose items look relevant to respondents and are easy to read and answer

Ways of Establishing Instrument Validity

  1. Construct validity
    • Is the construct accurately measured by the measurement instrument?
  2. Criterion-related validity
    • Do the measurement instrument scores correlate with scores on some concrete standard or criterion in the real world?
  3. Content validity
    • Does the measurement instrument assess the full range of relevant phenomena?
  4. Face validity
    • Does the measurement instrument appear to cover the phenomenon, and can it be easily understood?

II. Reliability

What is it?

  • Definition:
    • Extent to which an instrument or test will produce the same or nearly the same result each time it is used
  • Focus:
    • Stability or consistency over time
  • Interpretation:
    • Less variation = more reliable

Ways of Establishing Instrument Reliability (1)

  1. Parallel Forms Reliability
    • Extent to which different forms of the same instrument or test produce the same result
    • Example:
      • Different versions of standardized tests
  2. Internal Consistency Reliability
    • Extent of intercorrelations between/among items of the same construct
    • Example:
      • Perceived Stress Scale (PSS): the 10 items should “hang together,” each tapping a related facet of the same underlying stress construct
    • Statistical Measurement:
      • Commonly measured with Cronbach’s alpha ($\alpha$), where values closer to 1 imply greater internal consistency (see the sketch after this list)
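
Cronbach’s alpha can be computed directly from the formula $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i s_i^2}{s_T^2}\right)$, where $k$ is the number of items, $s_i^2$ the variance of item $i$, and $s_T^2$ the variance of respondents’ total scores. A minimal sketch with a hypothetical 3-item scale (a real scale like the 10-item PSS works the same way):

    from statistics import variance

    # rows = respondents, columns = items (hypothetical 0-4 responses)
    responses = [
        [3, 2, 3],
        [1, 1, 2],
        [4, 3, 4],
        [2, 2, 1],
        [3, 4, 3],
    ]

    k = len(responses[0])                                  # number of items
    item_vars = sum(variance(col) for col in zip(*responses))
    total_var = variance([sum(row) for row in responses])  # variance of totals
    alpha = (k / (k - 1)) * (1 - item_vars / total_var)
    print(f"Cronbach's alpha = {alpha:.2f}")  # closer to 1 => more consistent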

Ways of Establishing Instrument Reliability (2)

  3. Test-Retest Reliability
    • Extent of stability of the same measurement over time
    • Example:
      • PSS administered twice, one month apart
  4. Rater Reliability
    • Extent of consistent measurement or rating within the rater (intrarater) or between raters (interrater)
    • Example:
      • Multiple trainers administering physical activity intervention
    • Statistical Measurement:
      • Commonly measured with the kappa coefficient ($\kappa$), where values closer to 1 indicate greater rater agreement (see the sketch after this list)
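
Test-retest reliability is typically quantified as the Pearson correlation between the two administrations (the same computation as the earlier sketches). For rater reliability, Cohen’s kappa corrects observed agreement for chance: $\kappa = \frac{p_o - p_e}{1 - p_e}$, where $p_o$ is observed agreement and $p_e$ is agreement expected by chance. A minimal sketch for two hypothetical raters classifying the same 10 intervention sessions as delivered correctly (1) or not (0):

    rater_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]  # hypothetical ratings
    rater_b = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1]

    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n  # observed agreement

    # chance agreement from each rater's marginal rate of choosing "1"
    p_a1, p_b1 = sum(rater_a) / n, sum(rater_b) / n
    p_e = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)

    kappa = (p_o - p_e) / (1 - p_e)
    print(f"Cohen's kappa = {kappa:.2f}")  # closer to 1 => stronger agreement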

Putting it Together…

  • Reliable but not valid: consistent results that miss what the instrument is supposed to measure
  • Neither reliable nor valid: results are both inconsistent and inaccurate
  • Both reliable and valid: consistent results that measure what they are supposed to measure

Coming Up…

  • Next topic: Questionnaire development (asynchronous)
  • Upcoming: Midterm #2