Measurement: Reliability & Validity
RELIABILITY
Refers to the consistency of the test or measure.
VALIDITY
Refers to the accuracy of the test or measure.
Interrelationship
Reliable But Not Valid: A measure could yield consistent results but not accurately measure what it intends.
Low Reliability And Low Validity: Measures that are not consistent and do not accurately measure the intended construct.
Reliable And Valid: Strength in both consistency and accuracy of what is being measured.
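The three combinations above can be illustrated with a small simulation (a minimal sketch with made-up numbers; the true value, bias, and noise levels are hypothetical): low spread across repeated measurements means high reliability, and a mean near the true value means high validity.

```python
import random
import statistics

random.seed(0)
TRUE_VALUE = 100.0  # the construct's true score (hypothetical)

def simulate(bias, noise_sd, n=1000):
    """Return n simulated measurements with a systematic bias and random noise."""
    return [TRUE_VALUE + bias + random.gauss(0, noise_sd) for _ in range(n)]

reliable_not_valid = simulate(bias=15, noise_sd=1)   # consistent but inaccurate
unreliable_invalid = simulate(bias=15, noise_sd=20)  # inconsistent and inaccurate
reliable_and_valid = simulate(bias=0, noise_sd=1)    # consistent and accurate

for name, scores in [("reliable but not valid", reliable_not_valid),
                     ("low reliability, low validity", unreliable_invalid),
                     ("reliable and valid", reliable_and_valid)]:
    print(f"{name}: mean={statistics.mean(scores):.1f}, sd={statistics.stdev(scores):.1f}")
```

A small standard deviation reflects consistency (reliability); a mean close to 100 reflects accuracy (validity).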
BREAKING DOWN RELIABILITY
Types of Reliability
Test Stability
Determines whether the measure is stable over time.
Internal Consistency
Examines how well the items on the measure work together to produce similar scores.
Inter-Rater Reliability
Assesses how consistently different raters or observers evaluate or judge the same phenomenon.
Impacted by the number of judges involved.
Ensuring raters or observers have proper training is essential for consistent evaluation of the dependent variable (DV).
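Inter-rater agreement for categorical judgments is often summarized with Cohen's kappa, which corrects raw agreement for agreement expected by chance. A minimal sketch with two hypothetical raters coding ten behaviors:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    # Chance agreement: product of each rater's marginal proportions, summed
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)

# Two raters coding the same 10 behaviors (hypothetical data)
a = ["yes", "yes", "no", "yes", "no", "no", "yes", "yes", "no", "yes"]
b = ["yes", "yes", "no", "yes", "no", "yes", "yes", "yes", "no", "no"]
print(round(cohens_kappa(a, b), 2))  # 0.58
```

Here the raters agree on 8 of 10 cases (80%), but kappa is only 0.58 because a substantial share of that agreement would be expected by chance alone.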
Split-Half Reliability (Internal Consistency)
Measures the degree to which randomly divided items from the same test correlate with one another.
Test-Retest Reliability (Stability Over Time)
Determines the degree to which the same test correlates with the same sample on two different occasions.
Correlation Types
Each of these reliability estimates is expressed as a correlation coefficient, which can show:
Positive Correlation: scores on the two variables move in the same direction.
Negative Correlation: scores on the two variables move in opposite directions.
No Correlation: no systematic relationship between the scores.
Cronbach's Alpha (Internal Consistency)
A measure of internal consistency.
Vulnerable to missing data.
Must be interpreted in relation to the number of items included.
Cronbach's Alpha Interpretation (common rule of thumb): ≥ .9 excellent; .8–.9 good; .7–.8 acceptable; .6–.7 questionable; .5–.6 poor; below .5 unacceptable.
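Cronbach's alpha can be computed directly from raw item scores. A minimal sketch with hypothetical data (six respondents on a 4-item scale; all scores invented for illustration):

```python
def cronbachs_alpha(item_scores):
    """Cronbach's alpha from a list of respondents' item-score lists."""
    k = len(item_scores[0])  # number of items

    def variance(xs):  # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical 4-item scale answered by 6 respondents
scores = [[4, 5, 4, 5], [2, 1, 2, 2], [3, 3, 4, 3],
          [5, 5, 5, 4], [1, 2, 1, 1], [4, 4, 3, 4]]
print(round(cronbachs_alpha(scores), 2))  # 0.96
```

Note the role of k in the formula: with more items, alpha tends to be higher even when the average inter-item correlation is unchanged, which is why it must be interpreted in relation to the number of items.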
WAYS TO INCREASE RELIABILITY
Ensure questions are easily understood
Have a sufficient number of questions
Increase your sample size
Ensure proper training among raters
VALIDITY
Types of Validity
Face Validity
Assessment based on whether, "on its face," the items seem like a good translation of the construct.
Content Validity
Addresses how well test questions match the content or subject area they are intended to assess.
Typically judged by experts in a given performance domain.
Effective content validity assumes a well-detailed description of the content domain, which may not always be available.
Criterion-Related Validity
Predictive Validity
Indicates how well a certain measure can predict future behavior or performance.
Convergent Validity
Measures the degree to which a construct measure "converges" with other measures that should assess the same concept.
Discriminant Validity
Indicates the degree to which a construct measure "diverges" from other measures that claim to assess different constructs.