Psychological Assessment: Reliability, Validity, and Utility
Reliability
- Definition: Reliability refers to the extent to which measurements in research are consistent and repeatable.
- Key Quotes:
- (Nunnally): Research requires dependable measurement.
- (Gay): Any random influence that causes scores to vary is a source of measurement error.
Random Errors vs. Systematic Errors:
- Random Errors: Unpredictable fluctuations (e.g., guessing, fatigue, distraction) that lower reliability.
- Systematic Errors: Consistent bias that shifts scores in one direction and lowers validity.
Types of Reliability:
- Test-Retest Reliability:
- Indicates consistency of scores over time.
- Issues: Memory effects, maturation, and learning can influence scores across sessions.
- Equivalent-Forms Reliability:
- Correlates scores from two parallel forms of a test that measure the same construct.
- The two forms must be constructed carefully to ensure equivalence in content and difficulty.
- The resulting coefficient is the coefficient of equivalence (or the coefficient of stability and equivalence when the forms are administered on different occasions).
- Split-Half Reliability:
- Assesses reliability within a single test administration, particularly for long tests.
- The test is split into two halves (often odd-even) to gauge consistency.
- Requires a correction formula, usually the Spearman-Brown prophecy formula.
- Rational Equivalence Reliability:
- Estimates internal consistency from the interrelationships among test items (via formulas such as Kuder-Richardson) rather than by correlating half-test or alternate-form scores.
- Internal Consistency Reliability:
- Measures how all items on a test correlate with one another.
- Kuder-Richardson (KR-20): Approximates the average of all possible split-half coefficients; applies to items scored dichotomously (e.g., right/wrong).
- Cronbach's Alpha: Generalizes KR-20 to items without right/wrong answers; the most widely used estimate for continuous or Likert-scale responses.
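Cronbach's alpha can be computed from the item variances and the variance of total scores: alpha = k/(k−1) × (1 − Σ item variances / total-score variance). A minimal sketch with invented Likert responses:

```python
from statistics import pvariance

# Hypothetical Likert responses (rows = respondents, columns = 4 items)
data = [
    [3, 4, 3, 4],
    [5, 5, 4, 5],
    [2, 2, 3, 2],
    [4, 4, 4, 3],
    [1, 2, 1, 2],
]

k = len(data[0])                                   # number of items
item_vars = [pvariance(col) for col in zip(*data)]  # variance of each item
total_var = pvariance([sum(row) for row in data])   # variance of total scores

# Cronbach's alpha: high when items covary strongly relative to their
# individual variances, i.e., when the items "hang together"
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

With dichotomous (0/1) items, this same formula reduces to KR-20.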
- Standard Error of Measurement:
- Expresses measurement error in score units, estimating how far observed scores typically fall from true scores; computed as SEM = SD × √(1 − reliability).
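The SEM formula above can be applied directly to build an approximate confidence band around an observed score. The standard deviation and reliability values here are hypothetical:

```python
import math

# Hypothetical values: an IQ-like scale
sd = 15.0           # standard deviation of observed scores
reliability = 0.91  # reliability coefficient of the test

# SEM = SD * sqrt(1 - reliability)
sem = sd * math.sqrt(1 - reliability)

# Approximate 68% confidence band around an observed score: score +/- 1 SEM
observed = 100
band = (observed - sem, observed + sem)
```

Note the inverse relationship: as reliability approaches 1.0, the SEM shrinks toward zero and the band tightens around the observed score.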
Validity
- Definition: Validity refers to the extent to which a test measures what it is intended to measure.
- Validity assessments depend on context: the test form, purpose, and target population.
- Types of Validity:
- Content Validity:
- Ensures all content areas of a construct are represented in a test.
- Not empirically derived; assessed logically.
- Example: A test of American geography whose questions focus mostly on New England lacks content validity, because most content areas of the construct go unrepresented.
- Face Validity:
- Refers to whether a test appears, on its surface, to measure what it claims; a matter of appearance rather than a technical guarantee of validity.
- Criterion-Oriented / Predictive Validity:
- Evaluates how well current test scores predict future performance.
- Correlates current test scores with future criteria.
- Concurrent Validity:
- Measures how test scores correspond to another established test administered simultaneously.
- Example: Assessing a new test against an older, established one at the same time.
- Construct Validity:
- Evaluates the degree a test measures a theoretical construct.
- Often involves experiments to correlate test scores with behaviors linked to the construct.
- Example: Validating an anxiety measure where anxiety is expected to increase under stress conditions.
Utility
- Definition: In psychometrics, utility refers to the practical value of a test for decision-making.
- Factors Contributing to Utility:
- Cost efficiency.
- Time savings.
- Comparative utility: A measure of one test’s usefulness relative to another.
- Clinical utility: Utility for diagnostic assessment/treatment purposes.
- Diagnostic utility: How accurately a test classifies people into diagnostic categories, often judged relative to other tests.
- Psychometric Soundness:
- A test is considered psychometrically sound if its reliability and validity coefficients are acceptably high.
- Utility indices reflect how well test scores facilitate better decision-making, particularly regarding cost-effectiveness in outcomes.
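One common way to quantify diagnostic utility is with classification hit rates: compare the test's positive/negative calls against actual diagnostic status and compute accuracy, sensitivity, and specificity. A minimal sketch with invented classifications for ten cases:

```python
# Hypothetical: test classifications vs. actual diagnostic status
# (1 = positive/disordered, 0 = negative/not disordered)
predicted = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
actual    = [1, 0, 0, 0, 1, 0, 1, 1, 0, 1]

tp = sum(p == 1 and a == 1 for p, a in zip(predicted, actual))  # true positives
tn = sum(p == 0 and a == 0 for p, a in zip(predicted, actual))  # true negatives
fp = sum(p == 1 and a == 0 for p, a in zip(predicted, actual))  # false positives
fn = sum(p == 0 and a == 1 for p, a in zip(predicted, actual))  # false negatives

hit_rate    = (tp + tn) / len(actual)  # overall classification accuracy
sensitivity = tp / (tp + fn)           # proportion of true cases detected
specificity = tn / (tn + fp)           # proportion of non-cases correctly cleared
```

Comparing these indices (and the costs of each error type) across competing tests is one concrete way to operationalize comparative utility.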