6. Messick - Validity of Psychological Assessment Study Notes

Validity of Psychological Assessment

Introduction to Validity

  • Definition and Importance

    • The traditional view of validity categorizes it into three types: content, criterion, and construct validity.

    • This categorization is considered incomplete as it overlooks the social value implications and consequences of score use.

Unified Concept of Validity

  • Unified Validity: A new comprehensive concept that integrates content, criteria, and consequences within a construct validity framework.

  • Addresses score meaning and social values in test interpretation and use.

Six Aspects of Construct Validity

  • Construct validity is expanded into six key aspects:

    1. Content

    • Focuses on the relevance and representativeness of test content.

    2. Substantive

    • Involves theoretical rationales and process models underlying performance.

    3. Structural

    • Evaluates if the scoring model aligns with the construct structure.

    4. Generalizability

    • Assesses consistency of score meaning across varied settings and groups.

    5. External

    • Relates scores to other measures to validate their interpretation.

    6. Consequential

    • Examines social implications and value consequences of test use.

Definition of Validity

  • Validity is characterized as an evaluative judgment regarding how well empirical evidence supports the interpretation and actions based on test scores. It is defined not as a property of the test itself but as a feature of the meaning derived from the test scores.

The Value of Validity

  • Validity applies beyond mere test scores; it encompasses any consistent behaviors or attributes observed across various assessment forms, including:

    • Test scores

    • Clinical appraisals

    • Behavioral ratings

  • Valid interpretations necessitate that the score's meaning is sufficiently validated.

Threats to Construct Validity

  • Construct Underrepresentation: Occurs when assessments fail to cover significant dimensions of the construct.

  • Construct-Irrelevant Variance: Reliable variance in scores attributable to constructs other than the one intended, which contaminates score interpretation.

    • Types of variance:

      • Construct-Irrelevant Difficulty: Extraneous factors make a task unduly difficult for some respondents (e.g., heavy reading demands skewing a knowledge assessment).

      • Construct-Irrelevant Easiness: Some respondents perform better because of unrelated task familiarity or clues in the items, inflating their scores.

  • These threats can create biases and unfairness in testing outcomes.
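The contamination described above can be made concrete with a small simulation. This is a hypothetical illustration, not from Messick: we pretend a math test's items carry heavy reading demands, so observed scores reflect reading ability (construct-irrelevant but reliable variance) in addition to the intended math construct. All variable names and coefficients are assumptions chosen for illustration.

```python
# Hypothetical sketch of construct-irrelevant difficulty: a "math" score
# that also loads on reading ability because of item reading demands.
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

random.seed(1)
n = 1000
math_ability = [random.gauss(0, 1) for _ in range(n)]
reading_ability = [random.gauss(0, 1) for _ in range(n)]

# Intended: score measures math only.  Actual: reading demands leak in
# (the 0.6 loading is an illustrative assumption), plus random noise.
observed = [m + 0.6 * r + random.gauss(0, 0.5)
            for m, r in zip(math_ability, reading_ability)]

r_math = pearson(observed, math_ability)        # target construct
r_reading = pearson(observed, reading_ability)  # irrelevant variance
print(f"r with math = {r_math:.2f}, r with reading = {r_reading:.2f}")
```

The nonzero correlation with reading ability is exactly the "excess reliable variance" Messick warns about: it is systematic, not noise, so it distorts interpretation for readers with weaker reading skills.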

Sources of Evidence in Construct Validity

  • Evidence supporting construct validity includes:

    • Internal and external test structure assessments.

    • Theoretical predictions about score changes over time or across different groups.

    • Process inquiries that analyze the cognitive dimensions underlying test performance.

  • Stronger evidence can enhance understanding of score interpretation.

Aspects of Construct Validity Explained

1. Content Aspect
  • Definition: Relevant coverage and technical quality of the assessment tasks.

  • Assessment approaches:

  • Job analyses

  • Curriculum analyses

  • Domain theory

2. Substantive Aspect
  • Definition: Theoretical rationales for observed consistencies during assessment tasks.

  • Demonstrated through empirical engagement evidence, such as think-aloud protocols or response time patterns.

3. Structural Aspect
  • Definition: The conformity of scoring models to the structural nature of the construct domain.

  • Achieving structural fidelity is crucial for valid scoring systems.

4. Generalizability Aspect
  • Examines the broad applicability of score interpretations across various settings, populations, and occasions.

  • Addresses the conflict between depth of task examination and breadth of domain coverage in performance assessments.

5. External Aspect
  • Validates how well assessment scores correlate with external measures as expected by construct theory.

  • Both convergent and discriminant patterns confirm the construct's validity and help isolate it from alternative explanations.
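The convergent/discriminant logic can be sketched with simulated data. This is an illustrative example only (all names and parameters are assumptions, not from Messick): two measures intended to tap the same construct should correlate highly with each other (convergent evidence) and only weakly with a measure of an unrelated construct (discriminant evidence).

```python
# Hypothetical sketch of convergent vs. discriminant correlation patterns.
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

random.seed(0)
n = 500
# Latent traits: verbal ability and an unrelated trait, sociability.
verbal = [random.gauss(0, 1) for _ in range(n)]
social = [random.gauss(0, 1) for _ in range(n)]

# Two measures meant to tap verbal ability; one measure of sociability.
vocab_test = [v + random.gauss(0, 0.5) for v in verbal]
reading_test = [v + random.gauss(0, 0.5) for v in verbal]
social_scale = [s + random.gauss(0, 0.5) for s in social]

convergent = pearson(vocab_test, reading_test)    # same construct: high
discriminant = pearson(vocab_test, social_scale)  # different construct: low
print(f"convergent r = {convergent:.2f}, discriminant r = {discriminant:.2f}")
```

A high convergent correlation alone is not enough; the low discriminant correlation is what rules out the alternative explanation that the "verbal" measures are really tapping something else.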

6. Consequential Aspect
  • Evaluates the social implications of test use: positive or negative impacts resulting from score interpretations.

  • Raises awareness of whether observed consequences stem from valid score meaning or from sources of invalidity.

Implications of Validity

  • Highlights that validity judgments are inherently value judgments.

  • Details how these values affect test interpretation and application, promoting awareness of both intended and unintended consequences.

Conclusion

  • Validity is a unified concept, integrating the evidential basis for interpretation and the consequence basis for social impact through action implications. It provides a comprehensive understanding of how assessments can be trusted and utilized in educational and psychological settings.