Validity

Introduction to Validity

Validity is defined as the degree to which a test measures what it claims to measure (Brown).
For instance, a reading comprehension test should measure reading skills, not grammatical ability.

Types of Validity

1. Face Validity

Face validity, though not discussed extensively here, refers to whether a test appears, at face value, to measure what it is supposed to measure.

2. Criterion-Oriented Validity

Also known as criterion-related validity, this type compares test results with an external measure (e.g., TOEFL, IELTS).
Types:
- Concurrent Validity: Compares results of the test with another established measure of the same ability administered at about the same time. (Repeating the same test checks reliability rather than concurrent validity.)
- Predictive Validity: Measures how well a test predicts future performance (e.g., TOEFL scores predicting university performance).
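
Both kinds of criterion-related evidence are commonly summarized as a correlation between scores on the test and scores on the criterion, often called a validity coefficient. The sketch below shows one way to compute it. The function name `pearson_r` and all score values are hypothetical, invented for illustration; they are not real TOEFL or IELTS data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same six test takers (illustration only).
new_test = [55, 62, 70, 74, 81, 90]    # the test being validated
criterion = [58, 60, 73, 70, 85, 88]   # the external criterion measure

validity_coefficient = pearson_r(new_test, criterion)
print(round(validity_coefficient, 2))
```

For concurrent validity the criterion scores would be collected at about the same time as the test; for predictive validity they would be collected later (e.g., first-year university grades). A high coefficient counts as evidence only to the extent that the criterion itself is trustworthy.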

3. Content Validity

Ensures a match between the test content and the curriculum taught.
Example: If a test covers grammar topics, it should encompass all relevant structures taught in the course, ensuring a comprehensive representation.
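
A rough first check of content coverage is to compare the list of structures taught with the list of structures the test actually samples. The sketch below does this with sets. The topic names and the `taught` and `tested` variables are hypothetical, and a real content review would still rely on expert judgment about how well each item represents its topic.

```python
# Hypothetical syllabus and test blueprint (illustration only).
taught = {"present simple", "past simple", "present perfect",
          "conditionals", "passive voice"}
tested = {"present simple", "past simple", "present perfect"}

missing = taught - tested        # taught but never tested
off_syllabus = tested - taught   # tested but never taught
coverage = len(taught & tested) / len(taught)  # share of taught topics tested

print("Missing from test:", sorted(missing))
print("Not in syllabus:", sorted(off_syllabus))
print("Coverage:", coverage)
```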

4. Construct Validity

Considered the most critical type of validity, with a focus on the actual concept the test is meant to measure.
It is addressed here through quizzes and questions that check understanding of validity-related concepts.

Limitations of Traditional Validity Types

Challenges with criterion validity:
- The comparison is only as sound as the criterion: if the validity of the criterion test itself (e.g., TOEFL) is questionable, so is any validity claim based on it.
Challenges with content validity:
- Judging a test’s content validity is subjective: it relies on expert opinion, which can introduce bias.

Threats to Validity

1. Construct-Irrelevant Variance

Example scenario: A reading comprehension test built on geography texts may yield high scores because of geographical knowledge rather than language proficiency.

2. Construct Underrepresentation

Example scenario: A grammar test that only includes certain tenses may not fully represent the grammar construct in a meaningful way.

Messick's Unified Validity Framework

Validity is viewed as an evaluative judgment of how well the evidence supports the inferences and actions based on test scores.
The emphasis is on the interpretations and decisions made from scores, not just the scores themselves.

Types of Evidence for Validity

Convergent Evidence

Supports the claim that the test measures what it purports to (e.g., correlations with other tests aimed at similar constructs).

Discriminant Evidence

Evidence that shows the test does not measure aspects it is not supposed to (e.g., a grammar test should not assess knowledge of geography).
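
Convergent and discriminant evidence are often examined together by comparing correlations: scores on the test should correlate strongly with a measure of a similar construct and only weakly with a measure of an unrelated one. The sketch below illustrates the pattern. The helper `pearson_r`, the variable names, and all scores are hypothetical, invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same six students (illustration only).
grammar_test = [40, 55, 60, 72, 80, 91]    # the test under study
other_grammar = [45, 50, 65, 70, 84, 88]   # measure of a similar construct
geography_quiz = [70, 52, 88, 61, 75, 58]  # measure of an unrelated construct

convergent_r = pearson_r(grammar_test, other_grammar)
discriminant_r = pearson_r(grammar_test, geography_quiz)

print("Convergent:", round(convergent_r, 2))
print("Discriminant:", round(discriminant_r, 2))
```

With these invented numbers the first correlation is high and the second is close to zero, which is the pattern one would hope to see. If the grammar test correlated strongly with the geography quiz, that would suggest it is picking up something other than grammar.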

Implications of Test Usage

Consequences of tests must be considered, such as how test results affect students’ futures and decisions made based on those scores.
Example: A test that is too difficult may affect students negatively.

Construct Validity as a Unified Theme

Construct validity includes aspects of content and criterion-related validity.
Constructs must be defined clearly so that assessment measures align with their intended goals.

Conclusion

Validity covers every aspect of an assessment that bears on whether a test accurately measures what it is designed to measure, and it is supported by multiple sources of evidence and evaluation methods.