Validity

Introduction to Validity

  • Validity is defined as the degree to which a test measures what it claims to measure (Brown).

  • For instance, a reading comprehension test should measure reading skills, not grammatical ability.

Types of Validity

1. Face Validity

  • Though not extensively discussed, face validity refers to the extent to which a test appears, at face value, to measure what it is supposed to measure.

2. Criterion-Oriented Validity

  • Also known as criterion-related validity; it involves comparing test results with an external measure (e.g., TOEFL, IELTS).

  • Types:

    • Concurrent Validity: Compares results of the test with those of another established measure administered at about the same time. (Repeating the same test is a check on reliability rather than on concurrent validity.)

    • Predictive Validity: Measures how well a test predicts future performance (e.g., TOEFL scores predicting university performance).
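
Criterion-related evidence is commonly summarized as a correlation between scores on the test and scores on the criterion. The sketch below is one minimal way to illustrate this, not a prescribed procedure: the function name `pearson_r` and all score lists are hypothetical, made-up values. The same calculation serves both cases; only the timing of the criterion differs.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)

# Hypothetical (made-up) scores for the same five test takers.
new_test = [52, 61, 70, 78, 90]          # the test being validated
criterion_now = [55, 60, 72, 80, 88]     # established measure, taken at about the same time
later_outcome = [2.4, 2.9, 3.0, 3.4, 3.8]  # outcome collected later (e.g., course grades)

concurrent_r = pearson_r(new_test, criterion_now)   # concurrent evidence
predictive_r = pearson_r(new_test, later_outcome)   # predictive evidence

print(f"concurrent r = {concurrent_r:.2f}")
print(f"predictive r = {predictive_r:.2f}")
```

A high coefficient is only as meaningful as the criterion itself, and five test takers are far too few for a real study; the numbers here only show the mechanics.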

3. Content Validity

  • Ensures a match between the test content and the curriculum taught.

  • Example: If a test covers grammar topics, it should encompass all relevant structures taught in the course, ensuring a comprehensive representation.
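
The match between test content and curriculum can be checked roughly by listing the topics taught and the topic each test item targets. The following is a minimal sketch under that assumption; the function `content_coverage`, the topic labels, and the item tagging are all hypothetical, and in practice the tagging itself relies on expert judgment.

```python
def content_coverage(curriculum_topics, item_topics):
    """Compare topics taught with the topics that test items target."""
    taught = set(curriculum_topics)
    tested = set(item_topics)
    covered = taught & tested
    return {
        # Share of taught topics that appear in at least one item.
        "coverage": len(covered) / len(taught) if taught else 0.0,
        # Taught but never tested.
        "untested": sorted(taught - tested),
        # Tested but never taught.
        "untaught": sorted(tested - taught),
    }

# Hypothetical grammar course and test (one topic label per test item).
taught = ["present simple", "past simple", "present perfect", "conditionals"]
items = ["present simple", "past simple", "past simple", "passive voice"]

report = content_coverage(taught, items)
print(report)
```

In this made-up example half of the taught structures are never tested, and one item targets a structure that was not taught, so the test would not be a comprehensive representation of the course.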

4. Construct Validity

  • Considered the most critical type of validity; it focuses on the underlying concept (the construct) that the test is meant to measure.

  • Addressed through various quizzes and questions to assess understanding of concepts related to validity.

Limitations of Traditional Validity Types

  • Challenges with criterion validity:

    • Comparing a test with a criterion test (e.g., TOEFL) is only meaningful if the criterion is itself valid, and that validity may be questionable.

  • Challenges with content validity:

    • Judging a test’s content validity is subjective; relying on experts to determine validity can introduce bias.

Threats to Validity

1. Construct-Irrelevant Variance

  • Example scenario: On a reading comprehension test built around geography texts, high scores may reflect geographical knowledge rather than language proficiency.

2. Construct Underrepresentation

  • Example scenario: A grammar test that only includes certain tenses may not fully represent the grammar construct in a meaningful way.

Messick's Unified Validity Framework

  • Validity is viewed as an evaluative judgment of how well evidence and theory support the inferences and actions based on test scores.

  • Emphasizes the judgments and decisions made from the scores, not just the scores themselves.

Types of Evidence for Validity

Convergent Evidence

  • Supports the claim that the test measures what it purports to (e.g., correlations with other tests aimed at similar constructs).

Discriminant Evidence

  • Evidence that shows the test does not measure aspects it is not supposed to (e.g., a grammar test should not assess knowledge of geography).
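
Convergent and discriminant evidence are often examined side by side by comparing correlations: a test should correlate strongly with measures of a similar construct and weakly with measures of an unrelated one. The sketch below illustrates that pattern under stated assumptions; the helper `pearson_r` and every score list are hypothetical, made-up values, not data from any real test.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical (made-up) scores for the same five students.
grammar_test = [10, 12, 15, 18, 20]      # the test under evaluation
other_grammar = [11, 13, 14, 19, 21]     # another test aimed at the same construct
geography_quiz = [14, 9, 16, 10, 13]     # a measure of an unrelated construct

convergent_r = pearson_r(grammar_test, other_grammar)
discriminant_r = pearson_r(grammar_test, geography_quiz)

print(f"convergent r   = {convergent_r:.2f}")    # expected to be high
print(f"discriminant r = {discriminant_r:.2f}")  # expected to be near zero
```

With these invented numbers the grammar test tracks the other grammar measure closely and is almost unrelated to the geography quiz, which is the pattern that would support the claim that it measures grammar and not geographical knowledge.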

Implications of Test Usage

  • Consequences of tests must be considered, such as how test results affect students’ futures and decisions made based on those scores.

  • Example: A test that is too difficult may affect students negatively.

Construct Validity as a Unified Theme

  • Includes aspects of content and criterion-related validity.

  • Constructs must be defined clearly so that assessment measures align with their intended goals.

Conclusion

  • Validity spans many aspects of assessment. Its aim is to ensure that tests accurately measure what they are designed to evaluate, and it must be supported by multiple sources of evidence and evaluation methods.