Test Validity
The extent to which a test measures what it claims to measure.
The most fundamental consideration when evaluating psychological tests.
A test may yield reliable scores _____
yet might not be a valid indicator of what it claims to measure
Validity is context-dependent
A test may be valid in one situation or population but not in others.
Validity is a unitary concept
Supported by multiple lines of evidence
Face Validity
Superficial appearance that a test measures what it claims to. Lacks evidence & is not a true form of validity.
Face Validity is still useful for _________
test taker motivation & acceptance when items appear relevant.
Content-Related Validity
Assesses how well test items represent the full scope of the construct or subject matter.
Tests have high content validity when ________
test items provide representative samples of all possible items in the relevant domain.
Primary applications of content-related validity
educational achievement tests, employment tests & medical testing
Content-related validity is judged ________
logically, not statistically, often w/ expert evaluation
Involves checking item wording, relevance & reading level
Construct underrepresentation
The test is missing important content (e.g., no geometry items on a math test)
Threat to construct-related validity
Construct irrelevant variance
scores influenced by unrelated factors (test anxiety…)
Threat to construct-related validity
What is the primary evidence of validity in achievement testing?
Content validity
Scenario: How can a psychometric theory test demonstrate content validity?
Items are based on relevant textbook chapters
Material is adequately sampled
What is the first step in establishing content validity?
Clearly defining the content to be covered.
What tool is often used to establish content validity?
A table of specifications
What framework is often used in a table of specifications for achievement tests?
Bloom’s taxonomy for cognitive domains
What must be demonstrated when a test is used for hiring or promotion?
That test items are work-related
How is the content for employment tests defined?
Through job analysis by a panel of experts specifying required knowledge and skills.
What is commonly used to match test content with job specifications?
A percentage agreement figure.
What method is used to assess item relevance in employment tests?
A three-point rating by each panelist: Essential, Useful but not essential, or Not necessary
How is the Content Validity Ratio (CVR) calculated?
CVR = (ne - N/2) / (N/2), where ne is the number of panelists rating an item as “essential” and N is the total number of panelists.
What does a higher CVR indicate?
Greater consensus that an item is essential
What is the range of the Content Validity Ratio (CVR)?
From -1.00 (no panelist says essential) through 0 (50% say essential) to +1.00 (100% say essential)
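Worked example (a minimal sketch, not from the original cards): the CVR is usually computed with Lawshe's formula, CVR = (ne - N/2) / (N/2). The function name and the panel sizes below are hypothetical and chosen only for illustration.

```python
def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
    """Return CVR = (ne - N/2) / (N/2) for a single item."""
    if n_panelists <= 0:
        raise ValueError("n_panelists must be positive")
    if not 0 <= n_essential <= n_panelists:
        raise ValueError("n_essential must be between 0 and n_panelists")
    half = n_panelists / 2
    return (n_essential - half) / half


# Hypothetical panel of 10 experts rating one item.
print(content_validity_ratio(5, 10))   # half say "essential"  -> 0.0
print(content_validity_ratio(8, 10))   # 80% say "essential"   -> 0.6
print(content_validity_ratio(10, 10))  # all say "essential"   -> 1.0
```

The more panelists who rate the item as essential, the closer the ratio moves toward 1.00.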
Criterion-Related Validity
Assesses how well test scores correlate with a specific external criterion.
The test serves as a proxy for the actual behaviour or outcome we aim to predict.
Predictive Validity
Test scores predict future performance on a relevant criterion.
Predictive Validity may be more time consuming but ________
better reflects real-world applications.
Concurrent Validity
test scores are related to some criterion measure obtained at the same point in time (i.e., concurrently).
Concurrent validity is a special case of
predictive validity with a minimal time gap.
Types of statistical evidence for criterion-related validity?
Validity coefficient and decision theory/expectancy data.
What is a validity coefficient?
A correlation showing how well a test predicts or relates to a criterion.
Which correlation coefficients are typically used to express validity coefficients?
Pearson’s r or Spearman’s rho (for ordinal data).
What does a larger correlation between test scores and criterion scores indicate?
Greater criterion-related validity.
What is the common range for an adequate validity coefficient?
.30 to .40
How common are validity coefficients above .60?
They are rare.
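Worked example (a sketch under stated assumptions): a validity coefficient can be computed as Pearson's r between test scores and criterion scores. The data below are made up purely for illustration, and `pearson_r` is a hypothetical helper name rather than a reference to any particular library.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: selection test scores and a later performance criterion.
test_scores = [10, 12, 15, 18, 20]
criterion_scores = [2.0, 2.5, 2.4, 3.4, 3.2]

validity_coefficient = pearson_r(test_scores, criterion_scores)
print(round(validity_coefficient, 2))
```

With only five made-up cases the coefficient is far higher than what real validation studies typically report; the point is only to show where the number comes from.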
Why is the Standard Error of Estimate (SEest) reported with the validity coefficient?
Because test scores are imperfect predictors of criterion scores
What does the Standard Error of Estimate (SEest) represent?
The margin of error in predicted criterion scores due to imperfect validity
What is the formula for SEest?
SEest = SDy × √(1 - rxy²)
In the SEest formula, what does SDy represent?
The standard deviation of criterion scores
In the SEest formula, what does rxy represent?
The validity coefficient
What does SEest reflect in regression analysis?
Error of prediction from the regression line
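Worked example (a minimal sketch): applying SEest = SDy × √(1 - rxy²) to hypothetical values. The function name and the numbers (SDy = 10, rxy = .60) are assumptions chosen for illustration.

```python
from math import sqrt


def standard_error_of_estimate(sd_criterion: float, validity: float) -> float:
    """SEest = SDy * sqrt(1 - rxy^2)."""
    if sd_criterion < 0:
        raise ValueError("sd_criterion must be non-negative")
    if not -1 <= validity <= 1:
        raise ValueError("validity must be a correlation between -1 and 1")
    return sd_criterion * sqrt(1 - validity ** 2)


# Hypothetical values: criterion SD of 10 and a validity coefficient of .60.
print(round(standard_error_of_estimate(10, 0.60), 2))  # about 8.0
```

Even with a validity coefficient of .60, the margin of error is still 80% of the criterion's standard deviation, which is why SEest is reported alongside the coefficient.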
What does the Standard Error of Measurement (SEM) reflect?
The margin of error in an individual’s test score due to test unreliability
What is the formula for SEM?
SEM = SD × √(1 - rxx)
In the SEM formula, what does SD represent?
The standard deviation of test scores
In the SEM formula, what does rxx represent?
The reliability coefficient
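Worked example (a minimal sketch): applying SEM = SD × √(1 - rxx) to hypothetical values. The function name and the numbers (SD = 15, rxx = .91) are assumptions chosen for illustration.

```python
from math import sqrt


def standard_error_of_measurement(sd_test: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - rxx)."""
    if sd_test < 0:
        raise ValueError("sd_test must be non-negative")
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd_test * sqrt(1 - reliability)


# Hypothetical values: test SD of 15 and a reliability coefficient of .91.
print(round(standard_error_of_measurement(15, 0.91), 2))  # about 4.5
```

Note the contrast with SEest: SEM uses the reliability coefficient (not squared) and describes error in the test score itself, not in a predicted criterion score.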
Cronbach & Meehl (1955)
Expanded the concept of validity to include both practical & theoretical dimensions.
What is a key requirement for construct validity?
The test should behave as predicted by the theory behind the construct.
What does it indicate if a test performs as theory predicts?
It strengthens confidence in both the test and the underlying theory.
What does it suggest if a test does not behave as the theory predicts?
The problem may lie with the theory or the test’s validity.
Why is construct validity important for many psychological traits?
Because they lack objective criteria, making criterion validity impractical.
How is construct validity established when no adequate criterion exists?
By defining the construct and developing/testing appropriate measures.
How is construct validity typically built?
Through multiple studies showing consistent relationships with other measures.
What does construct validity refer to?
Any evidence showing that a test measures its intended construct.
It encompasses all types of validity evidence, including content and criterion validity.
What is convergent evidence in construct validity?
When a test correlates well with other measures of the same construct.
What should valid tests show in terms of theory?
Expected theoretical relationships.
How is convergent evidence established?
Through multiple studies building a network of meaning around test scores.
What additional evidence is required to fully support construct validity?
Discriminant evidence.
What does discriminant evidence demonstrate?
That the test does not measure unrelated constructs, confirming its uniqueness.
What kind of correlations should a test have with unrelated constructs to show discriminant evidence?
Low correlations.
Why is discriminant evidence important?
It ensures the test measures something distinct and not redundant with other tests.
What does evaluating item or subtest homogeneity check for?
Whether the test measures a single construct.
How can developmental changes support construct validity?
If score changes across development align with theoretical expectations.
What does correlating test scores with related and unrelated measures assess?
Convergent and discriminant validity.
How can group differences support construct validity?
If score differences across groups match theoretical predictions.
What is the purpose of factor analysis in construct validity?
To examine the internal structure of the test.
Why analyze classification accuracy of test scores?
To see if scores allow proper identification or categorization of examinees.
How do intervention effects relate to construct validity?
If interventions produce expected score changes, this supports the test’s validity.
What limits the maximum possible validity of a test?
The square root of the product of the reliabilities of the two measures.
How strong can a test's correlation with another variable be?
No stronger than the square root of its reliability; a test that does not correlate well with itself cannot correlate well with anything else.
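Worked example (a minimal sketch): the ceiling on a validity coefficient is the square root of the product of the two reliabilities, √(rxx × ryy). The function name and the reliabilities below (.81 for the test, .64 for the criterion) are hypothetical.

```python
from math import sqrt


def max_validity(test_reliability: float, criterion_reliability: float) -> float:
    """Upper bound on the test-criterion correlation: sqrt(rxx * ryy)."""
    for r in (test_reliability, criterion_reliability):
        if not 0 <= r <= 1:
            raise ValueError("reliabilities must be between 0 and 1")
    return sqrt(test_reliability * criterion_reliability)


# Hypothetical reliabilities: .81 for the test, .64 for the criterion measure.
print(round(max_validity(0.81, 0.64), 2))  # about 0.72
```

An unreliable criterion lowers the ceiling just as an unreliable test does, so a modest validity coefficient can reflect measurement error in either measure.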
Who shares responsibility for test validation?
Both the test developer and the test user.
What is the test developer's responsibility in validation?
To provide evidence and rationale for the test's intended use.
What is the test user's responsibility in validation?
To evaluate the evidence in the particular setting in which the test is to be used.