VALIDITY
concerns what the test measures and how well it does so.
Once we have established that a test is reliable, we must show that it is also valid, that it measures what it is intended to measure.
not a matter of “is this test valid or not?”
but “is the test valid for this particular purpose, in this particular situation, with these particular subjects?”
Whether a test is or is not valid depends in part on the specific purpose for which it is used.
CONTENT VALIDITY
built into a test from the outset through the choice of appropriate items.
Messick (1989) considers content validity to have two aspects:
content representativeness
content relevance
The issue of content validity is one whose answer lies partly in expert skill and partly in individual preference.
Content Validation Form: used to ensure that all content is accurate and relevant
Taxonomies
Achieving content validity can be helped by having a careful plan of test construction, much like a blueprint is necessary to construct a house.
Decisions about how much of the test to devote to each aspect might be based on the relative importance of each aspect, might reflect the judgment of experts, or might be a fairly subjective decision.
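The blueprint idea can be sketched as a small item-allocation routine. This is only an illustration: the content areas, their importance weights, and the function name `allocate_items` are hypothetical and do not come from any published procedure.

```python
def allocate_items(weights, total_items):
    """Split total_items across content areas in proportion to their weights.

    Uses largest-remainder rounding so the counts always sum to total_items.
    """
    weight_sum = sum(weights.values())
    # Exact (possibly fractional) share of items for each content area.
    exact = {area: total_items * w / weight_sum for area, w in weights.items()}
    # Start from the whole-number part of each share.
    counts = {area: int(share) for area, share in exact.items()}
    leftover = total_items - sum(counts.values())
    # Give the leftover items to the areas with the largest fractional parts.
    by_fraction = sorted(exact, key=lambda a: exact[a] - counts[a], reverse=True)
    for area in by_fraction[:leftover]:
        counts[area] += 1
    return counts


# Hypothetical blueprint: importance weights assigned by judges to four areas.
blueprint = {"area_a": 4, "area_b": 2, "area_c": 1, "area_d": 3}
print(allocate_items(blueprint, 40))
```

The weights themselves remain a matter of judgment; the routine only makes the chosen weighting explicit and keeps the item counts consistent with it.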
CRITERION VALIDITY
Criterion of intelligence
If a test is said to measure intelligence, we must show that scores on the test parallel, or are highly correlated with, intelligence as measured in some other way.
Criteria
Contrasted groups
Groups that differ significantly on the particular domain.
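A contrasted-groups check can be sketched by comparing test scores across two groups known to differ on the domain. The example below is a minimal sketch with made-up scores; the function name `contrasted_groups` is hypothetical, and Welch's t and Cohen's d are just one common way to summarize the difference.

```python
from math import sqrt
from statistics import mean, variance


def contrasted_groups(high, low):
    """Return (Welch's t, Cohen's d) for the difference between two groups."""
    m1, m2 = mean(high), mean(low)
    v1, v2 = variance(high), variance(low)  # sample variances (n - 1)
    n1, n2 = len(high), len(low)
    # Welch's t does not assume the two groups have equal variances.
    welch_t = (m1 - m2) / sqrt(v1 / n1 + v2 / n2)
    # Cohen's d expresses the mean difference in pooled-SD units.
    pooled_sd = sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    cohens_d = (m1 - m2) / pooled_sd
    return welch_t, cohens_d


# Made-up test scores for groups expected to score high and low.
high_group = [31, 28, 35, 30, 33, 29]
low_group = [22, 25, 19, 24, 21, 23]
t_stat, d = contrasted_groups(high_group, low_group)
print(f"Welch t = {t_stat:.2f}, Cohen's d = {d:.2f}")
```

A large positive difference is consistent with the test separating the groups as intended; it does not by itself establish validity.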
TYPES
1 Predictive Validity
extent to which a score on a scale or test predicts scores on some criterion measure in the future
assesses how well a test forecasts outcomes based on its relationship with future performance
2 Concurrent Validity
extent to which test scores correspond to a criterion measure taken at the same time
examines how well a test correlates with a well-established measure of the same construct.
In many instances, concurrent validation is employed merely as a substitute for predictive validation.
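Both types are usually summarized as a correlation between test scores and the criterion, often called a validity coefficient. The sketch below computes a Pearson correlation from scratch; the data are made up, and the only difference between the predictive and concurrent cases is when the criterion was collected.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up data: test scores, and a criterion measured later (predictive)
# or at the same time (concurrent).
test_scores = [52, 61, 47, 70, 58, 66]
criterion = [2.8, 3.1, 2.5, 3.7, 3.0, 3.2]
print(f"validity coefficient r = {pearson_r(test_scores, criterion):.2f}")
```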
Criterion Contamination
an essential precaution in finding the validity of a test is:
to make certain that the test scores do not themselves influence any individual’s criterion status.
possible source of error in test validation
since the criterion ratings become “contaminated” by the rater’s knowledge of the test scores
CONSTRUCT VALIDITY
Constructs
broad categories derived from the common features shared by directly observable behavioral variables.
They are theoretical entities, not themselves directly observable.
Interest in constructs led to the introduction of Construct Validity.
Construct validity
an umbrella term that encompasses any information about a particular test
both content and criterion validity can be subsumed under this broad term
Content validity involves the extent to which items represent the content domain.
Criterion validity focuses on the difference between contrasted groups such as high and low performers.
5 Major Methods for Assessing Construct Validity
Cronbach and Meehl (1955)
Group differences
Correlation and its derivative, factor analysis
Internal consistency
Test-retest reliability, or more generally, studies of change over occasions
Studies of process
TYPES
1 Convergent validity
show that a particular test correlates highly with variables that, on the basis of theory, it ought to correlate with
2 Discriminant validity
show that the test does not correlate significantly with variables that it ought not to correlate with
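The convergent and discriminant pattern can be sketched by correlating a new test with one measure it should relate to and one it should not. Everything here is illustrative: the measure names, the scores, and the helper `validity_pattern` are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def validity_pattern(new_test, related, unrelated):
    """Correlate new_test with theoretically related and unrelated measures."""
    convergent = {name: pearson_r(new_test, s) for name, s in related.items()}
    discriminant = {name: pearson_r(new_test, s) for name, s in unrelated.items()}
    return convergent, discriminant


# Made-up scores for six examinees.
new_scale = [10, 12, 15, 18, 20, 25]
related = {"established_scale": [11, 13, 14, 19, 21, 24]}
unrelated = {"irrelevant_measure": [7, 3, 8, 2, 9, 5]}

conv, disc = validity_pattern(new_scale, related, unrelated)
print("convergent:", {k: round(v, 2) for k, v in conv.items()})
print("discriminant:", {k: round(v, 2) for k, v in disc.items()})
```

The expected pattern is high correlations in the first dictionary and near-zero correlations in the second; what counts as "high" or "near zero" depends on the construct and the measures involved.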
OTHER ASPECTS
Face validity
refers to whether a test “looks like” it is measuring the pertinent variable
related to client rapport and cooperation, because ordinarily a test that looks valid will be considered by the client more appropriate, and therefore taken more seriously, than one that does not
Differential validity
studies sometimes obtain different results with the same test not necessarily because the test is invalid, but because there is differential validity in different populations.
Meta-analysis
consists of a number of statistical analyses designed to empirically assess the findings from various studies on the same topic.
Validity generalization
where correlation coefficients across studies are combined and statistically corrected for such aspects as unreliability, sampling error, and restriction in range
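The combining and correcting steps can be sketched with three standard formulas: a sample-size-weighted mean correlation (which dampens sampling error), the correction for attenuation due to unreliability, and a correction for direct range restriction. This is a bare-bones illustration, not a complete validity generalization procedure; the study values, reliabilities, and SD ratio are made up.

```python
from math import sqrt


def weighted_mean_r(studies):
    """Sample-size-weighted mean of correlations; studies = [(r, n), ...]."""
    total_n = sum(n for _, n in studies)
    return sum(r * n for r, n in studies) / total_n


def correct_for_unreliability(r, rel_test, rel_criterion):
    """Correction for attenuation: r / sqrt(r_xx * r_yy)."""
    return r / sqrt(rel_test * rel_criterion)


def correct_for_range_restriction(r, sd_ratio):
    """Direct range restriction; sd_ratio = unrestricted SD / restricted SD."""
    return r * sd_ratio / sqrt(1 + r * r * (sd_ratio ** 2 - 1))


# Made-up validity coefficients and sample sizes from three studies.
studies = [(0.20, 100), (0.40, 300), (0.30, 200)]
r_bar = weighted_mean_r(studies)
# Assumed reliabilities (0.80 test, 0.70 criterion) and SD ratio (1.5).
r_rel = correct_for_unreliability(r_bar, 0.80, 0.70)
r_full = correct_for_range_restriction(r_rel, 1.5)
print(f"weighted mean r = {r_bar:.3f}")
print(f"after unreliability correction = {r_rel:.3f}")
print(f"after range-restriction correction = {r_full:.3f}")
```

The order and choice of corrections vary across methods; the point of the sketch is only that each correction raises the observed coefficient toward an estimate of the underlying relationship.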