3 test construction strategies
1)correspondence (content/rational)
2)empirical (criterion-group)
3)construct (theoretical)
correspondence strategy (hist context) - 3
dominant in early 20th century
Used primarily in self-report measures, but also applied to interviews and behavioral tests.
Over time, it became less dominant as limitations became clear.
correspondence strategy (core assumptions - 2)
1)correspondence strategy as intuitive and atheoretical approach: Items were created based on face-value relevance to the construct (e.g., extraversion, anxiety).
Ex. A test of sleep problems asks, 'Do you experience sleep problems?' and the response is taken as a direct indicator of how many sleep problems the person is experiencing.
2)One-to-one correspondence assumption: A person’s verbal report was taken as a direct reflection of their internal state.
Example: If someone endorses “I often feel anxious when speaking in public,” it was assumed to directly measure their anxiety level.
4 key assumptions of correspondence strategy
1) Direct mapping of items to constructs: Each test item corresponds to a specific psychological construct.
No complex interpretation needed—what you ask is exactly what you measure.
2)Shared meaning of items
All test-takers interpret the item in the same way. (item has common meaning for all test takers)
Test developers and test-takers share the same understanding of the question.
3)Accurate self-assessment: Test-takers are capable of reflecting on their internal states accurately.
They can recognize and report their own experiences (e.g., anxiety, extraversion).
4)Honest responding
Test-takers will truthfully report their experiences.
Assumes no distortion from social desirability, impression management, or other biases.
3 limitations of correspondence strategy
Response bias: People may underreport or overreport due to social desirability or self-presentation concerns.
Interpretation differences: Not all test-takers interpret items the same way (cultural, linguistic, or personal differences).
Over-simplification: Ignores the complexity of psychological processes—assumes self-report is always accurate.
can the assumptions be met in correspondence strategy? - 3
Can these assumptions realistically be met?
In practice, no, because people vary in honesty, self-awareness, and interpretation.
This realization led to the development of more sophisticated test construction strategies (e.g., empirical and factor-analytic approaches).
empirical strategy (hist context) - 2
Emerged in the 1940s, gaining traction after the limitations of the correspondence approach became clear.
Most famously applied in the Minnesota Multiphasic Personality Inventory (MMPI), which remains widely used today.
empirical strategy (core principles) - 3
Item selection is based on the ability to discriminate between 2 groups (data, not theory; systematic item selection).
Items are chosen because they statistically differentiate between groups (e.g., depressed vs. non-depressed: one group endorses the items more than the other group).
The actual content of the item is less important than whether it separates groups reliably.
External criteria drive selection.
Known groups (clinical vs. non-clinical) are compared.
Items endorsed more by one group than the other are retained.
Items endorsed equally by both groups are discarded.
Meaning is not tied to verbal content of item (aka literal wording of item)
Example: Endorsing “I avoid stepping on sidewalk cracks” doesn’t literally mean the person avoids cracks.
Instead, the meaning comes from the pattern of endorsement by the group (e.g., if people with OCD endorse it more often, it becomes an OCD-related item).
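The retain/discard rule above can be sketched in a few lines of Python. This is only an illustration: the response data, the `select_items` helper, and the `min_gap` threshold are hypothetical, and a real criterion-group study would rely on significance tests and cross-validation rather than a raw difference in endorsement rates.

```python
# Hypothetical data: each row is one respondent's answers (1 = endorsed, 0 = not)
# to four candidate items.
clinical = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
]
comparison = [
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [1, 1, 1, 0],
    [0, 1, 0, 0],
]


def endorsement_rates(group):
    """Proportion of respondents in the group who endorsed each item."""
    n_people = len(group)
    n_items = len(group[0])
    return [sum(person[i] for person in group) / n_people for i in range(n_items)]


def select_items(group_a, group_b, min_gap=0.4):
    """Keep the indices of items whose endorsement rates differ by at least min_gap.

    min_gap is an illustrative cutoff, not a published standard.
    """
    rates_a = endorsement_rates(group_a)
    rates_b = endorsement_rates(group_b)
    return [
        i
        for i, (a, b) in enumerate(zip(rates_a, rates_b))
        if abs(a - b) >= min_gap
    ]


kept = select_items(clinical, comparison)
print(kept)  # indices of the retained items
```

Note that the function never looks at item wording: items 1 and 2 are dropped only because both groups endorse them at the same rate.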
Interpretation of scores for empirical strategy
interpretation of scores by cookbook: empirically known correlates of high and low scores
“Cookbook” interpretation
Once scales are built, interpretation relies on empirical correlations.
A person’s score is compared to established norms and known correlates of high/low scores.
There is no deeper theoretical meaning behind why a particular item is included.
Ex: A high score on the Psychasthenia (Pt) scale of the MMPI is empirically linked to anxiety/OCD symptoms, even if the items themselves don’t explicitly mention anxiety.
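Comparing a score to established norms can be illustrated with a standardized score. The sketch below uses the common T-score convention (mean 50, SD 10 in the norm group). The norm values, the cutoffs, and the `t_score` and `score_band` helpers are hypothetical placeholders, not actual MMPI norms or interpretive rules.

```python
def t_score(raw, norm_mean, norm_sd):
    """Convert a raw scale score to a T-score (mean 50, SD 10 in the norm group)."""
    z = (raw - norm_mean) / norm_sd
    return 50 + 10 * z


def score_band(t, high_cut=65, low_cut=35):
    """Place a T-score in a band; the cutoffs here are illustrative assumptions."""
    if t >= high_cut:
        return "high"
    if t <= low_cut:
        return "low"
    return "average"


# Hypothetical norm group: mean raw score 20, SD 5.
t = t_score(raw=30, norm_mean=20, norm_sd=5)
band = score_band(t)
print(t, band)
```

In a cookbook approach, the band would then be looked up against the empirically known correlates of high or low scores on that scale.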
advantages of empirical strategy
Objective and data-driven: Reduces reliance on test developers’ assumptions or biases.
Practical utility: Produces scales that can reliably distinguish between groups.
Durability: The MMPI, built on this strategy, is still in use decades later.
4 Concerns with the Empirical Strategy
1)The groups compared must truly differ on the construct.
2)Other unintended differences can contaminate results.
3)Findings may not generalize beyond the original sample.
4)Items may overlap, reducing the richness of the scale.
1. Are contrasting groups really different as intended?
The empirical method depends on comparing two or more groups (e.g., depressed vs. non-depressed).
If the groups are not truly distinct on the construct of interest, the items selected may not actually measure what they’re supposed to.
Example: If the “depressed” group includes people with mixed conditions, the items may reflect unrelated traits instead of depression.
2. Unintended group differences
Groups may differ on variables other than the target construct (e.g., age, education, culture).
Items may end up reflecting these unintended differences rather than the intended psychological trait.
Example: If one group is older, items about physical health might differentiate groups—but not because of depression, rather because of age.
3. Problem of generalization
Items that differentiate one set of groups may not work the same way in other populations.
What “works” empirically in one sample may not generalize across cultures, demographics, or contexts.
This limits the external validity of the test.
4. Item overlap
Items selected empirically may overlap in what they measure, leading to redundancy.
This can inflate reliability artificially without adding new information.
Example: Multiple items may all indirectly tap into fatigue, so the scale looks consistent but is narrow in scope.
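The item-overlap concern can be shown with Cronbach's alpha, a common internal-consistency statistic (using it here is a choice made for illustration; the notes do not name a specific reliability index). In this sketch, with made-up response data, adding two exact copies of one item raises alpha even though no new information is added.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents x 3 items.
responses = [
    [1, 2, 3],
    [2, 1, 4],
    [3, 4, 2],
    [4, 3, 5],
    [5, 5, 4],
]

# Same respondents, but the first item now appears three times (pure redundancy).
padded = [[row[0], row[0]] + row for row in responses]

alpha_original = cronbach_alpha(responses)
alpha_redundant = cronbach_alpha(padded)
print(round(alpha_original, 2), round(alpha_redundant, 2))
```

The padded scale looks more consistent, but it is narrower in scope: most of its score now reflects a single item.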
construct (theoretical) strategy
Construct (Theoretical) Strategy (hist) - 2
Origin: Developed in the 1950s; widely used from the 1960s to today.
Focus: Measuring hypothetical constructs (e.g., sociability, extroversion).
Construct (Theoretical) Strategy Approach (3)
Define the construct theoretically.
Develop test items that cover the full domain of the construct.
Combine items into a score representing the construct.
Core Assumptions of construct strategy
Universality: Everyone possesses some degree of the construct.
Behavioral Indicators: Non-test behaviors can serve as external referents (e.g., extroverts join clubs, go out with friends).
Convergent Validity: Test scores should correlate with other established measures of the same construct.
Discriminant Validity: Test scores should not correlate strongly with measures of unrelated constructs (e.g., extroversion ≠ dominance or anxiety).
Differentiation: A valid test should distinguish between groups:
High vs. low levels of the construct
Moderate vs. extreme levels
Related but distinct groups (e.g., extroverts vs. introverts, socially anxious, socially avoidant, socially withdrawn, etc.)
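Convergent and discriminant validity are usually checked with correlations. The sketch below computes Pearson's r by hand on made-up scores: a new extraversion scale should correlate highly with an established extraversion measure and only weakly with an unrelated anxiety measure. All scores and variable names are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for the same six respondents.
new_extraversion = [10, 12, 15, 18, 20, 25]
established_extraversion = [11, 13, 14, 19, 21, 24]
anxiety = [14, 9, 17, 10, 16, 12]

convergent_r = pearson_r(new_extraversion, established_extraversion)
discriminant_r = pearson_r(new_extraversion, anxiety)
print(round(convergent_r, 2), round(discriminant_r, 2))
```

A high convergent correlation together with a near-zero discriminant correlation is the pattern the construct strategy expects; real validation would use far larger samples and several comparison measures.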
Validation & Evidence (3) of construct strategy
Reliability: Internal consistency and stability of the test.
Construct Validity:
Evidence that the test measures the intended construct.
Evidence that the test does not measure unrelated constructs.
Nomological Net: The test should fit within the broader theoretical framework of how the construct relates to other attributes.
*implications of construct strategy
Validity evidence must be both empirical (statistical correlations, reliability) and logical (theoretical justification).
A test is considered adequate if:
It is reliable.
It demonstrates construct validity.
It aligns with theoretical expectations in the nomological net.
(typical) steps in theoretical scale construction approach (9)
phase 1—item development: step 1)identification of domain and item generation
step 2)content validity
phase 2—scale development
step 3)pre-testing questions
step 4)survey administration
step 5)item reduction
step 6)extraction of factors (factor analysis)
phase 3—scale evaluation
step 7)test of dimensionality
step 8)test of reliability
step 9)tests of validity
domain identification (step 1 pt 1)
item generation (step 1 pt 2)
tips for item development
step 2)content validity
step 3 pre-testing items
cognitive interviews
advantages of cognitive interviews
step 5)item reduction
step 6)extraction of factors (factor analysis)
step 7)test of dimensionality
step 8)test of reliability
step 9)tests of validity