Norm-Referenced Tests
Compares an individual’s performance to a larger, representative group to determine relative standing.
Outcome: percentile rank or comparison to the normal distribution.
Examples: standardized tests like the SAT, GRE, or ACT.
Strengths: good for high-stakes decisions that require ranking, such as college admissions decisions
Limitations: a person’s score can change based on the performance of the norm group, even if their answers don’t change (see the sketch below).
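A minimal Python/NumPy sketch (all scores and norm groups invented for illustration) of the percentile-rank outcome and of the limitation above: the same raw score earns a different percentile when the norm group changes.

```python
import numpy as np

def percentile_rank(score, norm_group):
    """Percent of the norm group scoring below the given raw score."""
    norm_group = np.asarray(norm_group)
    return 100.0 * np.mean(norm_group < score)

# Invented norm groups, purely for illustration.
weaker_norm_group   = [45, 50, 52, 55, 58, 60, 62, 65, 70, 72]
stronger_norm_group = [60, 65, 68, 70, 72, 75, 78, 80, 85, 90]

# The same raw score of 68 yields a different relative standing
# depending on which norm group it is compared against.
print(percentile_rank(68, weaker_norm_group))    # 80.0
print(percentile_rank(68, stronger_norm_group))  # 20.0
```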
Criterion-Referenced Tests
Measures an individual’s performance against a fixed set of standards or criteria to determine personal mastery of specific skills or knowledge.
Outcome: a score indicating mastery of the criterion (e.g., pass/fail); see the sketch after this list.
Examples: class quizzes, driving tests, competency exams
Strengths: good for determining which specific skills have been mastered, which can inform targeted instruction
Limitations: Does not provide information about how the individual compares to others.
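For contrast with the norm-referenced sketch above, a tiny Python illustration of a criterion-referenced (pass/fail) outcome. The 80% cut score is a hypothetical example criterion, not a standard value.

```python
# Criterion-referenced scoring sketch: mastery depends only on the fixed
# criterion, not on how anyone else performed. The cut score is invented.
CUT_SCORE = 0.80

def mastered(proportion_correct: float) -> bool:
    return proportion_correct >= CUT_SCORE

print(mastered(0.85))  # True  -> pass
print(mastered(0.72))  # False -> fail
```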
Reliability
Consistency of scores
Forms of Reliability
Test-retest reliability
Alternative-forms reliability
Two equivalent forms of the test are given to the same people (or, in the split-half variant, the test is split into halves); the results should be similar because both forms are testing for the same thing.
Inter-rater reliability
Two raters administer or score the same test for the same client; their scores should agree.
Internal consistency reliability
Items within the same test should correlate with one another (see the sketch after this list).
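A Python/NumPy sketch (invented response data) of how two of these forms are commonly estimated: alternate-forms or test-retest reliability as a correlation between totals from two administrations, and internal consistency as Cronbach's alpha.

```python
import numpy as np

# Hypothetical item-response matrix: 6 respondents x 4 items, scored 1-5.
# All numbers are invented for illustration.
items = np.array([
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [1, 2, 2, 1],
    [4, 4, 5, 4],
])

# Alternate-forms / test-retest style evidence: correlate total scores from
# two administrations (or two equivalent forms) given to the same people.
form_a = items.sum(axis=1)
form_b = form_a + np.array([0, 1, -1, 0, 1, 0])  # invented second set of totals
r_forms = np.corrcoef(form_a, form_b)[0, 1]

# Internal consistency (Cronbach's alpha): do items on the same test
# correlate with one another?
# alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))
k = items.shape[1]
sum_item_var = items.var(axis=0, ddof=1).sum()
total_var = items.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - sum_item_var / total_var)

print(f"alternate-forms r = {r_forms:.2f}, Cronbach's alpha = {alpha:.2f}")
```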
Reliability Coefficient
The degree of consistency in the measurement of test scores. Common interpretation bands (see the helper after this list):
.00 to .59 = very low
.60 to .69 = low
.70 to .79 = moderate
.80 to .89 = good
.90 to .99 = excellent
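A small Python helper that simply encodes the interpretation bands above; the thresholds come from the list, and the function itself is only an illustration.

```python
def interpret_reliability(r: float) -> str:
    """Map a reliability coefficient onto the bands listed above."""
    if r >= 0.90:
        return "excellent"
    if r >= 0.80:
        return "good"
    if r >= 0.70:
        return "moderate"
    if r >= 0.60:
        return "low"
    return "very low"

print(interpret_reliability(0.84))  # good
print(interpret_reliability(0.55))  # very low
```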
Validity
The degree to which evidence and theory support the interpretations of test scores entailed by the proposed uses of a test.
1. Do the test scores measure what the test is supposed to measure?
2. Is there evidence to support the way that the test scores are being used?
Forms of Construct Validity
Internal structure
Associations with other variables
Consequences of use
Test content
Response processes
Internal Structure
Do the patterns of correlation among test items fall as expected based on theory?
Measurement invariance/equivalence (MI/ME)
Measurement Invariance
The property of a scale or measure that indicates it produces the same results across different groups or conditions, meaning the measure is working the same way for everyone.
Association with Other Variables
Convergent and Divergent Validity
Positively related to measures of similar constructs (convergent evidence)
Negatively related to measures of opposite constructs (convergent evidence)
Unrelated to measures of distinct constructs (divergent/discriminant evidence); see the sketch after this list
Concurrent validity (with similar measures)
Predictive validity
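A Python/NumPy sketch of the expected correlation pattern, using simulated data; the "new anxiety scale" and comparison measures are hypothetical, chosen only to show positive, negative, and near-zero associations.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Invented scores: a new anxiety scale plus three comparison measures.
new_scale = rng.normal(size=n)
similar_measure   = 0.7 * new_scale + rng.normal(scale=0.7, size=n)   # similar construct
opposite_measure  = -0.6 * new_scale + rng.normal(scale=0.8, size=n)  # opposite construct
unrelated_measure = rng.normal(size=n)                                # distinct construct

def r(a, b):
    """Pearson correlation between two score vectors."""
    return np.corrcoef(a, b)[0, 1]

print(f"similar construct:  r = {r(new_scale, similar_measure):+.2f}  (expect positive)")
print(f"opposite construct: r = {r(new_scale, opposite_measure):+.2f}  (expect negative)")
print(f"unrelated variable: r = {r(new_scale, unrelated_measure):+.2f}  (expect near zero)")
```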
Sensitivity
The test’s ability to correctly identify people with a disease
Specificity
The test’s ability to correctly identify people without the disease
Sensitivity and Specificity
Describe attributes of a screening test relative to a reference standard.
Interpretation
Sensitivity + specificity ≥ 1.5
Sensitivity ≥ .80; specificity ≥ .70
Example (worked through in the sketch below):
Sensitivity = 0.8 (true positives = 80%; false negatives = 20%)
Specificity = 0.8 (true negatives = 80%; false positives = 20%)
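A short Python sketch of that example as a 2x2 table against a reference standard, checking both interpretation rules; the assumption of 100 people in each group is made only to turn the percentages into counts.

```python
# Counts per 100 people in each group (assumed for illustration).
true_pos, false_neg = 80, 20   # people WITH the condition: flagged vs. missed
true_neg, false_pos = 80, 20   # people WITHOUT the condition: cleared vs. falsely flagged

sensitivity = true_pos / (true_pos + false_neg)   # 0.80
specificity = true_neg / (true_neg + false_pos)   # 0.80

# Check the interpretation rules listed above.
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
print("sensitivity + specificity >= 1.5 ->", sensitivity + specificity >= 1.5)
print("sensitivity >= .80 and specificity >= .70 ->",
      sensitivity >= .80 and specificity >= .70)
```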
Types of Validity
Content validity
Does the test content cover the full domain it is supposed to measure?
Criterion-related validity
Correlation between test scores and an external criterion measure
Construct validity
Does the test measure the theoretical construct it is based on?
Can We Have Validity Without Reliability?
No; a test cannot measure something accurately (validly) unless it also measures it consistently, so reliability is a prerequisite for validity.