Definitions, facts, general knowledge
Equal-appearing interval method
A process to assess the construct validity of a test by:
having multiple, independent raters arrange items from least to most representative of the construct on a Thurstone scale.
calculating the mean and standard deviation of each item's Thurstone-scale placements across raters.
using multivariate statistics to compare the means and standard deviations; if the differences are significant, construct validity is not supported.
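The mean/SD step above can be sketched with hypothetical ratings (item names and rating values are invented for illustration):

```python
from statistics import mean, stdev

# Hypothetical data: each list holds five independent raters' placements of
# one candidate item on an 11-point Thurstone scale (1 = least, 11 = most
# representative of the construct).
ratings = {
    "item_a": [2, 3, 2, 3, 2],    # raters agree: small SD
    "item_b": [6, 5, 7, 6, 6],
    "item_c": [3, 9, 1, 10, 2],   # raters disagree: large SD, a red flag
}

for item, scores in ratings.items():
    print(item, "mean =", round(mean(scores), 1), "SD =", round(stdev(scores), 1))
```

Items whose placements vary widely across raters (like item_c here) signal disagreement about what the item represents.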
Sample heterogeneity
Variability within the sample; it increases external validity (generalizability) and inflates reliability and validity coefficients, whereas restriction of range attenuates them.
Item characteristic curve
A graphical representation of the relationship between ability level and the probability of correctly answering an item on a test, used in item response theory to evaluate the properties of test items.
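A minimal sketch of one common ICC form, the two-parameter logistic model from item response theory (the parameter values here are illustrative, not from any real test):

```python
import math

def icc(theta, a=1.0, b=0.0):
    """Two-parameter logistic ICC: probability of answering the item
    correctly at ability theta, given discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# At theta equal to the difficulty b, the probability is exactly 0.5,
# and it rises monotonically with ability.
print(round(icc(0.0, a=1.2, b=0.0), 2))  # 0.5
print(round(icc(2.0, a=1.2, b=0.0), 2))  # 0.92
```

The discrimination parameter a sets the slope of the curve at its midpoint; steeper curves separate adjacent ability levels more sharply.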
Item-total correlation
A Pearson r correlation between examinees’ scores on a single test item and their total test scores; higher values indicate the item discriminates between high- and low-scoring examinees in the same way the overall test does.
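A sketch of a corrected item-total correlation on invented dichotomous data; the correction excludes the item from the total so the item does not correlate with itself:

```python
from statistics import mean

def pearson(x, y):
    """Pearson r between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# Hypothetical scored responses: rows = examinees, columns = items (1 = correct).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
totals = [sum(row) for row in responses]
item1 = [row[0] for row in responses]
rest = [t - i for t, i in zip(totals, item1)]  # total minus the item itself
print(round(pearson(item1, rest), 2))  # 0.51
```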
Floor effect
A situation where a substantial number of test-takers achieve the lowest possible score, limiting the measurement of variability in the lower range of ability.
Ceiling effect
A situation where a test is too easy, resulting in a substantial number of test-takers achieving the highest possible score, limiting the measurement of variability in the upper range of ability.
rtt
Test-retest reliability: the proportion of observed-score variance attributable to true-score variance across repeated administrations of the same test.
Item discrimination index
A measure of how well a test item differentiates between test-takers who have a higher ability level and those who have a lower ability level. Higher number = better discrimination.
Computed as the difference between the proportions of high-scoring and low-scoring test-takers who answered the item correctly.
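That computation is a one-liner; the group sizes and correct counts below are hypothetical:

```python
def discrimination_index(upper_correct, upper_n, lower_correct, lower_n):
    """D = proportion correct in the high-scoring group minus the
    proportion correct in the low-scoring group (ranges -1.0 to +1.0)."""
    return upper_correct / upper_n - lower_correct / lower_n

# Hypothetical item: 24 of the 30 top scorers answered correctly,
# but only 9 of the 30 bottom scorers did.
print(round(discrimination_index(24, 30, 9, 30), 2))  # 0.5
```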
Uniform bias
A form of differential item functioning in which an item favors one group over another by a constant amount at every ability level.
The groups' item characteristic curves (ICCs) have identical slopes across ability levels but different intercepts.
Scalar invariance
“Strong” invariance; when a factor solution being tested across different groups contains:
the same number of factors and principal components
the same simple structure standard pattern
the same factor loadings
the same average score (intercept) for any given variable between groups
Enables tests of correlations and factor-level differences between groups.
Temporal stability
Refers to the consistency of an examinee’s scores across multiple administrations of the same test over time.
Strict invariance
When a factor solution being tested across different groups contains:
the same number of factors and principal components
the same simple structure standard pattern
the same factor loadings
the same average score (intercept) for any given variable between groups
the same amount of error (uniqueness values) across groups.
Enables tests of correlations and differences at the item- and factor-level.
Differential reliability
Differences in the internal consistency (α) of a test's scores across conditions or groups, used as a quantification of bias.
rxx
Reliability coefficient (generic): the proportion of obtained-score variance attributable to true-score variance; ranges from 0.00 to 1.00 and is ideally >0.80.
rxx must be >0.90 for clinical decisions.
Differential cost of errors
The difference in the impact or consequences of making Type I (false positive) versus Type II (false negative) errors in dichotomous testing conditions.
Criterion problem
The difficulty of finding a reliable, valid criterion: unreliability in the criterion attenuates the observed validity of a measure, making the test appear less valid than it actually is.
KR20
A measure of internal consistency (reliability) for tests composed of dichotomous items, equivalent to the mean of all possible split-half reliability coefficients of the test.
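Computationally, KR-20 is (k/(k−1)) × (1 − Σpq/σ²), where p and q are each item's proportions correct and incorrect and σ² is the variance of total scores; a sketch on invented dichotomous data:

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 for dichotomously scored items.
    responses: rows = examinees, columns = items (1 = correct, 0 = incorrect)."""
    k = len(responses[0])
    n = len(responses)
    totals = [sum(row) for row in responses]
    # Sum of p*q over items (the variance of each dichotomous item).
    pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / pvariance(totals))

responses = [  # hypothetical: 5 examinees x 4 items
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 2))  # 0.8
```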
Predictive validity
The extent to which a test accurately forecasts outcomes or behaviors based on scores; the test criterion is based in the future.
Concurrent validity
The extent to which test scores correlate with scores from other established measures at the same time; the test criterion is based in the present.
Incremental validity
The extent to which a test improves the prediction of a criterion over and above existing measures or information; the test adds unique predictive value beyond what is already available.
1 - rxx
Error: the proportion of obtained-score variance not attributable to true-score variance; ranges from 0.00 to 1.00 and is ideally <0.20.
Oblique
Referring to rotation in an exploratory factor analysis, when the axes being rotated are not kept at 90°, resulting in factors that are correlated with one another.