Definitions, facts, general knowledge
Equal-appearing interval method
A process to assess the construct validity of a test by:
having multiple, independent raters arrange items from least to most representative of the construct on a Thurstone scale.
calculating the mean and standard deviation of each item's Thurstone-scale placements across raters.
using multivariate statistics to compare the means and standard deviations; if the differences are significant, construct validity is not supported.
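The mean/SD step above can be sketched with hypothetical ratings (item names and rating values are invented for illustration):

```python
from statistics import mean, stdev

# Hypothetical data: each list holds five independent raters' placements of
# one candidate item on an 11-point Thurstone scale (1 = least, 11 = most
# representative of the construct).
ratings = {
    "item_a": [2, 3, 2, 3, 2],    # raters agree: small SD
    "item_b": [6, 5, 7, 6, 6],
    "item_c": [3, 9, 1, 10, 2],   # raters disagree: large SD, a red flag
}

for item, scores in ratings.items():
    print(item, "mean =", round(mean(scores), 1), "SD =", round(stdev(scores), 1))
```

Items whose placements vary widely across raters (like item_c here) signal disagreement about what the item represents.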
Sample heterogeneity
Variability within the sample; it increases external validity (generalizability) and inflates reliability and validity coefficients, whereas restriction of range attenuates them.
Item characteristic curve
A graphical representation of the relationship between ability level and the probability of correctly answering an item on a test, used in item response theory to evaluate the properties of test items.
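A minimal sketch of one common ICC form, the two-parameter logistic model from item response theory (the parameter values here are illustrative, not from any real test):

```python
import math

def icc(theta, a=1.0, b=0.0):
    """Two-parameter logistic ICC: probability of answering the item
    correctly at ability theta, given discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# At theta equal to the difficulty b, the probability is exactly 0.5,
# and it rises monotonically with ability.
print(round(icc(0.0, a=1.2, b=0.0), 2))  # 0.5
print(round(icc(2.0, a=1.2, b=0.0), 2))  # 0.92
```

The discrimination parameter a sets the slope of the curve at its midpoint; steeper curves separate adjacent ability levels more sharply.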
Item-total correlation
A Pearson r correlation between examinees’ scores on a single test item and their total test scores; higher values indicate the item discriminates between high- and low-scoring examinees in the same way the overall test does.
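A sketch of a corrected item-total correlation on invented dichotomous data; the correction excludes the item from the total so the item does not correlate with itself:

```python
from statistics import mean

def pearson(x, y):
    """Pearson r between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# Hypothetical scored responses: rows = examinees, columns = items (1 = correct).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
totals = [sum(row) for row in responses]
item1 = [row[0] for row in responses]
rest = [t - i for t, i in zip(totals, item1)]  # total minus the item itself
print(round(pearson(item1, rest), 2))  # 0.51
```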
Floor effect
A situation where a substantial number of test-takers achieve the lowest possible score, limiting the measurement of variability in the lower range of ability.
Ceiling effect
A situation where a test is too easy, resulting in a substantial number of test-takers achieving the highest possible score, limiting the measurement of variability in the upper range of ability.
rtt
Test-retest reliability: the proportion of observed-score variance attributable to true-score variance across repeated administrations of the same test.
Item discrimination index
A measure of how well a test item differentiates between test-takers who have a higher ability level and those who have a lower ability level. Higher number = better discrimination.
Computed as the difference between the proportions of high-scoring and low-scoring test-takers who answered the item correctly.
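That computation is a one-liner; the group sizes and correct counts below are hypothetical:

```python
def discrimination_index(upper_correct, upper_n, lower_correct, lower_n):
    """D = proportion correct in the high-scoring group minus the
    proportion correct in the low-scoring group (ranges -1.0 to +1.0)."""
    return upper_correct / upper_n - lower_correct / lower_n

# Hypothetical item: 24 of the 30 top scorers answered correctly,
# but only 9 of the 30 bottom scorers did.
print(round(discrimination_index(24, 30, 9, 30), 2))  # 0.5
```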
Uniform bias
A form of differential item functioning in which an item favors one group over another by a constant amount at every ability level.
The groups' item characteristic curves (ICCs) have identical slopes across ability levels but different intercepts.
Scalar invariance
“Strong” invariance; when a factor solution being tested across different groups contains:
the same number of factors and principal components
the same simple structure standard pattern
the same factor loadings
the same average score (intercept) for any given variable between groups
Enables tests of correlations and factor-level differences between groups.
Temporal stability
Refers to the consistency of an examinee’s scores across multiple administrations of the same test over time.
Strict invariance
When a factor solution being tested across different groups contains:
the same number of factors and principal components
the same simple structure standard pattern
the same factor loadings
the same average score (intercept) for any given variable between groups
the same amount of error (uniqueness values) across groups.
Enables tests of correlations and differences at the item- and factor-level.
Differential reliability
Differences in the internal consistency (α) of a test's scores across conditions or groups, used as a quantification of bias.
rxx
Reliability coefficient (generic): the proportion of obtained-score variance attributable to true-score variance; ranges from 0.00 to 1.00 and is ideally >0.80.
rxx must be >0.90 for clinical decisions.
Differential cost of errors
The difference in the impact or consequences of making Type I (false positive) versus Type II (false negative) errors in dichotomous testing conditions.
Criterion problem
The difficulty of finding a reliable, valid criterion: unreliability in the criterion attenuates the observed validity of a measure, making the test appear less valid than it actually is.
KR20
A measure of internal consistency (reliability) for tests composed of dichotomous items, equivalent to the mean of all possible split-half reliability coefficients of the test.
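Computationally, KR-20 is (k/(k−1)) × (1 − Σpq/σ²), where p and q are each item's proportions correct and incorrect and σ² is the variance of total scores; a sketch on invented dichotomous data:

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 for dichotomously scored items.
    responses: rows = examinees, columns = items (1 = correct, 0 = incorrect)."""
    k = len(responses[0])
    n = len(responses)
    totals = [sum(row) for row in responses]
    # Sum of p*q over items (the variance of each dichotomous item).
    pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / pvariance(totals))

responses = [  # hypothetical: 5 examinees x 4 items
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 2))  # 0.8
```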
Predictive validity
The extent to which a test accurately forecasts outcomes or behaviors based on scores; the test criterion is based in the future.
Concurrent validity
The extent to which test scores correlate with scores from other established measures at the same time; the test criterion is based in the present.
Incremental validity
The extent to which a test improves the prediction of a criterion over and above existing measures or information; the test adds unique predictive value beyond what is already available.
1 - rxx
Error: the proportion of obtained-score variance not attributable to true-score variance; ranges from 0.00 to 1.00 and is ideally <0.20.
Oblique
Referring to rotation in an exploratory factor analysis, when the axes being rotated are not kept at 90°, resulting in factors that are correlated with one another.