Chapter 5: Reliability

A comprehensive vocabulary review of psychometric reliability concepts, measurement error types, reliability estimation methods, and statistical measures of error precision based on Chapter 5 lecture notes.

Last updated 1:48 PM on 6/13/26

37 Terms

Reliability

In the psychometric sense, this refers to consistency in measurement: the degree to which a psychological test yields dependable, repeatable scores. Whether the test measures what it purports to measure is a question of validity, not reliability.

Classical Test Theory (CTT) Model

The framework assuming that an individual's score on a test is composed of a true component and an error component, represented by the formula $X = T + E$.

Observed Score ($X$)

The actual, raw score earned by a testtaker on a given instrument, such as getting $45$ out of $50$ questions correct.

True Score ($T$)

A theoretical value representing the actual, genuine amount of an attribute possessed by the testtaker, completely free of any measurement error.

Error Score ($E$)

The component of the observed score attributed to irrelevant, random, or extraneous factors that have nothing to do with the actual construct being measured.

Reliability Coefficient

The proportion of total variance in test scores that is attributed to true variance, typically ranging from $0$ to $1.00$.

Variance ($s^2$)

The standard deviation squared, serving as a crucial index of test score variability and describing how much individual scores spread out from the arithmetic mean.

True Variance ($s^2_{tr}$)

Variations in test scores resulting from real, authentic, and genuine differences among testtakers regarding the attribute or construct being measured.

Error Variance ($s^2_e$)

Variations in test scores resulting from irrelevant, chance, or random sources that contaminate the measurement process. Together with true variance, it makes up the total variance: $s^2 = s^2_{tr} + s^2_e$.
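
A minimal sketch of the variance decomposition, using made-up true and error scores chosen so that the two are uncorrelated. The data and variable names are illustrative assumptions, not values from the lecture notes. It checks that total variance equals true variance plus error variance and computes the reliability coefficient as the share of total variance that is true variance.

```python
from statistics import pvariance

# Hypothetical true scores (T) and error scores (E) for four testtakers.
# E is chosen so that it is uncorrelated with T.
true_scores = [10.0, 10.0, 20.0, 20.0]
error_scores = [-1.0, 1.0, -1.0, 1.0]

# Observed score X = T + E for each testtaker.
observed_scores = [t + e for t, e in zip(true_scores, error_scores)]

total_var = pvariance(observed_scores)  # s^2
true_var = pvariance(true_scores)       # s^2_tr
error_var = pvariance(error_scores)     # s^2_e

# Reliability coefficient: proportion of total variance that is true variance.
reliability = true_var / total_var

print(total_var, true_var, error_var)  # 26.0 25.0 1.0
print(round(reliability, 3))           # 0.962
```

In practice true and error scores are never observed directly, which is why the estimation methods on the later cards work from observed scores alone.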

Random Error

Error caused by unpredictable, transient fluctuations, such as sudden external noise or a temporary drop in attention, that affect testtakers uniquely and unsystematically.

Systematic Error

A source of error that is predictable, constant, and fixed, affecting all scores uniformly and thus not changing the variability or reliability coefficient.
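
As a small illustration of why a constant error leaves variability untouched, the sketch below adds the same hypothetical 5-point bias to every score and compares the variances. The scores and the size of the bias are invented for the example.

```python
from statistics import pvariance

# Hypothetical observed scores for five testtakers.
scores = [42.0, 45.0, 39.0, 48.0, 44.0]

# A constant systematic error: every score is inflated by the same 5 points.
shifted_scores = [s + 5.0 for s in scores]

# The mean moves by 5 points, but the spread around the mean is unchanged.
print(pvariance(scores), pvariance(shifted_scores))
```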

Item Sampling / Content Sampling

The variation in scores occurring because of the specific items chosen for inclusion in a test compared to the entire universe or domain of potential content.

Test-Retest Reliability

An estimate obtained by administering the exact same measurement instrument to the same sample of individuals at two distinct points in time.
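
A test-retest estimate is usually reported as the correlation between the two sets of scores. The sketch below assumes a Pearson correlation and uses invented scores for five testtakers. The helper `pearson_r` and the data are hypothetical and written only for this example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for the same five testtakers at two points in time.
time_1 = [12, 15, 9, 20, 17]
time_2 = [13, 14, 10, 19, 18]

test_retest_r = pearson_r(time_1, time_2)
print(round(test_retest_r, 3))
```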

Coefficient of Stability

The correlation coefficient obtained when the time interval between two test-retest administrations is greater than $6$ months.

Parallel Forms

Versions of a test for which the means and variances of observed test scores are equal across forms.

Alternate Forms

Different versions of a test designed to be equivalent in content coverage and difficulty but containing entirely distinct, non-overlapping items.

Coefficient of Equivalence

The correlation between the scores on two forms of a test, reflecting how equivalent the two item samples are.

Split-Half Reliability

An internal consistency estimate obtained by administering a test once, splitting the items into two equivalent halves, correlating scores on the two halves, and adjusting that half-test correlation with the Spearman-Brown formula.

Odd-Even Reliability

A method of creating test halves by assigning odd-numbered items to one half and even-numbered items to the other.
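
The odd-even split can be sketched as follows, assuming a small invented matrix of right/wrong responses (one row per testtaker, six items). The helper `pearson_r` and the data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical item scores (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]

# Items are numbered from 1, so odd-numbered items sit at even list indices.
odd_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5
even_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6

half_test_r = pearson_r(odd_half, even_half)
print(odd_half, even_half)
print(round(half_test_r, 3))
```

The resulting value is the correlation between two half-length tests, so it still needs the Spearman-Brown adjustment before it can be read as the reliability of the full test.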

Spearman-Brown Formula

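A formula used to estimate the internal consistency reliability of a lengthened or shortened test: $r_{SB} = \frac{nr_{xy}}{1 + (n - 1)r_{xy}}$, where $r_{xy}$ is the correlation for the original-length test (for example, the half-test correlation) and $n$ is the ratio of the new number of items to the original number.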
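
A minimal sketch of the formula as a function. The name `spearman_brown` and the input values are made up for illustration.

```python
def spearman_brown(r_xy, n):
    """Estimate reliability after changing test length by a factor of n."""
    return (n * r_xy) / (1 + (n - 1) * r_xy)

# Split-half case: the full test is twice as long as each half, so n = 2.
print(round(spearman_brown(0.70, 2), 3))    # 0.824

# Shortening a test with reliability .90 to half its length, so n = 0.5.
print(round(spearman_brown(0.90, 0.5), 3))  # 0.818
```

With $n = 1$ the function returns the input correlation unchanged, which is a quick sanity check on the formula.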

Inter-Item Consistency

The degree of correlation and consistency among all individual items on a scale, requiring only a single administration.

Homogeneity

The degree to which individual items on a test measure a single, unifactorial trait or construct, resulting in items that are tightly inter-correlated.

Heterogeneity

The degree to which a test measures more than one factor or trait, as in a multi-construct test or test battery whose subscales deliberately measure different, independent traits.

Kuder-Richardson Reliability (KR-20)

A formula, $r_{KR20} = \left(\frac{k}{k-1}\right)\left(1 - \frac{\sum pq}{\sigma^2}\right)$, used for highly homogeneous tests with strictly dichotomous scoring (right/wrong).
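
A sketch of the KR-20 calculation, assuming the usual reading of the symbols: $k$ is the number of items, $p$ is the proportion of testtakers passing an item, $q = 1 - p$, and $\sigma^2$ is the variance of total test scores (taken here as the population variance). The function name `kr20` and the response matrix are hypothetical.

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 for a matrix of 0/1 item scores (one row per testtaker)."""
    n_people = len(responses)
    k = len(responses[0])
    # Variance of total test scores across testtakers.
    total_var = pvariance([sum(row) for row in responses])
    # Sum of p * q over items, where p is the proportion answering correctly.
    sum_pq = 0.0
    for item in range(k):
        p = sum(row[item] for row in responses) / n_people
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / total_var)

# Hypothetical right/wrong responses: five testtakers, six items.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
]

print(round(kr20(responses), 3))  # 0.75
```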

Coefficient Alpha ($\alpha$)

The mean of all possible split-half correlations corrected by the Spearman-Brown formula, designed for non-dichotomous items like Likert scales.
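
The card does not give a computing formula. The sketch below assumes the commonly cited one, $\alpha = \left(\frac{k}{k-1}\right)\left(1 - \frac{\sum \sigma_i^2}{\sigma^2}\right)$, where $\sigma_i^2$ is the variance of item $i$ and $\sigma^2$ is the variance of total scores. The function name `coefficient_alpha` and the Likert-style data are invented for illustration.

```python
from statistics import pvariance

def coefficient_alpha(responses):
    """Coefficient alpha for a matrix of item scores (one row per testtaker)."""
    k = len(responses[0])
    # Variance of each item across testtakers.
    item_vars = [pvariance([row[i] for row in responses]) for i in range(k)]
    # Variance of total scores across testtakers.
    total_var = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical 5-point Likert responses: five testtakers, three items.
likert = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 3],
]

alpha = coefficient_alpha(likert)
print(round(alpha, 3))  # 0.963
```

For items scored 0/1, each item variance equals $pq$, so this formula gives the same result as KR-20.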

Redundancy Myth

The misconception that a higher Coefficient Alpha is always better, when a value above $.90$ may actually indicate redundancy: unnecessary, repetitive items asking the same narrow question.

Inter-Scorer Reliability

The degree of consistency, consensus, and agreement between two or more independent scorers, raters, or observers.

Dynamic Characteristics

Psychological traits, states, or processes that are fluid and shift rapidly in response to situational factors, such as state anxiety.

Static Characteristics

Psychological traits that are deeply embedded, highly durable, and do not fluctuate rapidly over time, such as core personality traits.

Restricted Variance

Occurs when a sample is highly uniform, creating a narrow range of scores that mathematically suppresses the correlation coefficient and deflates test reliability.
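
The effect of range restriction can be seen in a toy example. The sketch below correlates two invented sets of scores for ten testtakers, then repeats the calculation using only the testtakers in the middle of the range. The data and the helper `pearson_r` are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores on two administrations for ten testtakers.
x_full = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_full = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

# Restricted sample: keep only testtakers whose first score is between 4 and 7.
pairs = [(x, y) for x, y in zip(x_full, y_full) if 4 <= x <= 7]
x_restricted = [x for x, _ in pairs]
y_restricted = [y for _, y in pairs]

r_full = pearson_r(x_full, y_full)
r_restricted = pearson_r(x_restricted, y_restricted)
print(round(r_full, 3), round(r_restricted, 3))
```

The amount of disagreement between the two administrations is the same in both samples, but it looms larger relative to the narrower spread of scores, so the correlation drops.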

Inflated Variance

Occurs when a sample is highly diverse, creating an exceptionally wide range of scores that artificially boosts the reliability index.

Power Test

A test containing items arranged in increasing difficulty with generous time limits so testtakers can attempt every item.

Speed Test

A test containing items of uniform, typically low difficulty with a strict time limit set so that few, if any, testtakers can finish.

Criterion-Referenced Test

A test where performance is compared directly against an absolute, pre-established standard or mastery level rather than a normative peer group.

Standard Error of Measurement (SEM)

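An index of measurement precision representing the standard deviation of a theoretically normal distribution of test scores that one individual would obtain on equivalent tests: $\sigma_{meas} = \sigma\sqrt{1 - r_{xx}}$, where $\sigma$ is the standard deviation of test scores and $r_{xx}$ is the reliability coefficient.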
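
A short worked example of the SEM formula, assuming a test with a standard deviation of 15 and a reliability coefficient of .91. Both values and the function name are invented for illustration.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = sd * sqrt(1 - r_xx)."""
    return sd * sqrt(1 - reliability)

sem = standard_error_of_measurement(15, 0.91)
print(round(sem, 2))  # 4.5
```

Higher reliability shrinks the SEM, and a perfectly reliable test ($r_{xx} = 1$) would have an SEM of zero.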

Confidence Interval

A band or range of scores, calculated using the SEM, that is likely to contain the testtaker's true score at a stated level of confidence (for example, the observed score plus or minus $1.96$ SEM for 95% confidence).
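
A sketch of a confidence interval built from the SEM, assuming the common convention of centering the band on the observed score and using $z = 1.96$ for 95% confidence. The observed score, standard deviation, reliability, and function name are hypothetical.

```python
from math import sqrt

def confidence_interval(observed, sd, reliability, z=1.96):
    """Return (lower, upper) bounds: observed score plus or minus z * SEM."""
    sem = sd * sqrt(1 - reliability)
    return observed - z * sem, observed + z * sem

# Observed score of 100 on a test with sd = 15 and reliability = .91.
low, high = confidence_interval(100, 15, 0.91)
print(round(low, 2), round(high, 2))  # 91.18 108.82
```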

Standard Error of the Difference ($\sigma_{diff}$)

A statistical measure used to determine how large the difference between two scores must be before it can be considered statistically significant rather than a product of measurement error.
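
The card gives no formula. The sketch below assumes the standard textbook form $\sigma_{diff} = \sqrt{\sigma_{meas1}^2 + \sigma_{meas2}^2}$, which combines the SEMs of the two scores being compared, and a 95% criterion of $1.96\,\sigma_{diff}$. The function names and values are invented for illustration.

```python
from math import sqrt

def standard_error_of_difference(sem_1, sem_2):
    """Combine the SEMs of two scores into the standard error of their difference."""
    return sqrt(sem_1 ** 2 + sem_2 ** 2)

def is_significant(score_1, score_2, sem_1, sem_2, z=1.96):
    """True if the score difference exceeds z standard errors of the difference."""
    se_diff = standard_error_of_difference(sem_1, sem_2)
    return abs(score_1 - score_2) > z * se_diff

se_diff = standard_error_of_difference(3.0, 4.0)
print(se_diff)                             # 5.0
print(is_significant(110, 98, 3.0, 4.0))   # True: 12 points exceeds 9.8
print(is_significant(105, 98, 3.0, 4.0))   # False: 7 points is below 9.8
```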