Reliability in Psychological Assessment

Description

Flashcards covering key concepts of reliability in psychological assessment, including types of error, reliability estimates (test-retest, parallel/alternate forms, split-half, inter-item consistency, inter-scorer), and measurement models (Classical Test Theory, Domain Sampling Theory, Generalizability Theory, Item Response Theory), along with the Standard Error of Measurement.

28 Terms

1. Error

The component of an observed test score that does not reflect the test taker's true ability; in the classical model X = T + E, the observed score X is the sum of the true score T and error E.
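
A minimal Python sketch (with hypothetical means, SDs, and sample size) can make this decomposition concrete: when error is random and uncorrelated with true scores, observed-score variance is the sum of true-score and error variance, and reliability works out to var(T)/var(X).

    import numpy as np

    rng = np.random.default_rng(0)             # reproducible random numbers
    n = 10_000                                 # hypothetical number of examinees

    T = rng.normal(loc=50, scale=10, size=n)   # true scores (assumed SD = 10)
    E = rng.normal(loc=0, scale=5, size=n)     # random error (assumed SD = 5)
    X = T + E                                  # observed score: X = T + E

    # Reliability in classical test theory is the share of observed-score
    # variance that is true-score variance.
    reliability = T.var() / X.var()
    print(round(reliability, 2))               # close to 100 / (100 + 25) = 0.80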

2. Measurement error

Collectively, all of the factors associated with the process of measuring some variable, other than the variable being measured.

3. Random error

Source of error in measuring a targeted variable caused by unpredictable fluctuations and inconsistencies of other variables in the measurement process.

4. Systematic error

Source of error in measuring a variable that is typically constant or proportionate to what is presumed to be the true value of the variable being measured; once known, it can be predicted.

5. Item sampling (content sampling)

Variation among items within a test, identified as a source of error in test construction.

6. Test-Retest Reliability

Estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test.
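
In practice the estimate is simply the Pearson correlation between the two sets of scores; a minimal Python sketch with hypothetical scores for five examinees:

    import numpy as np

    time1 = np.array([12, 18, 25, 30, 22])   # first administration
    time2 = np.array([14, 17, 27, 29, 20])   # same examinees, retested later

    # Pearson correlation between the two administrations
    r_tt = np.corrcoef(time1, time2)[0, 1]
    print(round(r_tt, 2))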

7. Coefficient of stability

The correlation coefficient obtained from a test-retest reliability study; it indexes how stable scores remain from one administration to the next.

8. Parallel-Forms Reliability

A type of reliability estimated by the degree of relationship between various forms of a test, where for each form, the means and variances of observed test scores are equal.

9. Alternate-Forms Reliability

A type of reliability estimated by the degree of relationship between two different forms of a test that are similar in difficulty.

10. Coefficient of equivalence

The coefficient of reliability used to evaluate the relationship between alternate or parallel forms of a test.

11. Split-Half Reliability

Estimates reliability by correlating the scores obtained on two equivalent halves of a single test administered once.

12. Spearman-Brown formula

A formula for estimating how reliability changes as a test is lengthened or shortened; in split-half estimation it corrects the half-test correlation upward to reflect the reliability of the full-length test (a worked sketch follows the odd-even technique card below).

13. Odd-even technique

An acceptable way to split a test into equivalent halves for split-half reliability estimation by using odd-numbered items for one half and even-numbered items for the other.
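
A minimal Python sketch of the whole procedure for cards 11-13, using a hypothetical matrix of 0/1 item scores (rows = examinees, columns = items): split the items odd/even, score each half, correlate the half scores, then apply the Spearman-Brown correction r_full = 2r_half / (1 + r_half).

    import numpy as np

    # Hypothetical item scores: 6 examinees x 8 dichotomous items
    items = np.array([
        [1, 1, 1, 0, 1, 1, 0, 1],
        [0, 1, 0, 0, 1, 0, 0, 1],
        [1, 1, 1, 1, 1, 1, 1, 1],
        [0, 0, 1, 0, 0, 1, 0, 0],
        [1, 1, 1, 1, 1, 0, 1, 1],
        [0, 0, 0, 1, 0, 1, 0, 0],
    ])

    odd_half = items[:, 0::2].sum(axis=1)    # odd-numbered items (1, 3, 5, 7)
    even_half = items[:, 1::2].sum(axis=1)   # even-numbered items (2, 4, 6, 8)

    r_half = np.corrcoef(odd_half, even_half)[0, 1]   # half-test correlation
    r_full = 2 * r_half / (1 + r_half)                # Spearman-Brown correction
    print(round(r_half, 2), round(r_full, 2))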

14. Inter-item consistency

The degree of correlation among all the items on a scale, calculated from a single administration of a single test form.

15. Homogeneous test

A test whose items measure a single trait; the more homogeneous a test is, the more inter-item consistency it can be expected to have.

16. Kuder-Richardson formulas

Formulas used for determining the inter-item consistency of dichotomous items, typically those scored right or wrong.

17. KR-21

A Kuder-Richardson formula that may be used if there is reason to assume that all test items have approximately the same degree of difficulty.
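
For right/wrong items, KR-20 (the best known of these formulas) uses each item's proportion passing p and failing q = 1 - p, while KR-21 needs only the mean and variance of total scores because it assumes equally difficult items. A minimal Python sketch with a hypothetical item-score matrix:

    import numpy as np

    # Hypothetical scores: 6 examinees x 8 dichotomous (1 = right, 0 = wrong) items
    items = np.array([
        [1, 1, 1, 0, 1, 1, 0, 1],
        [0, 1, 0, 0, 1, 0, 0, 1],
        [1, 1, 1, 1, 1, 1, 1, 1],
        [0, 0, 1, 0, 0, 1, 0, 0],
        [1, 1, 1, 1, 1, 0, 1, 1],
        [0, 0, 0, 1, 0, 1, 0, 0],
    ])

    k = items.shape[1]                 # number of items
    total = items.sum(axis=1)          # each examinee's total score
    var_total = total.var()            # variance of total scores

    p = items.mean(axis=0)             # proportion passing each item
    q = 1 - p                          # proportion failing each item
    kr20 = (k / (k - 1)) * (1 - (p * q).sum() / var_total)

    m = total.mean()                   # mean total score
    kr21 = (k / (k - 1)) * (1 - m * (k - m) / (k * var_total))

    # KR-21 is usually somewhat lower than KR-20 unless the items really
    # are equally difficult.
    print(round(kr20, 2), round(kr21, 2))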

18. Coefficient alpha (Cronbach's alpha)

The mean of all possible split-half correlations, corrected by the Spearman-Brown formula; it is appropriate for nondichotomous items and is the preferred statistic for estimating internal consistency reliability.
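
Coefficient alpha extends the KR-20 idea to items that are not scored 0/1, via alpha = k/(k - 1) times (1 - sum of item variances / variance of total scores). A minimal Python sketch with hypothetical 1-5 ratings:

    import numpy as np

    # Hypothetical ratings: 6 respondents x 4 nondichotomous (1-5) items
    items = np.array([
        [4, 5, 4, 4],
        [2, 3, 2, 3],
        [5, 5, 4, 5],
        [1, 2, 2, 1],
        [3, 4, 3, 3],
        [2, 2, 3, 2],
    ])

    k = items.shape[1]
    item_vars = items.var(axis=0)          # variance of each item
    total_var = items.sum(axis=1).var()    # variance of total scores

    alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
    print(round(alpha, 2))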

19. Inter-Scorer Reliability (Scorer reliability, Judge reliability, Observer reliability, Inter-rater reliability)

The degree of agreement or consistency between two or more scorers (or judges or raters) with regard to a particular measure.

20. Coefficient of inter-scorer reliability

A coefficient used to quantify the degree of agreement or consistency between two or more scorers.
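
How the coefficient is computed depends on the kind of scores: for continuous scores a correlation between the two scorers is common, and for categorical ratings one widely used choice is Cohen's kappa, which corrects observed agreement for agreement expected by chance, kappa = (p_o - p_e) / (1 - p_e). A minimal Python sketch of kappa with hypothetical pass/fail ratings:

    import numpy as np

    # Hypothetical pass (1) / fail (0) ratings by two scorers for ten responses
    rater_a = np.array([1, 1, 0, 1, 0, 1, 1, 0, 0, 1])
    rater_b = np.array([1, 0, 0, 1, 0, 1, 1, 0, 1, 1])

    p_o = (rater_a == rater_b).mean()      # observed proportion of agreement

    # Chance agreement: both rate 1 by chance, plus both rate 0 by chance
    p_e = (rater_a.mean() * rater_b.mean()
           + (1 - rater_a.mean()) * (1 - rater_b.mean()))

    kappa = (p_o - p_e) / (1 - p_e)
    print(round(kappa, 2))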

21. Classical Test Theory (CTT; true score model)

A measurement model where an observed score is conceptualized as comprising a true score and error, and all test items are presumed to contribute equally to the score total.

22. True score

A value that, according to classical test theory, genuinely reflects an individual's ability (or trait) level as measured by a particular test.

23. Domain sampling theory

A theory that seeks to estimate the extent to which specific sources of variation under defined conditions contribute to the test score, conceiving test reliability as an objective measure of how precisely the test score assesses the domain from which it samples.

24. Generalizability theory

A theory suggesting a person's test scores vary from testing to testing because of variables in the testing situation, describing the 'universe' of testing in terms of its 'facets'.

25. Facets (Generalizability theory)

Components of the testing situation or 'universe' in generalizability theory, such as the number of items, amount of scorer training, and purpose of administration.

26. Item response theory (IRT)

A measurement theory that models the probability that a person with a particular ability or trait level will successfully respond to a test item, focusing on item difficulty and item discrimination.
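
A common concrete form is the two-parameter logistic (2PL) model, in which item difficulty (b) positions the item characteristic curve along the trait scale and item discrimination (a) controls its steepness. A minimal Python sketch with hypothetical parameter values:

    import numpy as np

    def p_correct(theta, a, b):
        """2PL item response function: probability that a person at trait
        level theta answers an item with discrimination a and difficulty b
        correctly."""
        return 1 / (1 + np.exp(-a * (theta - b)))

    # Hypothetical item: moderate discrimination, difficulty at the trait mean
    for theta in (-2, -1, 0, 1, 2):
        print(theta, round(p_correct(theta, a=1.2, b=0.0), 2))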

27. Standard Error of Measurement (SEM)

A measure of the precision of an observed test score, which has an inverse relationship with the reliability of a test and is frequently used in interpreting individual scores.
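
Under classical test theory the SEM is usually estimated as the test's standard deviation times the square root of one minus its reliability. A minimal Python sketch with hypothetical values:

    import math

    sd = 15.0           # hypothetical standard deviation of test scores
    reliability = 0.90  # hypothetical reliability estimate

    sem = sd * math.sqrt(1 - reliability)   # SEM = sd * sqrt(1 - r)
    print(round(sem, 2))                    # about 4.74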

28. Confidence interval

A range or band of test scores that is likely to contain the true score.
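
A common textbook construction centers the band on the observed score, X plus or minus z times the SEM; continuing the hypothetical SEM sketch above, a 95% interval uses z of about 1.96:

    import math

    observed = 106                        # hypothetical observed score
    sem = 15.0 * math.sqrt(1 - 0.90)      # SEM from the sketch above (about 4.74)

    lower = observed - 1.96 * sem         # 95% confidence interval bounds
    upper = observed + 1.96 * sem
    print(round(lower, 1), round(upper, 1))   # roughly 96.7 and 115.3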