Psychological Assessment: Reliability


47 Terms

1

Reliability

The consistency or dependability of test scores.

2

variance (σ²)

The degree to which scores differ from the mean; it shows score variability.
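As a quick illustration (the scores below are invented), σ² is the mean squared deviation from the mean:

```python
# Population variance: the average squared deviation from the mean.
scores = [10, 12, 14, 16, 18]            # hypothetical test scores
mean = sum(scores) / len(scores)         # 14.0
variance = sum((x - mean) ** 2 for x in scores) / len(scores)
print(variance)                          # 8.0
```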

3

main sources of test score variability

True variance and error variance.

4

true variance

Variance caused by actual differences in the trait being measured.

5

error variance

Variance caused by irrelevant or random factors not related to the construct.

6

Classical Test Theory

Each person has a true score (T) that would be obtained if there were no measurement error; the observed score equals the true score plus error (X = T + E).
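A minimal sketch of this idea (all numbers are made up): under classical test theory, reliability is the proportion of observed-score variance that is true variance.

```python
import random

random.seed(0)

def var(xs):
    """Population variance of a list of scores."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Hypothetical true scores plus random (unsystematic) error: X = T + E
true_scores = [random.gauss(50, 10) for _ in range(10_000)]
observed = [t + random.gauss(0, 5) for t in true_scores]

# Reliability = true variance / observed variance ≈ 100 / (100 + 25) = 0.80
reliability = var(true_scores) / var(observed)
print(round(reliability, 2))
```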

7

Error

The portion of the observed score unrelated to the construct being measured.

8

Types of error

Systematic error and random error.

9

systematic error

Consistent, predictable error that occurs in the same direction each time; because it is predictable, it can be identified and corrected.

10

random error

Errors of measurement that occur unpredictably and vary from one measurement to another.

Produces a distribution of scores around the true score.

11

Large dispersion (wide distribution around the true score)

Single observations might fall far from the true score → less dependable.

12

Small dispersion (narrow distribution around the true score)

Observations are extremely close to the true score → fewer errors, more dependable.

13

Sources of error variance

Test construction

Test administration

Test scoring and interpretation

Surveys and polls as assessment tools

Systematic and nonsystematic errors in sensitive assessments

14

test–retest reliability

Administer the same test twice to the same group, then correlate the scores.

Stability of test scores over time

For stable traits (e.g., intelligence, personality traits).
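Concretely, the test–retest coefficient is just the Pearson correlation between the two administrations (the scores below are invented):

```python
# Hypothetical scores from two administrations of the same test
time1 = [12, 15, 11, 18, 14, 16]
time2 = [13, 14, 12, 17, 15, 17]

n = len(time1)
m1, m2 = sum(time1) / n, sum(time2) / n
cov = sum((a - m1) * (b - m2) for a, b in zip(time1, time2)) / n
sd1 = (sum((a - m1) ** 2 for a in time1) / n) ** 0.5
sd2 = (sum((b - m2) ** 2 for b in time2) / n) ** 0.5

r = cov / (sd1 * sd2)   # test–retest reliability coefficient
print(round(r, 2))
```

A high r (close to 1.0) indicates stable scores over time.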

15

alternate-forms reliability

Consistency of scores between two equivalent versions of the same test.

16

parallel forms

Test versions with equal means, variances, and difficulty.

17

alternate forms

Similar but not identical versions of a test that measure the same construct.

18

item-sampling error

Score differences caused by different items in each form, not by true ability.

19

split-half reliability

Reliability estimated by correlating scores from two halves of a single test.

20

Spearman–Brown formula

It allows a test developer or user to estimate internal consistency reliability based on test length.

Whether adding or removing items will strengthen or weaken overall test reliability.
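The general form is r_new = n·r / (1 + (n − 1)·r), where n is the factor by which the test length changes; a short sketch:

```python
def spearman_brown(r, n):
    """Predicted reliability when test length changes by a factor of n."""
    return n * r / (1 + (n - 1) * r)

# Split-half correction: step the half-test correlation (0.70)
# up to the full-length test (n = 2)
print(round(spearman_brown(0.70, 2), 2))    # 0.82

# Halving a test (n = 0.5) lowers the estimated reliability
print(round(spearman_brown(0.70, 0.5), 2))  # 0.54
```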

21

What happens when tests are lengthened by adding items?

Reliability usually increases, but only if the new items are equivalent in content and difficulty.

22

inter-item consistency

How strongly the items within a test correlate with one another.

23

homogeneous test

All items measure a single trait or factor.

24

Heterogeneous test

Measures multiple traits or factors.

25

Kuder–Richardson Formulas

Internal consistency reliability for dichotomous items (scored 0/1, e.g., true/false, right/wrong).

26

KR-21

A simplified version of KR-20 that assumes all items have equal difficulty.

less accurate but easier to compute.

27

KR-20

For tests with items scored as correct/incorrect (like multiple-choice or true/false tests)
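KR-20 can be computed from a 0/1 response matrix; this sketch uses a tiny invented dataset (rows = examinees, columns = items):

```python
def kr20(responses):
    """KR-20 for dichotomous (0/1) item data."""
    k = len(responses[0])          # number of items
    n = len(responses)             # number of examinees
    # Item difficulty p (proportion correct) and q = 1 - p per item
    p = [sum(person[i] for person in responses) / n for i in range(k)]
    pq_sum = sum(pi * (1 - pi) for pi in p)
    # Population variance of total scores
    totals = [sum(person) for person in responses]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - pq_sum / var_t)

data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
print(round(kr20(data), 2))   # 0.67
```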

28

Cronbach’s Alpha (α)

The average correlation among all test items — a measure of internal consistency.

29

Cronbach’s Alpha

It can be used for items with multiple scoring formats (not just dichotomous), such as Likert scales.
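A sketch of the computation with invented 5-point Likert data (rows = respondents, columns = items); α is derived from item variances and total-score variance:

```python
def cronbach_alpha(responses):
    """Cronbach's alpha for items with any numeric scoring."""
    k = len(responses[0])          # number of items

    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[i] for row in responses]) for i in range(k)]
    total_var = var([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical Likert responses: 4 respondents × 3 items
likert = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
]
print(round(cronbach_alpha(likert), 2))   # 0.99
```

With dichotomous (0/1) data, this formula reduces to KR-20.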

30

Excellent

≥ 0.90

31

May be problematic (depends on purpose)

< 0.70

32

Very high α (> 0.95)

Item redundancy: items may be too similar and not add new information.

33

What kind of reliability is best for homogeneous tests?

Internal consistency methods (e.g., KR-20, Cronbach's alpha).

34

Which reliability method suits heterogeneous tests better?

Test–retest reliability (since internal consistency may not apply).

35

dynamic characteristics

Traits, states, or abilities that change over time (e.g., mood, anxiety).

36

What reliability methods are better for dynamic characteristics?

Internal consistency methods.

37

Static Characteristics

Traits or abilities that remain relatively stable (e.g., intelligence)

38

What methods are suitable for static characteristics?

Test–retest or alternate-forms reliability.

39

Restriction of range

When the sample used has limited variability (e.g., only high scorers).

40

Inflation of range

Artificially increased variability in scores, which may overestimate reliability.

41

Power test

Tests with generous time limits and items of varying difficulty; scores reflect ability, not speed.

42

Speed test

Tests with easy items but strict time limits; scores reflect speed, not ability.

43

What reliability methods are best for power tests?

Internal consistency methods (e.g., KR-20, Cronbach's alpha).

44

What reliability methods are suitable for speed tests?

Test–retest, alternate-forms, or split-half (with timed halves adjusted using Spearman–Brown).

45

Criterion-Referenced test

Tests that measure mastery of specific skills or objectives (e.g., pass/fail).

46

Norm-Referenced Tests

Tests that compare an individual’s score to others’.

47

Which reliability methods suit norm-referenced tests?

Traditional ones like test–retest, split-half, and KR-20.