Psychometrics

Description: Midterm 2


78 Terms

1. X
Observed score

2. T
True score

3. E
Error
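
These three symbols come from the classical test theory equation X = T + E. A minimal sketch in Python (the simulated scores and their variances are illustrative assumptions, not course data) showing that reliability can be read as the share of observed-score variance that comes from true scores:

    import numpy as np

    rng = np.random.default_rng(0)

    n = 10_000
    true_scores = rng.normal(loc=50, scale=10, size=n)   # T
    errors = rng.normal(loc=0, scale=5, size=n)          # E (random error)
    observed = true_scores + errors                      # X = T + E

    # Reliability = var(T) / var(X); here roughly 10**2 / (10**2 + 5**2) = 0.8
    reliability = true_scores.var() / observed.var()
    print(round(reliability, 2))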

4. Systematic errors
Consistent errors inherent in the test or testing environment (or scale); they affect scores in the same way each time

5. Random errors
Unsystematic errors that vary unpredictably across occasions, such as a participant's mood, guessing, or individual idiosyncrasies

6. High alpha
Meaningful for a unidimensional (sub)scale; alpha should be computed for each subscale rather than for the whole multidimensional scale

7. Attenuation paradox
Increasing a test's reliability can, under certain conditions, lead to a decrease in its validity

8. Cronbach's alpha: diagnostic and educational scales
About .95; it should be very high

9. Cronbach's alpha: research scales
About .80

10. Spearman-Brown Formula
Predicts how reliability changes as a test is lengthened; the benefits taper off (by around 19 items, adding more yields little further gain)
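
The Spearman-Brown prophecy formula behind this card is rho_k = k*rho / (1 + (k - 1)*rho), where k is the factor by which the test is lengthened. A small sketch (the single-item reliability of .50 is an assumed, illustrative value) showing how the gains taper off as items are added:

    def spearman_brown(rho: float, k: float) -> float:
        """Predicted reliability when a test is lengthened by a factor k."""
        return (k * rho) / (1 + (k - 1) * rho)

    rho_one_item = 0.50  # assumed reliability of a single item
    for n_items in (1, 5, 10, 19, 40):
        print(n_items, round(spearman_brown(rho_one_item, n_items), 3))
    # Reliability climbs quickly at first, then flattens out as items are added.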

11. Percent or raw agreement
The percentage of exact agreement between raters, usable for any type of scale or rating

12. Kappa
For categorical ratings; corrects the observed agreement between raters for the agreement expected by chance

13. Kappa 0.61 - 0.80
Substantial agreement

14. Kappa 0.81 - 1.00
Almost perfect agreement
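
A sketch of both agreement statistics for two raters assigning categorical codes (the ratings are made up for illustration): raw agreement is the share of exact matches, and Cohen's kappa corrects that figure for the agreement expected by chance.

    from collections import Counter

    rater_a = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
    rater_b = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]

    n = len(rater_a)

    # Percent (raw) agreement: proportion of exact matches
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal proportions, summed over categories
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum((count_a[c] / n) * (count_b[c] / n) for c in set(rater_a) | set(rater_b))

    kappa = (p_observed - p_chance) / (1 - p_chance)
    print(round(p_observed, 2), round(kappa, 2))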

15. Pearson's correlation
For interval or ratio data and two raters; assesses consistency across the raters (not exact agreement)

16. PC Very High/Strong
.7-.9

17. PC Moderate
.3-.7

18. PC Weak
.1-.3

19. ICC
For interval/ratio data and two or more raters; assesses consistency across raters (not exact agreement)

20. ICC 0.75 to 0.90
Good reliability

21. ICC Greater than 0.90
Excellent reliability
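
A small sketch (with made-up ratings) of why consistency is not the same as exact agreement: Rater B scores everyone exactly 2 points higher than Rater A, so the Pearson correlation is perfect even though the two raters never give identical ratings. ICC forms that require absolute agreement would penalize this kind of offset.

    import numpy as np

    rater_a = np.array([3, 5, 7, 4, 6, 8], dtype=float)
    rater_b = rater_a + 2   # consistently 2 points higher

    # Perfect consistency: Pearson r = 1.0
    r = np.corrcoef(rater_a, rater_b)[0, 1]

    # Zero exact agreement: no identical ratings
    exact = np.mean(rater_a == rater_b)

    print(round(r, 2), exact)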

22. Reliability
Consistency in the pattern of ratings (correlations), not necessarily identical ratings

23. Agreement
Raters give the exact same ratings

24. Validity
The degree to which evidence and theory support the interpretations of test scores for proposed uses of tests

25. Classic view
Tripartite view (the "3 Cs"): criterion, content, and construct validity

26. Unified understanding of validity
Validity as a single concept supported by multiple sources of evidence: test-criterion relations, response processes, relations to other variables, and consequences of testing

27. Nominalist fallacy
The assumption that a test measures a construct simply because it is labeled as such

28. Factor analysis
Identifies the number and nature of latent factors (dimensions) underlying a set of items or variables

29. EFA
Exploratory factor analysis; generally used when limited research or theory is available

30. CFA
Confirmatory factor analysis; generally used when research and theory are available and model specifications can be made

31. Advantages of CFA
The researcher controls the model specification: which items load onto which factors, how the factors are correlated, and whether measurement error variances covary

32. Factor loadings
Indicate the relationship between variables/items and their underlying latent factors

33. Eigenvalues
Indicate the amount of variance in the observed variables/items accounted for by each factor; factors with eigenvalues above 1 are typically retained

34. K1 Criterion
Components with an eigenvalue greater than one should be retained

35. Communalities
The proportion of each item's variance accounted for by the factors (the variance the item shares with the other items)

36. Scree plot
Plot of eigenvalues; retain the factors before the "elbow" (those with eigenvalues above 1)
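
A sketch of the eigenvalue-based retention rules on simulated data (the two-factor structure and loadings below are assumptions made for illustration): the eigenvalues of the item correlation matrix are computed, and the K1 rule keeps the components whose eigenvalues exceed 1; a scree plot would display these same values and show the "elbow".

    import numpy as np

    rng = np.random.default_rng(1)
    n = 500

    # Simulate 6 items: 3 load on factor 1, 3 load on factor 2, plus noise
    f1, f2 = rng.normal(size=(2, n))
    items = np.column_stack([
        0.8 * f1 + rng.normal(scale=0.6, size=n),
        0.7 * f1 + rng.normal(scale=0.6, size=n),
        0.6 * f1 + rng.normal(scale=0.6, size=n),
        0.8 * f2 + rng.normal(scale=0.6, size=n),
        0.7 * f2 + rng.normal(scale=0.6, size=n),
        0.6 * f2 + rng.normal(scale=0.6, size=n),
    ])

    # Eigenvalues of the item correlation matrix, largest first
    eigenvalues = np.linalg.eigvalsh(np.corrcoef(items, rowvar=False))[::-1]
    print(np.round(eigenvalues, 2))
    print("Retained by K1:", int(np.sum(eigenvalues > 1)))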

37. One parameter
Estimates only difficulty (the b-parameter); discrimination is fixed across all items

38. Two parameter
Estimates difficulty (b-parameter) and discrimination (a-parameter) for each item

39. Rasch model
Estimates only difficulty (b-parameter); discrimination is fixed at 1 across all items

40. Three parameter
Estimates the b-parameter (difficulty), a-parameter (discrimination), and c-parameter (guessing)
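
A sketch of the item response functions these cards describe, written as the logistic 3PL: P(theta) = c + (1 - c) / (1 + exp(-a(theta - b))). Setting c = 0 gives the 2PL, and additionally fixing a at a common value (1 for the Rasch model) gives the 1PL. The parameter values used below are arbitrary examples.

    import math

    def irf(theta: float, a: float = 1.0, b: float = 0.0, c: float = 0.0) -> float:
        """Probability of a correct response under the 3PL model."""
        return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

    # 2PL item (c = 0): at theta = b the probability is .50
    print(round(irf(theta=1.0, a=1.5, b=1.0), 2))        # 0.5

    # 3PL item: guessing raises the lower asymptote, so at theta = b the probability is (1 + c) / 2
    print(round(irf(theta=1.0, a=1.5, b=1.0, c=0.2), 2)) # 0.6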

41. Step difficulties
Indicate the points on the ability scale at which responses in adjacent categories are equally likely

42. Graded response model
Models the probability of responding in a given category or higher (cumulative thresholds)

43. Partial credit model
Models polytomous items using step difficulties

44. Generalized partial credit model
Like the partial credit model but assumes that discrimination (the a-parameter) varies across items
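
A sketch of how step difficulties enter the partial credit model (the delta values are invented): the probability of each category depends on cumulative sums of (theta - delta_j), and adjacent categories become equally likely exactly when theta equals the corresponding step difficulty. The generalized partial credit model would multiply each (theta - delta_j) term by an item-specific a-parameter.

    import math

    def pcm_probs(theta: float, deltas: list[float]) -> list[float]:
        """Category probabilities for one item under the partial credit model.

        deltas are the step difficulties; categories run 0..len(deltas).
        """
        # Cumulative sums of (theta - delta_j); category 0 has an empty sum of 0
        sums = [0.0]
        for d in deltas:
            sums.append(sums[-1] + (theta - d))
        exps = [math.exp(s) for s in sums]
        total = sum(exps)
        return [e / total for e in exps]

    # At theta equal to the first step difficulty, categories 0 and 1 are equally likely
    probs = pcm_probs(theta=-0.5, deltas=[-0.5, 0.7])
    print([round(p, 3) for p in probs])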

45. Multidimensional IRT model
Used when items measure multiple distinct, often correlated, latent traits or dimensions

46. Threshold (b-parameter)
Indicates the point at which the probability of scoring in category k or higher equals 50%.

47. A parameter
Discrimination (the steepness of the IRF slope).

48. B parameter
Difficulty (the point of inflection of the IRF curve).

49. C parameter
Guessing (the lower asymptote: the Y-axis value the curve levels off at on the left).

50. Orthogonal rotations
Factors assumed to be uncorrelated; examples include Varimax and Quartimax.

51. Oblique rotations
Factors assumed to be correlated; examples include Oblimin and Promax.

52. SRMR
Acceptable < .10, good fit < .08.

53. RMSEA
Acceptable < .08, good < .06.

54. CFI
Acceptable > .90, good > .95.

55. Good model fit
Means that the factor model reproduces the observed correlations well.
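
A sketch that applies the cutoffs from the three preceding cards to a hypothetical set of fit statistics (the numbers in fit are invented, not output from a real model). Note the direction: SRMR and RMSEA are "smaller is better", while CFI is "larger is better".

    def assess_fit(srmr: float, rmsea: float, cfi: float) -> dict[str, str]:
        """Label each index as good / acceptable / poor using the course cutoffs."""
        def label(value, good, acceptable, higher_is_better=False):
            if higher_is_better:
                return "good" if value > good else "acceptable" if value > acceptable else "poor"
            return "good" if value < good else "acceptable" if value < acceptable else "poor"

        return {
            "SRMR": label(srmr, good=0.08, acceptable=0.10),
            "RMSEA": label(rmsea, good=0.06, acceptable=0.08),
            "CFI": label(cfi, good=0.95, acceptable=0.90, higher_is_better=True),
        }

    fit = {"srmr": 0.05, "rmsea": 0.07, "cfi": 0.96}  # invented values
    print(assess_fit(**fit))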

56. Unidentified model
Does not have enough information to estimate its parameters uniquely; it lacks the necessary degrees of freedom (Df < 0).

57. Just-identified model
Has exactly enough information to estimate all its parameters (Df = 0).

58. Overidentified model
Has more information than needed to estimate its parameters, allowing the model's fit to the data to be tested (Df > 0).

59. Model identification
Compares the information available, calculated as v(v+1)/2 (the number of unique variances and covariances), with the number of parameters to be estimated.

60. V
Number of observed items/variables.

61. Df
Available minus needed: the information available, v(v+1)/2, minus the number of parameters that must be estimated.
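
A sketch of the degrees-of-freedom bookkeeping from the cards above: the available information is v(v+1)/2 unique variances and covariances, and Df is that figure minus the number of parameters the model must estimate. The parameter count below assumes a simple one-factor CFA with marker-variable scaling, which is an illustrative assumption rather than the course's example.

    def cfa_df(v: int, n_free_parameters: int) -> int:
        """Degrees of freedom: available information minus needed (estimated) parameters."""
        available = v * (v + 1) // 2   # unique variances + covariances
        return available - n_free_parameters

    # One-factor model with 4 items, marker-variable scaling:
    # 3 free loadings + 4 error variances + 1 factor variance = 8 parameters
    print(cfa_df(v=4, n_free_parameters=8))   # 10 - 8 = 2 -> overidentified (Df > 0)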

62. Test retest reliability
Consistency of measurement across time (coefficient of stability; only meaningful for stable constructs).

63. Internal consistency
Consistency of measurement across items (alpha, omega).

64. Coefficient of equivalence
Consistency of measurement across scales/forms.

65. Interrater agreement
Consistency of measurement across raters.

66. Cronbach's Alpha
A measure of internal consistency; should only be used when items are parallel or (essentially) tau-equivalent.
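
A sketch of the usual computation, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of the total score), applied to made-up item responses:

    import numpy as np

    def cronbach_alpha(items: np.ndarray) -> float:
        """items: rows = respondents, columns = scale items."""
        k = items.shape[1]
        item_variances = items.var(axis=0, ddof=1)
        total_variance = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

    # Made-up responses from 5 people to 4 Likert-type items
    responses = np.array([
        [4, 5, 4, 4],
        [2, 2, 3, 2],
        [5, 5, 5, 4],
        [3, 3, 2, 3],
        [1, 2, 1, 2],
    ])
    print(round(cronbach_alpha(responses), 2))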

67. Parallel items
Identical in every psychometric property.

68. Tau-equivalent items
Have equal true-score means and equal factor loadings, but their error variances can differ.

69. Essentially tau-equivalent items
Same as tau-equivalent but allows different intercepts.

70. Congeneric items
Measure the same construct but with different loadings and intercepts.

71. Cronbach's alpha of .70
For scales in the initial stages of development.

72. Cronbach's alpha of .80
For basic research scales.

73. Cronbach's alpha of .95
For individual diagnostic scales.

74. Noncognitive scales
Reliabilities are typically in the .80s.

75. Cognitive scales
Reliabilities are typically in the .90s.

76. 1PL model
Assumes all items have the same discrimination power.

77. 2PL model
Assumes items vary in their ability to discriminate between individuals of different ability levels.

78. 3PL model
Adds a guessing (c) parameter; appropriate for multiple-choice items where guessing might occur.