Chapter 1: Test Reliability and Validity Concepts


These flashcards cover key concepts related to reliability, validity, and measurement in psychometrics, as well as the properties and methods associated with effective testing.


90 Terms

1. Reliability
Dependability or consistency of a test's scores.

2. Reliability Coefficient
An index expressing the ratio of true-score variance to total observed-score variance.
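The ratio above can be sketched in a few lines; the variance figures are purely illustrative, not from any real test.

```python
# Classical-test-theory identity: observed-score variance is the sum of
# true-score variance and error variance, so the reliability coefficient
# is the proportion of total variance that is "true".
true_variance = 40.0   # hypothetical true-score variance
error_variance = 10.0  # hypothetical error variance
total_variance = true_variance + error_variance

reliability = true_variance / total_variance
print(reliability)  # 0.8
```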

3. Classical Test Theory
Theory stating that an observed test score reflects both true ability and measurement error.

4. Measurement Error
Factors in measuring a variable that are unrelated to the variable of interest.

5. Type I Error
Rejecting a true null hypothesis; also known as a false positive.

6. Type II Error
Failing to reject a false null hypothesis; also known as a false negative.

7. Variance
A measure of the dispersion of scores on a test.

8. Reliability Coefficient Interpretation
A higher coefficient indicates greater reliability.

9. Random Error
Error caused by unpredictable fluctuations during measurement.

10. Systematic Error
Error that is constant or proportional to the true value being measured.

11. Item Sampling
Variation among the items within a test that can affect a test-taker's score.

12. Test Administration Variables
Factors such as motivation or the testing environment that can influence test scores.

13. Test-Retest Reliability
Reliability estimated by correlating scores from two administrations of the same test.

14. Parallel Forms Reliability
Reliability assessed by correlating different forms of a test that measure the same ability.

15. Factor Analysis
Analysis used to identify underlying factors or traits from test scores.

16. Split-Half Reliability
Reliability obtained by correlating scores from two halves of a single test.

17. Spearman-Brown Formula
Formula used to estimate internal consistency reliability from two test halves, or to project the reliability of a lengthened or shortened test.
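A minimal sketch of the Spearman-Brown projection (the function name and sample correlation are illustrative):

```python
def spearman_brown(r_half: float, n: float = 2.0) -> float:
    """Project reliability when a test is lengthened by a factor of n.

    With n = 2 this is the classic split-half correction:
    r_sb = 2 * r_half / (1 + r_half).
    """
    return n * r_half / (1 + (n - 1) * r_half)

# A half-test correlation of 0.70 implies full-length reliability of ~0.82.
print(round(spearman_brown(0.70), 2))  # 0.82
```

Note that the split-half correlation alone underestimates full-length reliability, which is why the correction is applied.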

18. Inter-item Consistency
Degree of correlation among all items on a scale.

19. Coefficient Alpha
A measure of internal consistency reliability for tests with non-dichotomous items.
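Coefficient alpha can be computed directly from the item-level formula; the data set below is made up for illustration.

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Coefficient (Cronbach's) alpha from per-person item scores.

    item_scores: one row per test-taker, each a list of item scores.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(item_scores[0])
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 4 test-takers answering 3 Likert-type items.
scores = [[3, 4, 3], [2, 2, 3], [5, 4, 4], [4, 5, 5]]
print(round(cronbach_alpha(scores), 3))  # 0.875
```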

20. Inter-scorer Reliability
Degree of agreement between scores assigned by different raters.

21. Kappa Statistic
Statistic used to measure agreement between two or more raters.

22. Static Traits
Traits observed to be relatively unchanging over time.

23. Dynamic Traits
Traits prone to change due to situational factors.

24. Criterion-Referenced Tests
Tests designed to indicate an individual's standing relative to a criterion.

25. Standard Error of Measurement (SEM)
Estimate of the amount of error in an observed test score.
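The SEM follows directly from the test's standard deviation and reliability; the IQ-style numbers below are hypothetical.

```python
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical scale: SD = 15, reliability = 0.91 -> SEM of about 4.5.
print(round(standard_error_of_measurement(15, 0.91), 2))  # 4.5
```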

26. Confidence Interval
Range of test scores likely to contain the true score.
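A common construction uses the SEM: observed score plus or minus z times SEM, where z = 1.96 cuts off the middle 95% of a normal distribution. All numbers here are illustrative.

```python
# 95% confidence band around a hypothetical observed score of 100
# with a hypothetical SEM of 4.5.
observed_score = 100.0
sem = 4.5
z = 1.96  # two-tailed 95% critical value of the standard normal

lower = observed_score - z * sem
upper = observed_score + z * sem
print((round(lower, 2), round(upper, 2)))  # (91.18, 108.82)
```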

27. Content Validity
How adequately a test samples the behavior it is designed to measure.

28. Criterion-Related Validity
Judgment of a test's ability to predict expected outcomes on a criterion.

29. Convergent Evidence
High correlation between test scores and measures of similar constructs.

30. Discriminant Evidence
Low correlation between test scores and measures of unrelated constructs.

31. Face Validity
Extent to which a test appears to measure what it claims to measure.

32. Incremental Validity
The additional predictive power a new test contributes beyond existing tests.

33. Item Difficulty Index
The proportion of test-takers who answered an item correctly.
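As a proportion, the index is a one-line computation; the response vector below (1 = correct, 0 = incorrect) is made up.

```python
def item_difficulty(responses):
    """Proportion of test-takers answering the item correctly (1 = correct)."""
    return sum(responses) / len(responses)

# 7 of 10 hypothetical test-takers answered correctly -> p = 0.7.
print(item_difficulty([1, 1, 0, 1, 1, 0, 1, 1, 0, 1]))  # 0.7
```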

34. Item Discrimination Index
Measures how well an item differentiates between high and low scorers.
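One common form compares the item's difficulty in the upper and lower scoring groups; the groups below are hypothetical.

```python
def discrimination_index(upper_group, lower_group):
    """D = proportion correct among high scorers minus proportion among low scorers."""
    p_upper = sum(upper_group) / len(upper_group)
    p_lower = sum(lower_group) / len(lower_group)
    return p_upper - p_lower

# Hypothetical item: 9/10 of top scorers vs 3/10 of bottom scorers correct.
print(round(discrimination_index([1] * 9 + [0], [1] * 3 + [0] * 7), 2))  # 0.6
```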

35. Item Response Theory (IRT)
Models the relationship between test-taker ability and the probability of a correct answer.

36. Item Characteristic Curve
Graph plotting the probability of a correct response against levels of the latent trait.

37. Rasch Model
A one-parameter IRT model in which items differ only in difficulty.
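The Rasch item characteristic curve is a logistic function of the gap between ability and item difficulty; this is a sketch with the usual logit form.

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """One-parameter (Rasch) model: P(correct) = 1 / (1 + exp(-(theta - b)))."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# When ability equals item difficulty, the probability of success is 0.5;
# higher ability pushes the probability above 0.5.
print(rasch_probability(0.0, 0.0))  # 0.5
```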

38. Utility
Practical value of a test in improving decision-making.

39. Expectancy Data
Tables indicating the likelihood of a criterion outcome for test-takers in a given score range, based on prior data.

40. Selection Ratio
The number of positions to be filled relative to the number of available applicants.

41. Cut Score
Judgment-based reference point that divides scores into classifications.

42. Multiple Hurdle Method
Multi-stage selection process with a cut score applied at each predictor.

43. Angoff Method
Method for setting cut scores using a panel of expert judges.

44. Known Groups Method
Collecting data from groups known to possess (or lack) a trait in order to set cut scores.

45. Discriminant Analysis
Technique analyzing the relationship between predictor variables and group membership.

46. Validity Coefficient
Correlation measuring the relationship between test scores and a criterion.

47. Test Blueprint
Plan detailing item coverage and organization in a test.

48. Homogeneity
Extent to which a test contains items measuring a single trait.

49. Heterogeneity
Degree to which a test measures multiple traits.

50. Leniency Error
A rater's tendency to score too generously.

51. Central Tendency Error
Tendency of raters to avoid extreme ratings.

52. Halo Effect
Bias in which an overall impression affects ratings of specific attributes.

53. Test Reliability
Consistency of test scores across different occasions.

54. Observed Score
The actual score obtained on a test.

55. True Score
The score reflecting an individual's actual ability, free of measurement error.

56. Errors of Measurement
Factors causing discrepancies between observed and true scores.

57. Criterion Contamination
When a criterion measure includes factors irrelevant to the criterion itself.

58. Base Rate
Percentage of the population exhibiting a specific attribute.

59. Hit Rate
Proportion of cases in which a test correctly identifies a trait.

60. Miss Rate
Rate at which a test fails to identify a specific trait.

61. False Positive
Identifying a trait as present when it is actually absent.

62. False Negative
Failing to identify a trait that is actually present.
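The four screening outcomes in cards 59-62 can be tallied from (test decision, actual status) pairs; the cases and the hit-rate definition used here (proportion of all decisions that were correct, one common convention) are illustrative assumptions.

```python
# Hypothetical (test_says_present, trait_actually_present) pairs.
cases = [
    (True, True),    # true positive: trait correctly identified
    (True, False),   # false positive: trait flagged but absent
    (False, True),   # false negative: trait present but missed
    (False, False),  # true negative: absence correctly identified
    (True, True),
    (False, False),
]

true_pos = sum(1 for test, actual in cases if test and actual)
false_pos = sum(1 for test, actual in cases if test and not actual)
false_neg = sum(1 for test, actual in cases if not test and actual)
true_neg = sum(1 for test, actual in cases if not test and not actual)

# Assumed convention: hit rate = proportion of all decisions that were correct.
hit_rate = (true_pos + true_neg) / len(cases)
print(true_pos, false_pos, false_neg, true_neg)  # 2 1 1 2
```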

63. Dynamic Assessment
Assessment that accounts for the variability of traits over time.

64. Test Administration Procedures
Standardized conditions intended to minimize variability in test scores.

65. Criteria for Good Tests
Clear purpose, appropriate content, and standardized administration and scoring.

66. Test Specification
Detailed description of the elements a test will assess.

67. Operational Definition
Clear description of how the constructs being measured are defined.

68. Validity and Bias
Assessment of fairness and accuracy in measuring constructs.

69. Psychometric Properties
Key characteristics that determine a test's effectiveness.

70. Cost-Benefit Analysis in Testing
Financial and practical evaluation of testing methods.

71. Test Length Impact
Longer tests tend to have higher reliability.

72. Internal Consistency Reliability
Measure of item intercorrelation within a single test.

73. Test Performance Stability
Reliability of a test over time for stable traits.

74. Item Content Sampling
Ensuring items are relevant to the construct being measured.

75. Aptitude Test
Test designed to predict an individual's potential for success.

76. Personality Test
Assessment of character traits and behavior patterns.

77. Achievement Test
Evaluation of knowledge or skills in a particular area.

78. Cohen's Kappa
Statistic for measuring agreement between two raters, correcting for chance agreement.
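The chance correction works from each rater's marginal proportions: kappa = (p_o - p_e) / (1 - p_e). The ratings below are invented for illustration.

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e)."""
    n = len(ratings_a)
    categories = set(ratings_a) | set(ratings_b)
    # Observed agreement: proportion of cases where the raters match.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected chance agreement from each rater's marginal proportions.
    p_e = sum((ratings_a.count(c) / n) * (ratings_b.count(c) / n)
              for c in categories)
    return (p_o - p_e) / (1 - p_e)

# Two hypothetical raters classifying 10 responses as "pass"/"fail".
a = ["pass", "pass", "fail", "pass", "fail",
     "pass", "pass", "fail", "pass", "fail"]
b = ["pass", "fail", "fail", "pass", "fail",
     "pass", "pass", "fail", "fail", "fail"]
print(round(cohens_kappa(a, b), 3))  # 0.615
```

Raw agreement here is 0.80, but chance alone would produce 0.48, so kappa lands well below the raw figure.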

79. Fleiss' Kappa
Extension of the kappa statistic to agreement among more than two raters.

80. Graphical Representation of Test Scores
Visual tools for interpreting performance data in assessments.

81. Behavioral Observation
Process of tracking actions or responses in a test setting.

82. Test Revision Procedures
Methods for improving test items based on item analysis.

83. Extreme Group Method
Comparing groups of high and low scorers to evaluate test items.

84. Reliability in Psychological Measurement
Accuracy and consistency of psychological evaluations.

85. Latent Trait Theory
Theoretical framework explaining test performance in terms of unobservable characteristics.

86. Assessment of Test Quality
Evaluation criteria used to ensure effective testing.

87. Implementation of IRT
Integrating item response theory into testing procedures.

88. Decision Study
Analysis focused on how test scores aid decision-making.

89. Test User Guidelines
Standards for administering tests and interpreting results.

90. Multivariate Analysis Techniques
Statistical methods for analyzing multiple variables simultaneously.