Test Administration, Scoring, Interpretation and Usage

Last updated 12:19 PM on 6/21/26

112 Terms

1

Before the test, it's important that the examiner…

Selects appropriate tools and ensures the tools are not made known to the examinee

2

This is the form on which the examinee's responses are recorded.

Protocol

3

During the test administration, the examiner must first establish __, which is __.

Rapport; working relationship between the examiner + examinee

4

After the test administration, the examiner must then…

Safeguard the protocol, score and interpret the test accurately, and then write the report

5

This refers to the efforts of the examiner to arouse the test taker's interest in the test, elicit cooperation, and encourage them to respond appropriately.

Rapport

6

Rapport with the test-takers can influence the result.

True

7

What is the difference between accommodation and alternate assessment?

Accommodation: involves making adjustments to the same test for examinees with exceptional needs; alternate assessment: involves a different method of measurement when the standard test, even with adjustments, isn't appropriate.

8

This effect refers to the steady rise in average IQ scores across generations during the 20th century.

Flynn Effect

9

What is the Frog Pond Effect?

The tendency to feel less capable when surrounded by higher-achieving peers, even if one's absolute ability is strong.

10

What is bias in testing?

The presence of systematic errors in measuring certain factors.

11

What is the difference between culture-free, culture-fair, and culture-loading?

Culture-free: no culture involved; culture-fair: reduced bias, but some culture remains; culture-loading: the degree of cultural dependence in a test.

12

What is another term for CTT?

True score/classical model of measurement

13

A value that, according to CTT, genuinely reflects an individual's ability/trait.

True score

14

A component that does not have anything to do with the test taker's ability.

Error
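
The true score and error cards above fit together in the basic CTT equation. As a short sketch in standard CTT notation (the symbols X, T, and E are not used in the cards themselves, and the variance decomposition assumes error is uncorrelated with the true score):

```latex
X = T + E
\qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2
\qquad
r_{xx} = \frac{\sigma_T^2}{\sigma_X^2}
```

Here X is the observed score, T the true score, E the error, and r_xx the reliability coefficient, read as the share of observed-score variance that is true-score variance.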

15

This refers to the type of error that is unpredictable.

Random error

16

This refers to the type of error that is constant.

Systematic error

17

Under which source of error is item/content sampling?

Test construction

18

Room temperature, level of lighting, and the amount of ventilation and noise are variables under which test administration error?

Test environment

19

Emotional problems, physical discomfort, lack of sleep, effects of drugs, formal learning, casual life experiences, etc. are variables under which test administration error?

Test taker variable

20

Examiner's physical appearance and demeanor, nonverbal gestures, and professionalism are variables under which test administration error?

Examiner-related variables

21

How does time sampling affect the reliability coefficient?

Longer time intervals between test administrations reduce the reliability coefficient, since more external factors influence scores.

22

How do carryover effects differ from practice effects?

Carryover effects: how performing on one test influences performance on the next test administration; practice effects: specifically, how familiarity with the test boosts performance on the next test administration.

23

This refers to the technique that helps avoid carryover effects for parallel forms.

Counterbalancing

24

Which reliability estimate is the most rigorous and burdensome to establish?

Parallel/alternate forms reliability

25

True or False: Lower SEM = higher reliability.

True

26

This refers to an index of the amount of inconsistency, or the amount of expected error, in an individual's score.

Standard Error of Measurement
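
The link between SEM and reliability can be made concrete with the usual CTT formula, SEM = SD × sqrt(1 − r). The sketch below is illustrative only: the function name and the IQ-style values (SD = 15) are assumptions, not part of the card set.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - r_xx), the standard CTT formula."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical scale with SD = 15: higher reliability gives a smaller SEM.
print(standard_error_of_measurement(15, 0.91))  # about 4.5
print(standard_error_of_measurement(15, 0.75))  # 7.5
```

This also illustrates the earlier true/false card: for a fixed SD, a lower SEM goes with higher reliability.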

27

If you test a whole class, which tells you how much the class’s average score might wiggle around the true average?

Standard Error of the Mean

28

This refers to the range or band of test scores that most likely contains the true score.

Confidence interval
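
A confidence band around an observed score is commonly built as observed score ± z × SEM. The following is a minimal sketch under that assumption; the function name and example values are hypothetical.

```python
import math
from statistics import NormalDist


def true_score_interval(observed, sd, reliability, confidence=0.95):
    """Band around an observed score: observed +/- z * SEM."""
    sem = sd * math.sqrt(1.0 - reliability)
    # Two-sided critical z value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return observed - z * sem, observed + z * sem


# Hypothetical examinee: observed score 100, SD = 15, reliability = .91
low, high = true_score_interval(100, 15, 0.91)
print(round(low, 2), round(high, 2))
```

With these values the 95% band runs from roughly 91 to 109; a less reliable test would widen it.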

29

This helps determine how large a difference between two scores must be before it can be considered statistically significant.

Standard Error of Difference
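
As an illustration, the standard error of the difference between two scores can be computed as sqrt(SEM1² + SEM2²), which equals SD × sqrt(2 − r1 − r2) when both tests share the same scale and SD. The sketch assumes that shared scale; the names and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def standard_error_of_difference(sd, r1, r2):
    """SED = SD * sqrt(2 - r1 - r2); assumes both scores share one SD."""
    return sd * math.sqrt(2.0 - r1 - r2)


def difference_is_significant(score_a, score_b, sd, r1, r2, confidence=0.95):
    """True when the score gap exceeds z * SED at the chosen confidence."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return abs(score_a - score_b) > z * standard_error_of_difference(sd, r1, r2)


# Hypothetical tests with SD = 15 and reliabilities .90 and .85
print(standard_error_of_difference(15, 0.90, 0.85))        # about 7.5
print(difference_is_significant(110, 100, 15, 0.90, 0.85))  # gap of 10
print(difference_is_significant(120, 100, 15, 0.90, 0.85))  # gap of 20
```

With an SED of about 7.5, a gap must exceed roughly 14.7 points to count as significant at the 95% level, so only the 20-point gap qualifies.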

30

This refers to the standard error of the difference between predicted and observed values.

Standard Error of Estimate
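
For a simple linear prediction, this standard error is usually computed as SD_y × sqrt(1 − r²), where r is the validity coefficient. A minimal sketch, with a hypothetical function name and values:

```python
import math


def standard_error_of_estimate(sd_criterion, validity):
    """SD of prediction errors: SD_y * sqrt(1 - r_xy ** 2)."""
    if not -1.0 <= validity <= 1.0:
        raise ValueError("validity must be between -1 and 1")
    return sd_criterion * math.sqrt(1.0 - validity ** 2)


# Hypothetical criterion with SD = 10 and validity coefficient r = .60
print(standard_error_of_estimate(10, 0.60))  # about 8
```

Even a validity of .60 only shrinks the prediction error from 10 to about 8, which is why predicted scores are best reported as ranges.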

31

This refers to the proportion of people that a test accurately identifies as having the trait.

Hit rate

32

This refers to the proportion of people that a test fails to identify as having the trait.

Miss rate

33

This refers to the proportion incorrectly identified as having the trait when they don't.

False alarm rate

34

This refers to the proportion correctly identified as not having the trait.

Correct rejection rate

35

The true positive rate is also known as __.

Sensitivity

36

The true negative rate is also known as __.

Specificity

37

This refers to the ability of the test to correctly detect those with the trait (high hit rate, low miss rate).

Sensitivity

38

This refers to the ability of the test to correctly exclude those without the trait (high correct rejection, low false alarm).

Specificity

39

This refers to the likelihood that someone identified as having the trait truly has it.

Positive Predictive Value (PPV)

40

This refers to the likelihood that someone identified as not having the trait truly doesn’t.

Negative Predictive Value (NPV)
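
The hit rate, miss rate, false alarm rate, correct rejection rate, PPV, and NPV all come from the same four counts. The sketch below computes them from a hypothetical screening result; the function name, dictionary keys, and counts are illustrative assumptions.

```python
def classification_rates(tp, fn, fp, tn):
    """Rates from a 2x2 table; assumes every row and column total is nonzero."""
    has_trait = tp + fn   # everyone who truly has the trait
    no_trait = fp + tn    # everyone who truly does not
    return {
        "hit_rate": tp / has_trait,           # sensitivity
        "miss_rate": fn / has_trait,
        "false_alarm_rate": fp / no_trait,
        "correct_rejection_rate": tn / no_trait,  # specificity
        "ppv": tp / (tp + fp),                # flagged AND truly has the trait
        "npv": tn / (tn + fn),                # cleared AND truly lacks it
    }


# Hypothetical screening: 50 people with the trait, 150 without
rates = classification_rates(tp=40, fn=10, fp=20, tn=130)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Note that the hit and miss rates sum to 1, as do the false alarm and correct rejection rates, while PPV and NPV depend on how common the trait is in the group tested.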

41

False positive is also known as __.

Type I error

42

False negative is also known as __.

Type II error

43

What does type I error signify?

You conclude someone has the trait when they actually don’t

44

What does type II error signify?

You conclude someone does not have the trait when they actually do

45

Type I or Type II: Rejecting the null hypothesis when it's true.

Type I error

46

Type I or Type II: Failing to reject the null hypothesis when it's false.

Type II error

47

Type I or Type II: Rejecting the alternative hypothesis when it's true.

Type II error

48

This refers to the risk you take of rejecting the null hypothesis when it’s actually true.

Alpha

49

This refers to the risk of failing to reject the null hypothesis when the alternative hypothesis is actually true.

Beta

50

What can be a way to reduce committing type I error?

Lower the significance level (e.g., from 0.05 to 0.01).

51

What can be a way to reduce committing type II error?

Increase the sample size to give more statistical power

52

What are other ways to reduce committing type I error?

Use corrections like Bonferroni adjustments; improve measurement precision to reduce random noise that triggers false positives.
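
The Bonferroni adjustment divides the overall alpha by the number of tests run. The sketch below also shows why it is needed, assuming the tests are independent; the function names and the choice of five tests are illustrative.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def familywise_error_rate(alpha, n_tests):
    """Chance of at least one Type I error across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Five comparisons, each run at the usual .05 level (hypothetical study)
print(round(familywise_error_rate(0.05, 5), 3))   # uncorrected risk
print(bonferroni_alpha(0.05, 5))                  # corrected per-test alpha
print(round(familywise_error_rate(bonferroni_alpha(0.05, 5), 5), 3))
```

Uncorrected, five tests carry roughly a 23% chance of at least one false positive; testing each at .01 brings the overall risk back to just under 5%.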

53

What are other ways to reduce committing type II error?

Improve effect size detection; choose appropriate statistical tests to boost sensitivity; control extraneous variables.

54

What is the best way to reduce both type I and II errors?

Increase the sample size (a larger sample raises power, which makes a stricter alpha affordable)

55

What happens when we try to reduce type I errors by lowering the alpha?

It increases the risk of type II error.

56

How does lowering alpha increase the type II error?

Lowering alpha makes your test more cautious about claiming an effect exists, but that caution can lead to missing real effects, which increases Type II errors.
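
The alpha and beta trade-off can be checked numerically. The sketch below assumes the simplest case, a one-sided, one-sample z-test with known SD, where beta = Φ(z_crit − d × sqrt(n)) for a standardized effect size d. The function name and the values d = 0.5 and n = 25 are hypothetical.

```python
import math
from statistics import NormalDist


def type_ii_error(alpha, effect_size, n):
    """Beta for a one-sided, one-sample z-test with known SD."""
    std_normal = NormalDist()
    critical_z = std_normal.inv_cdf(1.0 - alpha)
    # Under the alternative, the test statistic is centered at d * sqrt(n);
    # beta is the chance it still falls below the critical value.
    return std_normal.cdf(critical_z - effect_size * math.sqrt(n))


print(round(type_ii_error(0.05, 0.5, 25), 3))  # baseline
print(round(type_ii_error(0.01, 0.5, 25), 3))  # stricter alpha, higher beta
print(round(type_ii_error(0.01, 0.5, 50), 3))  # larger sample, lower beta
```

Tightening alpha from .05 to .01 at the same sample size raises beta, while doubling the sample at the stricter alpha brings it back down. This is the sense in which a larger sample helps with both error types.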

57

A measurement bias in which people change their behavior simply because they know they are being observed or measured.

Reactivity

58

A measurement bias in which raters start off following standardized procedures in scoring but then deviate and move toward their idiosyncratic/personal definition of behavior.

Drift

59

This refers to the cognitive bias in which a rater’s evaluation of one person is distorted by comparison with another person’s performance, rather than judged independently.

Contrast effect

60

A form of self-fulfilling prophecy in which positive expectations from others lead to improved performance.

Rosenthal effect

61

A form of self-fulfilling prophecy in which negative expectations from others lead to decreased performance.

Golem effect

62

A form of self-fulfilling prophecy in which positive self-expectations lead to improved performance.

Galatea effect

63

What is another term for the Rosenthal effect?

Pygmalion effect

64

A rating bias in which a rater consistently gives higher ratings than warranted, being overly generous.

Leniency/generosity error

65

A rating bias in which a rater consistently gives lower ratings than warranted, being overly harsh.

Severity/strictness error

66

A rating bias in which a rater consistently avoids using extreme scores, clustering evaluations around the middle of the scale.

Central tendency error

67

A rating bias in which a rater’s overall positive impression of a person (often based on one good trait) spills over and inflates ratings on unrelated dimensions.

Halo effect

68

A rating bias in which a rater’s overall negative impression of a person (often based on one bad trait) spills over and deflates ratings on unrelated dimensions.

Horn effect

69

A bias where people over‑attribute others’ behavior to internal traits (personality, character) while underestimating situational factors/context.

Fundamental attribution error

70

A bias where people believe vague, general statements about personality are highly accurate and uniquely descriptive of them, even though the statements could apply to almost anyone.

Barnum effect

71

What is another term for the Barnum effect?

Aunt Fanny effect

72

This refers to a factor in a test that systematically prevents accurate, impartial measurement.

Bias

73

__ is a Classical Test Theory (CTT) procedure that adjusts an observed score into an estimate of the examinee’s true score, using the test’s reliability.

Estimated true score transformation
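
One common form of this adjustment is T' = M + r × (X − M), which pulls the observed score toward the group mean in proportion to the test's unreliability. The sketch below assumes that form; the function name and example values are hypothetical.

```python
def estimated_true_score(observed, group_mean, reliability):
    """T' = M + r_xx * (X - M): regress the observed score toward the mean."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return group_mean + reliability * (observed - group_mean)


# Hypothetical examinee scoring 120 on a test with mean 100
print(estimated_true_score(120, 100, 0.90))  # about 118
print(estimated_true_score(120, 100, 0.50))  # 110.0
```

The less reliable the test, the closer the estimate moves to the mean; with perfect reliability the observed score is left unchanged.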

74

A response bias where people alter their answers or behaviors to appear more socially acceptable, favorable, or “good” rather than giving truthful responses.

Social desirability

75

What occurs in level I of psychological interpretation?

Involves reporting only what is observed or looking at results at face value

76

What occurs in level II of psychological interpretation?

Involves looking deeper into why results are as such, interpreting underlying causes or dynamics

77

What occurs in level III of psychological interpretation?

Involves applying interpretation to guide intervention or prognosis, using insights to plan action

78

This refers to behaviors or responses shown by the examinee during the test session that go beyond the actual test content.

Extra-test behavior

79

Which type of interpretation involves reporting test results at face value?

Concrete interpretation

80

Which type of interpretation involves applying fixed rules or formulas?

Mechanical interpretation

81

Which type of interpretation involves tailoring the meaning of the test results to the unique context of the person?

Individualized interpretation

82

What is the Intuition approach in assessment interpretation?

Involves relying on the examiner’s clinical judgment, experience, and “gut feel” to interpret results.

83

What is the Authoritative approach in assessment interpretation?

Involves following established manuals, expert opinions, or standardized rules without much personal judgment.

84

What is the Empirical/Conceptual approach in assessment interpretation?

Involves basing interpretation on research evidence, theoretical frameworks, and statistical data.

85

What is the ultimate goal of a test?

To actually serve a purpose in practice

86

What does psychometric soundness entail?

Reliability and validity

87

What are the factors that affect utility?

Psychometric soundness, cost, benefits

88

This refers to tables that show the probability of success at different levels of test scores/different score ranges.

Expectancy tables
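
An expectancy table can be built by grouping past examinees into score bands and computing the success rate in each band. The sketch below uses made-up records purely for illustration; the function name, bands, and data are all hypothetical.

```python
def expectancy_table(records, bands):
    """Map each (low, high) score band to the proportion who succeeded."""
    table = {}
    for low, high in bands:
        outcomes = [succeeded for score, succeeded in records
                    if low <= score <= high]
        # None marks a band with no examinees, so no rate can be computed.
        table[(low, high)] = sum(outcomes) / len(outcomes) if outcomes else None
    return table


# Hypothetical (test score, succeeded on the job?) records
records = [(45, False), (52, False), (58, True), (63, False), (67, True),
           (71, True), (78, True), (84, True), (88, False), (93, True)]
bands = [(40, 59), (60, 79), (80, 99)]

for band, rate in expectancy_table(records, bands).items():
    print(band, rate)
```

Each row reads as "of the people who scored in this band, this proportion went on to succeed", which is the probability an expectancy table reports.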

89

This refers to tables that show how much a test improves hiring success compared to random selection, based on the percentage of hired applicants who succeed.

Taylor-Russell tables

90

This refers to the proportion of applicants hired out of the total applicant pool.

Selection ratio

91

This refers to the tables that show how much a test improves performance compared to random selection, expressed as an average gain in criterion scores.

Naylor-Shine tables

92

This refers to a utility formula that shows the financial or productivity gain from using a selection test.

Brogden-Cronbach-Gleser formula

93

What does the BCG formula measure?

The monetary value of better hires when using a valid test compared to random selection.
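
One textbook form of the BCG formula is utility gain = (N selected)(T)(r_xy)(SD_y)(mean Z of those selected) − (N tested)(C). The sketch below follows that form; the parameter names and every number in the example are assumptions for illustration, not values from the cards.

```python
def bcg_utility_gain(n_selected, tenure_years, validity, sd_performance_dollars,
                     mean_z_selected, n_tested, cost_per_applicant):
    """Dollar gain from selecting with a valid test rather than at random."""
    # Benefit: hires x tenure x validity x dollar SD of performance
    #          x mean standardized test score of the people hired.
    benefit = (n_selected * tenure_years * validity
               * sd_performance_dollars * mean_z_selected)
    # Cost: every applicant tested is paid for, not just the ones hired.
    cost = n_tested * cost_per_applicant
    return benefit - cost


# Hypothetical program: 10 hires from 100 applicants, 2-year average tenure
gain = bcg_utility_gain(n_selected=10, tenure_years=2, validity=0.40,
                        sd_performance_dollars=10_000, mean_z_selected=1.0,
                        n_tested=100, cost_per_applicant=50)
print(round(gain))
```

With these inputs the benefit term is 80,000 and the testing cost is 5,000. If validity were 0, the benefit term would vanish and the test would only cost money.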

94

This refers to the framework for analyzing and guiding choices when outcomes are uncertain.

Decision theory

95

What are the practical considerations in utility analysis?

The pool of job applicants, the job complexity, and the cut score.

96

A predetermined, absolute threshold score that all applicants must meet or exceed.

Fixed cut score

97

A threshold based on the performance of the applicant pool (e.g., top 20% of scores).

Relative cut score
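
The contrast between fixed and relative cut scores can be sketched as follows. The function names, the applicant scores, and the cut of 75 are hypothetical; the relative cut here is taken as the lowest score among the top fraction of the pool.

```python
import math


def passes_fixed_cut(scores, cut_score):
    """Fixed cut: the threshold is set in advance, whatever the pool looks like."""
    return [s for s in scores if s >= cut_score]


def relative_cut_score(scores, top_fraction):
    """Relative cut: the threshold is whatever score the top fraction reaches."""
    n_selected = max(1, math.ceil(len(scores) * top_fraction))
    ranked = sorted(scores, reverse=True)
    return ranked[n_selected - 1]


# Hypothetical applicant pool
scores = [55, 62, 70, 74, 81, 85, 88, 90, 93, 97]

print(passes_fixed_cut(scores, 75))      # everyone at or above 75
print(relative_cut_score(scores, 0.20))  # lowest score in the top 20%
```

A stronger or weaker applicant pool leaves the fixed cut at 75 but moves the relative cut, since the latter depends entirely on who applied.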

98

Applicants must meet minimum scores on all predictors simultaneously.

Multiple cut-off model

99

Applicants must pass each predictor sequentially; failure in one stage means elimination.

Multiple hurdle model

100

What is the difference between the multiple cut-off and multiple hurdle models?

Multiple cut-off model: all minimums at once; multiple hurdle model: step-by-step elimination
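
The two models can be contrasted in a short sketch. The predictor names, minimum scores, and applicant data below are hypothetical, chosen only to show that both models reject the same applicant while the hurdle model stops testing early.

```python
def multiple_cutoff(applicant, minimums):
    """All predictors are administered; every minimum must be met at once."""
    return all(applicant[name] >= minimum for name, minimum in minimums.items())


def multiple_hurdle(applicant, stages):
    """Predictors are administered in order; the first failure ends the process.

    Returns (passed, list of stages actually administered).
    """
    administered = []
    for name, minimum in stages:
        administered.append(name)
        if applicant[name] < minimum:
            return False, administered
    return True, administered


# Hypothetical applicant and minimum scores
applicant = {"aptitude": 80, "interview": 55, "work_sample": 90}
minimums = {"aptitude": 70, "interview": 60, "work_sample": 75}

print(multiple_cutoff(applicant, minimums))
print(multiple_hurdle(applicant, list(minimums.items())))
```

Both models reject this applicant because of the interview score, but under the hurdle model the work sample is never administered, which is the practical saving of step-by-step elimination.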