Test Administration, Scoring, Interpretation and Usage

Last updated 12:19 PM on 6/21/26

112 Terms

New cards

Before the test it's important the examiner…

Select appropriate tools and ensure tools are not made known to the examinee

New cards

This is the form on which the examinee's responses are recorded.

Protocol

New cards

During the test administration, the examiner must first establish __, which is __.

Rapport; a working relationship between the examiner and the examinee

New cards

After the test administration, the examiner must then…

Safeguard the protocol, score and interpret the test accurately, and then write the report

New cards

This refers to the efforts of the examiner to arouse the test taker's interest in the test, elicit cooperation, and encourage them to respond appropriately.

Rapport

New cards

Rapport with the test-takers can influence the result.

True

New cards

What is the difference between accommodation and alternate assessment?

Accommodation: involves making adjustments to the same test for examinees with exceptional needs; Alternate assessment: involves a different method of measurement when the standard test even with adjustments isn't appropriate.

New cards

This effect refers to the steady rise in average IQ scores across generations during the 20th century.

Flynn Effect

New cards

What is the Frog Pond Effect?

The tendency to feel less capable when surrounded by higher-achieving peers, even if one's absolute ability is strong.

New cards

What is bias in testing?

The presence of systematic errors in measuring certain factors.

New cards

What is the difference between culture-free, culture-fair, and culture-loading?

Culture-free: no cultural content involved; culture-fair: bias is reduced, but some cultural influence remains; culture loading: the degree of cultural dependence in a test.

New cards

What is another term for CTT?

True score/classical model of measurement

New cards

A value that, according to CTT, genuinely reflects an individual's ability/trait.

True score

New cards

A component that does not have anything to do with the test taker's ability.

Error
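
The true score and error cards above fit together in the basic CTT decomposition. As a brief sketch in standard notation (the symbols X, T, E, and r_xx are not used in the cards themselves), and assuming the error is random and uncorrelated with the true score:

```latex
% Observed score = true score + error
X = T + E

% Reliability as the share of observed-score variance due to true scores
r_{xx} = \frac{\sigma_T^2}{\sigma_X^2} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}
```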

New cards

This refers to the type of error that is unpredictable.

Random error

New cards

This refers to the type of error that is constant.

Systematic error

New cards

Under which source of error is item/content sampling?

Test construction

New cards

Room temperature, level of lighting, and the amount of ventilation and noise are variables under which test administration error?

Test environment

New cards

Emotional problems, physical discomfort, lack of sleep, effects of drugs, formal learning and casual life experiences, etc. are variables under which test administration error?

Test-taker variables

New cards

Examiner's physical appearance and demeanor, nonverbal gestures, and professionalism are variables under which test administration error?

Examiner-related variables

New cards

How does time sampling affect reliability coefficient?

Longer time intervals between test administrations reduce the reliability coefficient, since more external factors influence scores.

New cards

How do carryover effects differ from practice effects?

Carryover effects: how performance on one test administration influences performance on the next; Practice effects: specifically, how familiarity with the test boosts performance on the next test administration.

New cards

This refers to the technique that helps avoid carryover effects for parallel forms.

Counterbalancing

New cards

Which reliability estimate is the most rigorous and burdensome to establish?

Parallel/alternate forms reliability

New cards

True or False: Lower SEM = higher reliability.

True

New cards

This refers to an index of the amount of inconsistency, or the amount of error expected, in an individual's score.

Standard Error of Measurement
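
The link between SEM and reliability can be made concrete with the usual formula, SEM = SD × sqrt(1 − r_xx). The following is a minimal sketch; the function name and the IQ-style scale with SD = 15 are illustrative assumptions, not part of the cards.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Return SEM = SD * sqrt(1 - r_xx) for a test with the given reliability."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical scale with SD = 15: higher reliability gives a smaller SEM.
print(standard_error_of_measurement(15, 0.91))  # about 4.5
print(standard_error_of_measurement(15, 0.75))  # about 7.5
```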

New cards

If you test a whole class, which tells you how much the class’s average score might wiggle around the true average?

Standard Error of the Mean

New cards

This refers to the range or band of test scores that most likely contains the true score.

Confidence interval
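
One common way to build such a band is observed score ± z × SEM. The sketch below assumes a normal error distribution and centers the band on the observed score; the function name and the example values are hypothetical.

```python
import math
from statistics import NormalDist


def true_score_interval(observed: float, sd: float, reliability: float,
                        confidence: float = 0.95) -> tuple[float, float]:
    """Return (low, high) bounds of observed +/- z * SEM."""
    sem = sd * math.sqrt(1.0 - reliability)
    # Two-sided critical z value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return observed - z * sem, observed + z * sem


# Hypothetical examinee: observed score 100, SD = 15, reliability = .91
low, high = true_score_interval(100, 15, 0.91)
print(round(low, 2), round(high, 2))  # roughly 91.18 and 108.82
```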

New cards

This aids in determining how large a difference between two scores must be before it can be considered statistically significant.

Standard Error of Difference
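
For two tests reported on the same scale, the standard error of the difference is often computed as SD × sqrt(2 − r1 − r2), where r1 and r2 are the two reliability coefficients. This is a small sketch under that assumption; the function name and numbers are illustrative only.

```python
import math


def standard_error_of_difference(sd: float, r1: float, r2: float) -> float:
    """Return SD * sqrt(2 - r1 - r2) for two tests sharing the same SD."""
    return sd * math.sqrt(2.0 - r1 - r2)


# Hypothetical pair of tests with SD = 15 and reliabilities .91 and .84
sed = standard_error_of_difference(15, 0.91, 0.84)
print(sed)         # about 7.5
print(1.96 * sed)  # about 14.7: the smallest difference significant at the .05 level
```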

New cards

This refers to the standard deviation of the differences between predicted and observed values.

Standard Error of Estimate

New cards

This refers to the proportion of people that a test accurately identifies as having the trait.

Hit rate

New cards

This refers to the proportion of people that a test fails to identify as having the trait.

Miss rate

New cards

This refers to the proportion incorrectly identified as having the trait when they don't.

False alarm rate

New cards

This refers to the proportion correctly identified as not having the trait.

Correct rejection rate

New cards

True positive rate is also known as __.

Sensitivity

New cards

True negative rate is also known as __.

Specificity

New cards

This refers to the ability of the test to correctly detect those with the trait (high hit rate, low miss rate).

Sensitivity

New cards

This refers to the ability of the test to correctly exclude those without the trait (high correct rejection, low false alarm).

Specificity

New cards

This refers to the likelihood that someone identified as having the trait truly has it.

Positive Predictive Value (PPV)

New cards

This refers to the likelihood that someone identified as not having the trait truly doesn’t.

Negative Predictive Value (NPV)
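
The rates defined in the preceding cards can all be read off one 2×2 table of test decision against actual status. Here is a minimal sketch that follows the card definitions; the function name, dictionary keys, and counts are made up for illustration.

```python
def screening_rates(tp: int, fn: int, fp: int, tn: int) -> dict[str, float]:
    """Compute the classification rates from counts in a 2x2 decision table."""
    return {
        "sensitivity": tp / (tp + fn),       # hit rate among those with the trait
        "miss_rate": fn / (tp + fn),         # those with the trait who were missed
        "specificity": tn / (tn + fp),       # correct rejection rate
        "false_alarm_rate": fp / (tn + fp),  # flagged despite lacking the trait
        "ppv": tp / (tp + fp),               # flagged cases that truly have the trait
        "npv": tn / (tn + fn),               # cleared cases that truly lack the trait
    }


# Hypothetical screening of 200 people, 50 of whom have the trait
rates = screening_rates(tp=40, fn=10, fp=20, tn=130)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and miss rate sum to 1, as do specificity and false alarm rate, while PPV and NPV also depend on how common the trait is in the group tested.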

New cards

False positive is also known as __.

Type I error

New cards

False negative is also known as __.

Type II error

New cards

What does type I error signify?

You conclude someone has the trait when they actually don’t

New cards

What does type II error signify?

You conclude someone does not have the trait when they actually do

New cards

Type I or Type II: Rejecting the null hypothesis when it's true.

Type I error

New cards

Type I or Type II: Failing to reject the null hypothesis when it's false.

Type II error

New cards

Type I or Type II: Rejecting the alternative hypothesis when it's true.

Type II error

New cards

This refers to the risk you take of rejecting the null hypothesis when it’s actually true.

Alpha

New cards

This refers to the risk of failing to reject the null hypothesis when the alternative hypothesis is actually true.

Beta

New cards

What can be a way to reduce committing type I error?

Lower significance level from 0.05 to 0.01.

New cards

What can be a way to reduce committing type II error?

Increase the sample size to give more statistical power

New cards

What are other ways to reduce committing type I error?

Use corrections like Bonferroni adjustments; improve measurement precision to reduce random noise that triggers false positives.
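
As an illustration of the Bonferroni adjustment mentioned above, the overall alpha is divided by the number of tests, and each p-value is compared with that stricter threshold. The function names and p-values below are hypothetical.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Return the per-test significance level, alpha / number of tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_bonferroni(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Flag which p-values remain significant at the adjusted threshold."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Five hypothetical tests: the per-test threshold becomes .05 / 5 = .01
p_values = [0.003, 0.02, 0.04, 0.011, 0.30]
print(significant_after_bonferroni(p_values))  # only the first test survives
```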

New cards

What are other ways to reduce committing type II error?

Strengthen effect size detection; choose appropriate statistical tests to boost sensitivity; control extraneous variables.

New cards

What is the best way to reduce both type I and II errors?

Increase the sample size

New cards

What happens when we try to reduce type I errors by lowering the alpha?

It increases the risk of type II error.

New cards

How does lowering alpha increase the type II error?

Lowering alpha makes your test more cautious about claiming an effect exists, but that caution can lead to missing real effects — increasing Type II errors.
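
The trade-off can be seen numerically. The sketch below assumes a one-sided, one-sample z-test with a known standardized effect size; that setup is chosen only for simplicity, and the function name and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def type_ii_error_rate(alpha: float, effect_size: float, n: int) -> float:
    """Return beta for a one-sided, one-sample z-test (assumed setup)."""
    std_normal = NormalDist()
    critical_z = std_normal.inv_cdf(1.0 - alpha)
    # Under the alternative, the test statistic is centered at effect_size * sqrt(n).
    shift = effect_size * math.sqrt(n)
    # Beta is the chance that the statistic still falls below the critical value.
    return std_normal.cdf(critical_z - shift)


# Same effect (d = 0.5) and sample size (n = 25), two alpha levels
print(type_ii_error_rate(0.05, 0.5, 25))  # roughly 0.20
print(type_ii_error_rate(0.01, 0.5, 25))  # roughly 0.43, so beta rises as alpha falls
# A larger sample brings beta back down even at the stricter alpha
print(type_ii_error_rate(0.01, 0.5, 50))
```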

New cards

A measurement bias in which people change their behavior simply because they know they are being observed or measured.

Reactivity

New cards

A measurement bias in which raters start off following standardized procedures in scoring but then deviate and move toward their idiosyncratic/personal definition of behavior.

Drift

New cards

This refers to the cognitive bias in which a rater’s evaluation of one person is distorted by comparison with another person’s performance, rather than judged independently.

Contrast effect

New cards

A form of self-fulfilling prophecy in which positive expectations from others lead to improved performance.

Rosenthal effect

New cards

A form of self-fulfilling prophecy in which negative expectations from others lead to decreased performance.

Golem effect

New cards

A form of self-fulfilling prophecy in which positive self-expectations lead to improved performance.

Galatea effect

New cards

What is another term for the Rosenthal effect?

Pygmalion effect

New cards

A rating bias in which a rater consistently gives higher ratings than warranted, being overly generous.

Leniency/generosity error

New cards

A rating bias in which a rater consistently gives lower ratings than warranted, being overly harsh.

Severity/strictness error

New cards

A rating bias in which a rater consistently avoids using extreme scores, clustering evaluations around the middle of the scale.

Central tendency error

New cards

A rating bias in which a rater’s overall positive impression of a person (often based on one good trait) spills over and inflates ratings on unrelated dimensions.

Halo effect

New cards

A rating bias in which a rater’s overall negative impression of a person (often based on one bad trait) spills over and deflates ratings on unrelated dimensions.

Horn effect

New cards

A bias where people over‑attribute others’ behavior to internal traits (personality, character) while underestimating situational factors/context.

Fundamental attribution error

New cards

A bias where people believe vague, general statements about personality are highly accurate and uniquely descriptive of them, even though the statements could apply to almost anyone.

Barnum effect

New cards

What is another term for the Barnum effect?

Aunt Fanny effect

New cards

This refers to a factor in a test that systematically prevents accurate, impartial measurement.

Bias

New cards

__ is a Classical Test Theory (CTT) procedure that adjusts an observed score into an estimate of the examinee’s true score, using the test’s reliability.

Estimated true score transformation
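
The adjustment is usually written as T' = M + r_xx × (X − M): the observed score is pulled toward the group mean in proportion to the test's unreliability. A minimal sketch follows, with a hypothetical function name and example values.

```python
def estimated_true_score(observed: float, group_mean: float, reliability: float) -> float:
    """Return T' = M + r_xx * (X - M), the regression-based true score estimate."""
    return group_mean + reliability * (observed - group_mean)


# Hypothetical examinee scoring 130 on a test with mean 100 and reliability .90
print(estimated_true_score(130, 100, 0.90))  # about 127, closer to the mean than 130
```

With perfect reliability the estimate equals the observed score, and with zero reliability it collapses to the group mean.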

New cards

A response bias where people alter their answers or behaviors to appear more socially acceptable, favorable, or “good” rather than giving truthful responses.

Social desirability

New cards

What occurs in level I of psychological interpretation?

Involves reporting only what is observed or looking at results at face value

New cards

What occurs in level II of psychological interpretation?

Involves looking deeper into why results are as such, interpreting underlying causes or dynamics

New cards

What occurs in level III of psychological interpretation?

Involves applying interpretation to guide intervention or prognosis, using insights to plan action

New cards

This refers to behaviors or responses shown by the examinee during the test session that go beyond the actual test content.

Extra-test behavior

New cards

Which type of interpretation involves reporting test results at face value?

Concrete interpretation

New cards

Which type of interpretation involves applying fixed rules or formulas?

Mechanical interpretation

New cards

Which type of interpretation involves tailoring the meaning of the test results to the unique context of the person?

Individualized interpretation

New cards

What is the Intuition approach in assessment interpretation?

Involves relying on the examiner’s clinical judgment, experience, and “gut feel” to interpret results.

New cards

What is the Authoritative approach in assessment interpretation?

Involves following established manuals, expert opinions, or standardized rules without much personal judgment.

New cards

What is the Empirical/Conceptual approach in assessment interpretation?

Involves basing interpretation on research evidence, theoretical frameworks, and statistical data.

New cards

What is the ultimate goal of a test?

To actually serve a purpose in practice

New cards

What does psychometric soundness entail?

Reliability and validity

New cards

What are the factors that affect utility?

Psychometric soundness, cost, benefits

New cards

This refers to tables that show the probability of success at different levels of test scores/different score ranges.

Expectancy tables

New cards

This refers to tables that show how much a test improves hiring success compared to random selection, based on the percentage of hired applicants who succeed.

Taylor-Russell tables

New cards

This refers to the proportion of applicants hired out of the total applicant pool.

Selection ratio

New cards

This refers to the tables that show how much a test improves performance compared to random selection, expressed as an average gain in criterion scores.

Naylor-Shine tables

New cards

This refers to a utility formula that shows the financial or productivity gain from using a selection test.

Brogden-Cronbach-Gleser

New cards

What does the BCG formula measure?

The monetary value of better hires when using a valid test compared to random selection.
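
As commonly presented in textbooks, the BCG utility gain is (number selected) × (average tenure) × (validity coefficient) × (SD of job performance in money terms) × (mean standardized test score of those selected), minus the total cost of testing. The sketch below follows that form; the parameter names and all figures are hypothetical, and treating the cost as applying to every applicant tested is an assumption.

```python
def bcg_utility_gain(n_selected: int, tenure_years: float, validity: float,
                     sd_performance: float, mean_z_selected: float,
                     n_tested: int, cost_per_applicant: float) -> float:
    """Return the estimated monetary gain from using the test, net of testing costs."""
    benefit = n_selected * tenure_years * validity * sd_performance * mean_z_selected
    cost = n_tested * cost_per_applicant  # assumption: every applicant tested incurs the cost
    return benefit - cost


# Hypothetical case: 10 hires from 100 applicants tested at 50 per applicant
gain = bcg_utility_gain(n_selected=10, tenure_years=2, validity=0.40,
                        sd_performance=10_000, mean_z_selected=1.0,
                        n_tested=100, cost_per_applicant=50)
print(gain)  # about 75000 in the same currency units as sd_performance
```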

New cards

This refers to the framework for analyzing and guiding choices when outcomes are uncertain.

Decision theory

New cards

What are the practical considerations in utility analysis?

The pool of job applicants, the job complexity, and the cut score.

New cards

A predetermined, absolute threshold score that all applicants must meet or exceed.

Fixed cut score

New cards

A threshold based on the performance of the applicant pool (e.g., top 20% of scores).

Relative cut score

New cards

Applicants must meet minimum scores on all predictors simultaneously.

Multiple cut-off model

New cards

Applicants must pass each predictor sequentially; failure at one stage means elimination.

Multiple hurdle model

New cards

What is the difference between the multiple cut-off and multiple hurdle models?

Multiple cut-off model: all minimums at once; multiple hurdle model: step-by-step elimination