Before the test it's important the examiner…
Select appropriate tools and ensure tools are not made known to the examinee
This is the form on which the examinee's responses are recorded.
Protocol
During the test administration, the examiner must first establish __, which is __.
Rapport; working relationship between the examiner + examinee
After the test administration, the examiner must then…
Safeguard the protocol, score and interpret the test accurately, and then write the report
This refers to the efforts of the examiner to arouse the test taker's interest in the test, elicit cooperation, and encourage them to respond appropriately.
Rapport
Rapport with the test-takers can influence the result.
True
What is the difference between accommodation and alternate assessment?
Accommodation: involves making adjustments to the same test for examinees with exceptional needs; Alternate assessment: involves a different method of measurement when the standard test even with adjustments isn't appropriate.
This effect refers to the steady rise in average IQ scores across generations during the 20th century.
Flynn Effect
What is the Frog Pond Effect?
The tendency to feel less capable when surrounded by higher-achieving peers, even if one's absolute ability is strong.
What is bias in testing?
The presence of systematic errors in measuring certain factors.
What is the difference between culture-free, culture-fair, and culture-loading?
Culture-free: no culture involved; culture-fair: reduced bias, but some culture remains; culture-loading: degree of cultural dependence in a test.
What is another term for CTT?
True score/classical model of measurement
A value that, according to CTT, genuinely reflects an individual's ability/trait.
True score
A component that does not have anything to do with the test taker's ability.
Error
This refers to the type of error that is unpredictable.
Random error
This refers to the type of error that is constant.
Systematic error
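As a minimal sketch of the classical model (observed score = true score + error), the Python snippet below contrasts the two error types. The true score of 100, the error values, and the function name are made-up illustrations, not values from these cards.

```python
from statistics import mean

def observed_scores(true_score, errors):
    """CTT: each observed score X is the true score T plus an error E."""
    return [true_score + e for e in errors]

TRUE_SCORE = 100  # hypothetical true score

# Random error: unpredictable, scatters in both directions (sums to 0 here).
random_errors = [-3, 2, 1, -2, 2]
# Systematic error: a constant shift applied to every administration.
systematic_errors = [4, 4, 4, 4, 4]

random_mean = mean(observed_scores(TRUE_SCORE, random_errors))
systematic_mean = mean(observed_scores(TRUE_SCORE, systematic_errors))

print(random_mean)      # random errors cancel out on average
print(systematic_mean)  # the constant bias remains in the average
```

In this toy data the random errors cancel across repeated administrations, while the constant error shifts every score by the same amount and never averages out.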
Under which source of error is item/content sampling?
Test construction
Room temperature, level of lighting, and the amount of ventilation and noise are variables under which test administration error?
Test environment
Emotional problems, physical discomfort, lack of sleep, effects of drugs, formal learning and casual life experiences, etc. are variables under which test administration error?
Test-taker variables
Examiner's physical appearance and demeanor, nonverbal gestures, and professionalism are variables under which test administration error?
Examiner-related variables
How does time sampling affect reliability coefficient?
Longer time intervals between test administrations reduce the reliability coefficient, since more external factors influence scores.
How do carryover effects differ from practice effects?
Carryover effects: how performance on one test influences performance on the next test administration; Practice effects: specifically, how familiarity with the test boosts performance on the next test administration.
This refers to the technique that helps avoid carryover effects for parallel forms.
Counterbalancing
Which reliability estimate is the most rigorous and burdensome to establish?
Parallel/alternate forms reliability
True or False: Lower SEM = higher reliability.
True
This refers to an index of the amount of inconsistency, or the amount of expected error, in an individual's observed score.
Standard Error of Measurement
If you test a whole class, which tells you how much the class’s average score might wiggle around the true average?
Standard Error of the Mean
This refers to the range or band of test scores that most likely contains the true score.
Confidence interval
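The link between reliability, the SEM, and the confidence interval can be sketched in Python using the common textbook formula SEM = SD × √(1 − reliability). The SD of 15, reliability of .91, observed score of 110, and function names are illustrative assumptions.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1 - reliability)

def confidence_interval(observed, sem, z=1.96):
    """Band around the observed score; z=1.96 gives roughly 95% confidence."""
    return (observed - z * sem, observed + z * sem)

sem = standard_error_of_measurement(sd=15, reliability=0.91)
low, high = confidence_interval(observed=110, sem=sem)

print(f"SEM = {sem:.2f}")
print(f"95% CI = {low:.2f} to {high:.2f}")

# Higher reliability shrinks the SEM, which narrows the band.
sem_more_reliable = standard_error_of_measurement(sd=15, reliability=0.96)
print(f"SEM at r = .96: {sem_more_reliable:.2f}")
```

With these assumed numbers the SEM is about 4.5 points, so the 95% band runs roughly from 101 to 119.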
This aids in determining how large a difference between two scores must be before it can be considered statistically significant.
Standard Error of Difference
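A minimal sketch of how this is typically applied, assuming the common formula SED = √(SEM₁² + SEM₂²) and a 95% criterion (z = 1.96). The SEM values, scores, and function names are hypothetical.

```python
import math

def standard_error_of_difference(sem_1, sem_2):
    """SED combines the measurement error of both scores being compared."""
    return math.sqrt(sem_1 ** 2 + sem_2 ** 2)

def is_significant_difference(score_1, score_2, sed, z=1.96):
    """True if the gap between two scores exceeds z * SED."""
    return abs(score_1 - score_2) > z * sed

sed = standard_error_of_difference(sem_1=3.0, sem_2=4.0)
print(sed)
print(is_significant_difference(110, 98, sed))   # gap of 12 points
print(is_significant_difference(110, 104, sed))  # gap of 6 points
```

With an SED of 5, two scores must differ by more than about 9.8 points before the difference is treated as significant at this level.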
This refers to the typical size of the difference (error) between predicted and observed values.
Standard Error of the Estimate
This refers to the proportion of people that a test accurately identifies as having the trait.
Hit rate
This refers to the proportion of people that a test fails to identify as having the trait.
Miss rate
This refers to the proportion incorrectly identified as having the trait when they don't.
False alarm rate
This refers to the proportion correctly identified as not having the trait.
Correct rejection rate
True positive rate is also known as __.
Sensitivity
True negative rate is also known as __.
Specificity
This refers to the ability of the test to correctly detect those with the trait (high hit rate, low miss rate).
Sensitivity
This refers to the ability of the test to correctly exclude those without the trait (high correct rejection, low false alarm).
Specificity
This refers to the likelihood that someone identified as having the trait truly has it.
Positive Predictive Value (PPV)
This refers to the likelihood that someone identified as not having the trait truly doesn’t.
Negative Predictive Value (NPV)
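The rates above can all be computed from one 2×2 decision table. The sketch below assumes the convention in which hit rate equals sensitivity, TP / (TP + FN); the counts and the function name are made up for illustration.

```python
def screening_metrics(tp, fn, fp, tn):
    """Return common accuracy indices from a 2x2 decision table."""
    return {
        "sensitivity": tp / (tp + fn),       # hit rate
        "miss_rate": fn / (tp + fn),
        "specificity": tn / (tn + fp),       # correct rejection rate
        "false_alarm_rate": fp / (tn + fp),
        "ppv": tp / (tp + fp),               # of those flagged, how many have the trait
        "npv": tn / (tn + fn),               # of those cleared, how many lack the trait
    }

# Hypothetical counts: 50 people with the trait, 150 without.
metrics = screening_metrics(tp=45, fn=5, fp=15, tn=135)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and miss rate sum to 1, as do specificity and false alarm rate, while PPV and NPV depend on how common the trait is in the group tested.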
False positive is also known as __.
Type I error
False negative is also known as __.
Type II error
What does type I error signify?
You conclude someone has the trait when they actually don’t
What does type II error signify?
You conclude someone does not have the trait when they actually do
Type I or Type II: Rejecting the null hypothesis when it's true.
Type I error
Type I or Type II: Failing to reject the null hypothesis when it's false.
Type II error
Type I or Type II: Rejecting the alternative hypothesis when it's true
Type II error
This refers to the risk you take of rejecting the null hypothesis when it’s actually true.
Alpha
This refers to the risk of failing to reject the null hypothesis when the alternative hypothesis is actually true.
Beta
What can be a way to reduce committing type I error?
Lower significance level from 0.05 to 0.01.
What can be a way to reduce committing type II error?
Increase the sample size to give more statistical power
What are other ways to reduce committing type I error?
Use corrections like Bonferroni adjustments; improve measurement precision to reduce random noise that triggers false positives.
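A minimal sketch of a Bonferroni adjustment, which divides alpha by the number of tests performed. The p-values and the function name are hypothetical.

```python
def bonferroni_rejections(p_values, alpha=0.05):
    """Reject H0 only where p falls below alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return threshold, [p < threshold for p in p_values]

p_values = [0.003, 0.02, 0.04, 0.20]  # made-up results from four tests

uncorrected = [p < 0.05 for p in p_values]
threshold, corrected = bonferroni_rejections(p_values, alpha=0.05)

print(threshold)
print(sum(uncorrected), "rejections without correction")
print(sum(corrected), "rejection(s) with Bonferroni")
```

The stricter per-test threshold guards against false positives across multiple comparisons, at the cost of making real effects harder to detect.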
What are other ways to reduce committing type II error?
Design the study to detect larger effect sizes; choose appropriate statistical tests to boost sensitivity; control extraneous variables.
What is the best way to reduce both type I and II errors?
Increase the sample size
What happens when we try to reduce type I errors by lowering the alpha?
It increases the risk of type II error.
How does lowering alpha increase the type II error?
Lowering alpha makes your test more cautious about claiming an effect exists, but that caution can lead to missing real effects — increasing Type II errors.
A measurement bias in which people change their behavior simply because they know they are being observed or measured.
Reactivity
A measurement bias in which raters start off following standardized scoring procedures but then deviate and move toward their own idiosyncratic/personal definition of the behavior.
Drift
This refers to the cognitive bias in which a rater’s evaluation of one person is distorted by comparison with another person’s performance, rather than judged independently.
Contrast effect
A form of self-fulfilling prophecy in which positive expectations from others lead to improved performance.
Rosenthal effect
A form of self-fulfilling prophecy in which negative expectations from others lead to decreased performance.
Golem effect
A form of self-fulfilling prophecy in which positive self-expectations lead to improved performance.
Galatea effect
What is another term for the Rosenthal effect?
Pygmalion effect
A rating bias in which a rater consistently gives higher ratings than warranted, being overly generous.
Leniency/generosity error
A rating bias in which a rater consistently gives lower ratings than warranted, being overly harsh.
Severity/strictness error
A rating bias in which a rater consistently avoids using extreme scores, clustering evaluations around the middle of the scale.
Central tendency error
A rating bias in which a rater’s overall positive impression of a person (often based on one good trait) spills over and inflates ratings on unrelated dimensions.
Halo effect
A rating bias in which a rater’s overall negative impression of a person (often based on one bad trait) spills over and deflates ratings on unrelated dimensions.
Horn effect
A bias where people over‑attribute others’ behavior to internal traits (personality, character) while underestimating situational factors/context.
Fundamental attribution error
A bias where people believe vague, general statements about personality are highly accurate and uniquely descriptive of them, even though the statements could apply to almost anyone.
Barnum effect
What is another term for the Barnum effect?
Aunt Fanny effect
This refers to a factor in a test that systematically prevents accurate, impartial measurement.
Bias
__ is a Classical Test Theory (CTT) procedure that adjusts an observed score into an estimate of the examinee’s true score, using the test’s reliability.
Estimated true score transformation
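One common textbook form of this transformation is T' = M + r(X − M), where M is the group mean, r the reliability coefficient, and X the observed score. The sketch below assumes that form; the numbers and the function name are illustrative.

```python
def estimated_true_score(observed, group_mean, reliability):
    """Pull the observed score toward the group mean in proportion to unreliability."""
    return group_mean + reliability * (observed - group_mean)

estimate = estimated_true_score(observed=120, group_mean=100, reliability=0.9)
print(estimate)

# With perfect reliability the estimate equals the observed score;
# with zero reliability it collapses to the group mean.
print(estimated_true_score(120, 100, 1.0))
print(estimated_true_score(120, 100, 0.0))
```

The lower the reliability, the more the estimate regresses toward the mean.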
A response bias where people alter their answers or behaviors to appear more socially acceptable, favorable, or “good” rather than giving truthful responses.
Social desirability
What occurs in level I of psychological interpretation?
Involves reporting only what is observed or looking at results at face value
What occurs in level II of psychological interpretation?
Involves looking deeper into why results are as such, interpreting underlying causes or dynamics
What occurs in level III of psychological interpretation?
Involves applying interpretation to guide intervention or prognosis, using insights to plan action
This refers to behaviors or responses shown by the examinee during the test session that go beyond the actual test content.
Extra-test behavior
Which type of interpretation involves reporting test results at face value?
Concrete interpretation
Which type of interpretation involves applying fixed rules or formulas?
Mechanical interpretation
Which type of interpretation involves tailoring the meaning of the test results to the unique context of the person?
Individualized interpretation
What is the Intuition approach in assessment interpretation?
Involves relying on the examiner’s clinical judgment, experience, and “gut feel” to interpret results.
What is the Authoritative approach in assessment interpretation?
Involves following established manuals, expert opinions, or standardized rules without much personal judgment.
What is the Empirical/Conceptual approach in assessment interpretation?
Involves basing interpretation on research evidence, theoretical frameworks, and statistical data.
What is the ultimate goal of a test?
To actually serve a purpose in practice
What does psychometric soundness entail?
Reliability and validity
What are the factors that affect utility?
Psychometric soundness, costs, benefits
This refers to tables that show the probability of success at different levels of test scores/different score ranges.
Expectancy tables
This refers to tables that show how much a test improves hiring success compared to random selection, based on the percentage of hired applicants who succeed.
Taylor-Russell tables
This refers to the proportion of applicants hired out of the total applicant pool.
Selection ratio
This refers to the tables that show how much a test improves performance compared to random selection, expressed as an average gain in criterion scores.
Naylor-Shine tables
This refers to a utility formula that shows the financial or productivity gain from using a selection test.
Brogden-Cronbach-Gleser formula
What does the BCG formula measure?
The monetary value of better hires when using a valid test compared to random selection.
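As a rough sketch, the BCG utility gain is often presented in textbooks as (number selected)(average tenure)(validity coefficient)(SD of job performance in money terms)(mean standardized test score of those selected) minus (number tested)(cost per applicant). The code assumes that form; all parameter names and values are hypothetical.

```python
def bcg_utility_gain(n_selected, tenure_years, validity, sd_performance,
                     mean_z_selected, n_tested, cost_per_applicant):
    """Estimated monetary gain from selecting with a valid test vs. at random."""
    benefit = (n_selected * tenure_years * validity
               * sd_performance * mean_z_selected)
    cost = n_tested * cost_per_applicant
    return benefit - cost

gain = bcg_utility_gain(
    n_selected=10,          # hires per year
    tenure_years=2,         # average time hires stay
    validity=0.4,           # test-criterion correlation
    sd_performance=10_000,  # SD of performance in currency units
    mean_z_selected=1.0,    # average z-score on the test among hires
    n_tested=50,            # applicants who took the test
    cost_per_applicant=100,
)
print(gain)
```

Under these assumed inputs the benefit term is 80,000 and testing costs 5,000, leaving a net gain of 75,000. If the validity coefficient is 0, the benefit term vanishes and only the testing cost remains.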
This refers to the framework for analyzing and guiding choices when outcomes are uncertain.
Decision theory
What are the practical considerations in utility analysis?
The pool of job applicants, the job complexity, and the cut score.
A predetermined, absolute threshold score that all applicants must meet or exceed.
Fixed cut score
A threshold based on the performance of the applicant pool (e.g., top 20% of scores).
Relative cut score
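The two kinds of cut score can be contrasted with a small Python sketch. The applicant pool, the fixed cut of 70, and the function names are made-up assumptions.

```python
def passes_fixed_cut(score, cut_score):
    """Fixed cut: the threshold is set in advance, regardless of the pool."""
    return score >= cut_score

def relative_cut_score(scores, top_percent):
    """Relative cut: the threshold is whatever score marks the top N percent."""
    n_selected = max(1, len(scores) * top_percent // 100)
    return sorted(scores, reverse=True)[n_selected - 1]

pool = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

fixed_cut = 70
relative_cut = relative_cut_score(pool, top_percent=20)

passed_fixed = [s for s in pool if passes_fixed_cut(s, fixed_cut)]
passed_relative = [s for s in pool if s >= relative_cut]

print(passed_fixed)
print(relative_cut, passed_relative)
```

A fixed cut can pass any number of applicants, whereas a relative cut moves with the pool: a stronger pool raises the threshold.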
Applicants must meet minimum scores on all predictors simultaneously.
Multiple cut-off model
Applicants must pass each predictor sequentially; failure at one stage means elimination.
Multiple hurdle model
What is the difference between multiple cut-off and multiple hurdle model?
Multiple cut-off model: all minimums at once; multiple hurdle model: step-by-step elimination
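The difference between the two models can be sketched as follows. The predictors, minimum scores, and function names are hypothetical.

```python
def multiple_cutoff(scores, minimums):
    """All predictors are checked at once; every minimum must be met."""
    return all(scores[name] >= cut for name, cut in minimums.items())

def multiple_hurdle(get_score, hurdles):
    """Predictors are taken in order; the process stops at the first failure.

    Returns (passed, stages_taken).
    """
    stages_taken = []
    for name, cut in hurdles:
        stages_taken.append(name)
        if get_score(name) < cut:
            return False, stages_taken
    return True, stages_taken

applicant = {"aptitude": 82, "interview": 55, "work_sample": 90}
minimums = {"aptitude": 70, "interview": 60, "work_sample": 75}

cutoff_result = multiple_cutoff(applicant, minimums)
hurdle_result = multiple_hurdle(applicant.get, list(minimums.items()))

print(cutoff_result)
print(hurdle_result)
```

Both models reject this applicant, but under the hurdle model the work sample is never administered because elimination happens at the interview stage.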