Cost efficiency, Time required, How one valid and reliable test compares with another valid and reliable test, How useful a test is for diagnosing, treating, or classifying patients, How an admissions test can “trim down” the number of applicants, Whether adding another test to the battery will help, Having a test versus not having a test
Culture-Fair Test
Scores are unaffected by cultural differences.
Culture-Bound Test
A test that is specific to a particular language or culture.
Test Utility
The usefulness or practical value of testing to improve efficiency.
A test being both reliable and valid does not necessarily mean or ensure that it is useful.
Primarily designed to assess clinical conditions (e.g., depression, schizophrenia, paranoia, psychopathic deviate, hypomania, etc.).
Not effective for evaluating non-clinical or normal populations.
Specifically useful for clinical diagnosis rather than general personality assessment.
567 items
Measures five major personality domains, often referred to as OCEAN.
In certain contexts, this test can be more useful than the MMPI, especially when assessing normal populations or when the focus is on general personality traits rather than clinical conditions.
240 items
Psychometric Soundness, Costs, Benefits
Psychometric Soundness
The reliability and validity of a test (i.e., reliability and validity coefficients are acceptably high).
Higher than .95 reliability = questionable
Index of Utility
The practical value of the information derived from scores on a test (considering both reliability and validity).
Convergent Validity
This is when scores from one test positively correlate with scores from another similar test. Excessively high ___________ can be concerning, as it suggests that the new test offers no additional insights or unique information, which reduces its utility.
Normally, a ______ test is most likely going to be useful. But there are other factors that must be considered in determining a test’s utility.
Disadvantages, losses, or expenses in both economic and non-economic terms.
The usual meaning, of course, is economic.
Profits, gains, or advantages derived from the use of a particular test.
While testing can have some cost to the company, the economic benefits can be tremendous, in terms of:
Increase in quantity and quality of worker performance;
Decrease in competency gaps (requiring training), accidents, and employee turnover.
Use of Expectancy Data, Use of Brogden-Cronbach-Gleser Formula, Decision Theory and Test Utility, Some Practical Considerations
4 ways to conduct Utility Analysis
Expectancy Table
Provides an indication of the likelihood that a test-taker will score within some interval of scores on a criterion measure — “passing,” “acceptable,” or “failing.”
Can provide information helpful to decision makers.
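As a rough illustration of how an expectancy table can be built, the sketch below groups test scores into bands and reports the proportion of test-takers in each band who fell into each criterion category. The function name, the score bands, and all data are hypothetical and chosen only for the example.

```python
from collections import defaultdict

def expectancy_table(scores, outcomes, bands):
    """For each (low, high) score band, return the proportion of
    test-takers who landed in each criterion category."""
    counts = {band: defaultdict(int) for band in bands}
    for score, outcome in zip(scores, outcomes):
        for low, high in bands:
            if low <= score <= high:
                counts[(low, high)][outcome] += 1
                break
    table = {}
    for band, by_outcome in counts.items():
        total = sum(by_outcome.values())
        # An empty band has no proportions to report.
        table[band] = {o: n / total for o, n in by_outcome.items()} if total else {}
    return table

# Hypothetical test scores and later criterion ratings for eight people.
scores = [12, 15, 22, 25, 28, 33, 35, 38]
outcomes = ["failing", "failing", "acceptable", "failing",
            "acceptable", "passing", "passing", "acceptable"]
bands = [(10, 19), (20, 29), (30, 39)]

table = expectancy_table(scores, outcomes, bands)
for band, proportions in table.items():
    print(band, proportions)
```

Read row by row, the table gives a decision maker the likelihood of each criterion outcome for someone scoring in that band.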
Taylor-Russell Table
Typically used to help decide if a test is worth using for hiring employees based on how well it predicts success.
Brogden-Cronbach-Gleser Formula
Used to calculate the dollar amount of a utility gain resulting from the use of a particular selection instrument under specified conditions.
Utility Gain
Refers to an estimate of the benefit (monetary or otherwise) of using a particular test or selection method.
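A minimal sketch of the utility gain calculation, assuming the commonly cited form of the Brogden-Cronbach-Gleser formula: utility gain = (N)(T)(r_xy)(SD_y)(Z_m) - (N)(C). The parameter names are hypothetical, the figures are invented, and the sketch keeps the number of people selected separate from the number of people tested, which is an assumption about how the cost term is counted.

```python
def utility_gain(n_selected, tenure_years, validity, sd_performance_dollars,
                 mean_z_selected, n_tested, cost_per_applicant):
    """Estimated dollar gain from using a selection test.

    Benefit term: N * T * r_xy * SD_y * Z_m
    Cost term:    number tested * cost of testing each applicant
    """
    benefit = (n_selected * tenure_years * validity
               * sd_performance_dollars * mean_z_selected)
    cost = n_tested * cost_per_applicant
    return benefit - cost

# Hypothetical scenario: 10 hires who stay 2 years on average, a validity
# coefficient of .40, a performance SD worth $10,000, a mean standardized
# test score of 1.0 among those hired, and 50 applicants tested at $100 each.
gain = utility_gain(10, 2, 0.40, 10_000, 1.0, 50, 100)
print(gain)
```

Because the benefit term is a product, a validity coefficient near zero wipes out the gain no matter how large the other terms are.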
Decision Theory
Recommended by Cronbach and Gleser as a way to determine test utility.
To illustrate this, we need to recall five terms: base rate, hit rate, miss rate, false positive, and false negative.
Base Rate
The extent to which a particular trait, behavior, characteristic, or attribute exists in the population (expressed as a proportion).
Hit Rate
The proportion of people a test accurately identifies as possessing or exhibiting a particular trait, behavior, characteristic, or attribute.
Miss Rate
The proportion of people the test fails to identify as having, or not having, a particular characteristic or attribute.
False Positive
A miss wherein the test predicted that the test-taker did possess the characteristic or attribute being measured when in fact the test-taker did not.
False Negative
A miss wherein the test predicted that the test-taker did not possess the characteristic or attribute being measured when the test-taker actually did.
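The five terms can be made concrete with a small tally. The sketch below compares a test's yes/no predictions with the actual status of each person. It counts any correct classification as a hit so that the hit rate and miss rate sum to 1; that reading, the function name, and the data are assumptions made for illustration. The false positive and false negative figures are expressed as shares of all test-takers.

```python
def decision_rates(predicted, actual):
    """Tally classification outcomes for paired True/False lists."""
    pairs = list(zip(predicted, actual))
    n = len(pairs)
    hits = sum(p == a for p, a in pairs)
    false_pos = sum(p and not a for p, a in pairs)  # predicted yes, actually no
    false_neg = sum(a and not p for p, a in pairs)  # predicted no, actually yes
    return {
        "base_rate": sum(actual) / n,
        "hit_rate": hits / n,
        "miss_rate": (false_pos + false_neg) / n,
        "false_positive_share": false_pos / n,
        "false_negative_share": false_neg / n,
    }

# Hypothetical results for ten test-takers.
predicted = [True, True, True, False, False, False, True, False, False, False]
actual    = [True, True, False, False, False, True, True, False, False, False]

rates = decision_rates(predicted, actual)
print(rates)
```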
perfect predictors
Tests are often “assumed” to be __________ of future performance.
That is, those who score above the cut-off score are expected to be successful on the job, and those who do not meet the cut-off score are predicted to be unsuccessful.
Decision Theory
____________ provides guidelines for setting optimal cut-off scores.
In certain professions, such as airline pilot or surgeon, false negatives are preferable to false positives: wrongly rejecting a qualified applicant is far less costly than wrongly accepting an unqualified one.
The Pool of Job Applicants, The Complexity of the Job, The Cut-off Score Used
3 Practical Considerations in Test Utility
The Pool of Job Applicants
Utility estimates assume that there is a steady supply of viable applicants to occupy the positions at stake.
Some professions have few qualified applicants, and those who are qualified may not accept an offer.
The Complexity of the Job
There are disagreements among experts as to whether it is appropriate to apply the same utility models to jobs of varying complexity (i.e., a highly complex job may have more stringent standards of successful performance).
The Cut-off Score Used
A (usually numerical) reference point derived as a result of a judgment and used to divide a set of data into two or more classifications, with some action to be taken or some inference to be made on the basis of these classifications.
Can be a Relative Cut Score or a Fixed Cut Score.
Relative Cut Score
A reference point—in a distribution of test scores used to divide a set of data into two or more classifications—that is set based on norm-related considerations rather than on the relationship of test scores to a criterion.
Because this type of _______ is set with reference to the performance of a group (or some target segment of a group), it is also referred to as a norm-referenced cut score.
This is set with reference to the performance of a group (or some target segment of a group).
Fixed Cut Score
A reference point—in a distribution of test scores used to divide a set of data into two or more classifications—that is typically set with reference to a judgment concerning a minimum level of proficiency required to be included in a particular classification.
May also be referred to as absolute cut scores.
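The contrast between the two kinds of cut score can be sketched in a few lines. In this illustration the fixed cut is a minimum score decided in advance, while the relative cut is whatever score marks off the top fraction of the group that happened to take the test. The function names, the 75-point minimum, the 20% fraction, and the scores are all hypothetical.

```python
def fixed_cut(scores, minimum):
    """Pass everyone at or above a preset minimum proficiency score."""
    return [s for s in scores if s >= minimum]

def relative_cut(scores, top_fraction):
    """Pass the top fraction of this group; the cut depends on the group."""
    n_pass = max(1, round(len(scores) * top_fraction))
    cut = sorted(scores, reverse=True)[n_pass - 1]
    return cut, [s for s in scores if s >= cut]

# Hypothetical scores for ten test-takers.
scores = [55, 62, 70, 71, 78, 83, 88, 90, 94, 97]

passed_fixed = fixed_cut(scores, minimum=75)
cut, passed_relative = relative_cut(scores, top_fraction=0.2)
print(passed_fixed)
print(cut, passed_relative)
```

With a different group of test-takers the fixed cut stays at 75, but the relative cut moves with the group's performance.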
Multiple Cut Scores
Refer to the use of two or more cut scores with reference to one predictor for the purpose of categorizing test-takers.
For example, different cut scores are set to be equivalent to ratings of A, B, C, D, etc.
Multiple Hurdles
The achievement of one cut-off score is necessary to proceed to the next stage in the evaluation process.
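A multiple-hurdle process can be sketched as a loop that stops at the first stage where the applicant falls short. The stage names, cut-off values, and applicant scores below are hypothetical.

```python
def multiple_hurdles(applicant_scores, hurdles):
    """Walk through the stages in order.

    Returns (True, None) if every cut-off is met, otherwise
    (False, name_of_the_stage_that_was_failed).
    """
    for stage, cut_off in hurdles:
        # A missing score is treated as not clearing the hurdle.
        if applicant_scores.get(stage, float("-inf")) < cut_off:
            return False, stage
    return True, None

# Hypothetical three-stage selection process.
hurdles = [("written_exam", 70), ("interview", 3), ("work_sample", 80)]

applicant_a = {"written_exam": 85, "interview": 4, "work_sample": 90}
applicant_b = {"written_exam": 85, "interview": 2, "work_sample": 95}

result_a = multiple_hurdles(applicant_a, hurdles)
result_b = multiple_hurdles(applicant_b, hurdles)
print(result_a)
print(result_b)
```

Applicant B's strong work sample never counts, because the interview hurdle is not cleared first.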
Angoff Method, Known Groups Method
2 methods of Setting Cut Scores
Angoff Method
A way to set fixed cut scores that entails averaging the judgments of experts.
It determines how often a minimally qualified performer would answer a test item correctly.
A panel of experts is chosen to review test items and estimate the probability that a minimally qualified performer would answer the item correctly.
This simple technique has wide appeal, and works well—that is, as long as the experts agree.
Its weakness shows when inter-rater reliability is low and experts disagree sharply about how certain populations of test-takers should respond to items.
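A minimal sketch of the Angoff procedure, assuming the common practice of averaging the experts' probability estimates for each item and summing those averages to get the cut score. The ratings are invented, and the per-item spread (highest minus lowest estimate) is only a crude, hypothetical flag for expert disagreement, not a formal inter-rater reliability statistic.

```python
from statistics import mean

def angoff_cut_score(ratings):
    """ratings[j][i] is judge j's estimated probability that a minimally
    qualified test-taker answers item i correctly."""
    n_items = len(ratings[0])
    item_means = [mean(judge[i] for judge in ratings) for i in range(n_items)]
    item_spreads = [max(judge[i] for judge in ratings)
                    - min(judge[i] for judge in ratings)
                    for i in range(n_items)]
    # Expected number of items a minimally qualified person gets right.
    return sum(item_means), item_means, item_spreads

# Hypothetical estimates from three experts on a four-item test.
ratings = [
    [0.9, 0.6, 0.5, 0.8],
    [0.8, 0.7, 0.4, 0.9],
    [1.0, 0.5, 0.6, 0.7],
]

cut_score, item_means, item_spreads = angoff_cut_score(ratings)
print(round(cut_score, 2), [round(s, 2) for s in item_spreads])
```

Large spreads on many items would suggest the experts disagree, which is the situation in which the averaged cut score is least trustworthy.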
Known Groups Method
A method of collecting data on a predictor of interest from groups known to possess (and not to possess) a trait, attribute, or ability of interest.
Known Groups Method
The main problem with using this method is that determining the cutoff score is inherently affected by the composition of the contrasting groups.
Based on data analysis, a cut score is set on the test that best discriminates the two groups’ test performance.
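One simple way to find a cut score that discriminates two known groups is to try every observed score as a candidate and keep the one that classifies the most people correctly. This is a sketch under stated assumptions: higher scores are assumed to indicate the trait, overall classification accuracy is the chosen criterion, ties go to the lowest candidate, and the data are invented.

```python
def known_groups_cut(has_trait_scores, lacks_trait_scores):
    """Return (cut, accuracy) for the candidate cut that best separates
    the group known to have the trait from the group known to lack it."""
    candidates = sorted(set(has_trait_scores) | set(lacks_trait_scores))
    total = len(has_trait_scores) + len(lacks_trait_scores)
    best_cut, best_correct = None, -1
    for cut in candidates:
        correct = (sum(s >= cut for s in has_trait_scores)
                   + sum(s < cut for s in lacks_trait_scores))
        if correct > best_correct:  # strict '>' keeps the lowest tied cut
            best_cut, best_correct = cut, correct
    return best_cut, best_correct / total

# Hypothetical scores from two contrasting groups.
has_trait = [14, 16, 18, 19, 21, 23]
lacks_trait = [6, 8, 9, 11, 13, 15]

cut, accuracy = known_groups_cut(has_trait, lacks_trait)
print(cut, round(accuracy, 3))
```

Swapping in differently composed groups changes the candidate scores and can move the cut, which is the dependence on group composition noted above.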
48 items
50 items; 4 sub-sets
40 items each subset
240 items
