utility
refers to the usefulness or practical value of testing to improve efficiency; the term is also used to refer to the usefulness or practical value of a training program or intervention.
psychometric soundness
costs
benefits
factors that affect a test’s utility
reliability and validity
By psychometric soundness, we refer—as you probably know by now—to the — and — of a test
reliability
index of — can tell us something about how consistently a test measures what it measures;
validity
index of — can tell us something about whether a test measures what it purports to measure.
index of utility
— can tell us something about the practical value of the information derived from scores on the test.
Test scores are said to have — if their use in a particular situation helps us to make better decisions; better, that is, than the decisions that would be made without them.
reliability
it was noted that — sets a ceiling on validity
validity
It is tempting to draw the conclusion that a comparable relationship exists between — and utility
and conclude that “— sets a ceiling on utility.”
After all, a test must be — to be useful
criterion-related validity
Generally speaking, the higher the — of test scores for making a particular decision, the higher the utility of the test is likely to be
costs
refers to disadvantages, losses, or expenses in both economic and noneconomic terms
in the traditional, economic sense; that is, relating to expenditures associated with testing or not testing. If testing is to be conducted, then it may be necessary to allocate funds to purchase (1) a particular test, (2) a supply of blank test protocols, and (3) computerized test processing, scoring, and interpretation from the test publisher or some independent service.
benefit
refers to profits, gains, or advantages
From an economic perspective, the cost of administering tests can be minuscule when compared to the economic — or financial returns in dollars and cents—that a successful testing program can yield.
There are also noneconomic — to be derived from thoughtfully designed and well-run testing programs, such as:
■ an increase in the quality of workers’ performance;
■ an increase in the quantity of workers’ performance
utility analysis
may be broadly defined as a family of techniques that entail a cost–benefit analysis designed to yield information relevant to a decision about the usefulness and/or practical value of a tool of assessment.
is an umbrella term covering various possible methods, each requiring different kinds of input data and yielding different kinds of output
expectancy data
decision theory
two general approaches to utility analysis
expectancy table
An — can provide an indication of the likelihood that a testtaker will score within some interval of scores on a criterion measure—an interval that may be categorized as “passing,” “acceptable,” or “failing.”
might indicate, for example, that the higher a worker’s score is on this new test, the greater the probability that the worker will be judged successful.
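An expectancy table of this kind can be sketched in code: bucket paired (test score, criterion outcome) records into score intervals and report the proportion judged successful in each interval. The scores, outcomes, and intervals below are invented for illustration.

```python
# Hypothetical sketch of an expectancy table: for each test-score interval,
# the proportion of workers later judged "successful" (1) on the criterion.
# All data below are invented for illustration.

def expectancy_table(records, bins):
    """records: list of (test_score, successful) pairs, successful in {0, 1}.
    bins: list of (low, high) score intervals, inclusive."""
    table = {}
    for low, high in bins:
        group = [success for score, success in records if low <= score <= high]
        table[(low, high)] = sum(group) / len(group) if group else None
    return table

records = [(55, 0), (62, 0), (59, 0), (68, 0), (74, 1), (77, 1), (81, 1), (88, 1)]
bins = [(50, 64), (65, 79), (80, 100)]
print(expectancy_table(records, bins))
# Higher score intervals show a higher proportion judged successful.
```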
hit
A correct classification
miss
An incorrect classification; a mistake
hit rate
The proportion of people that an assessment tool accurately identifies as possessing or exhibiting a particular trait, ability, behavior, or attribute
miss rate
The proportion of people that an assessment tool inaccurately identifies as possessing or exhibiting a particular trait, ability, behavior, or attribute
false positive
A specific type of miss whereby an assessment tool falsely indicates that the testtaker possesses or exhibits a particular trait, ability, behavior, or attribute
false negative
A specific type of miss whereby an assessment tool falsely indicates that the testtaker does not possess or exhibit a particular trait, ability, behavior, or attribute
Taylor-Russell tables
— provide an estimate of the extent to which inclusion of a particular test in the selection system will improve selection.
the tables provide an estimate of the percentage of employees hired by the use of a particular test who will be successful at their jobs
determining the increase over current procedures
test’s validity
selection ratio used
base rate
Taylor-Russell tables provide an estimate of the percentage of employees hired by the use of a particular test who will be successful at their jobs, given different combinations of three variables:
test’s validity
The value assigned for the — is the computed validity coefficient.
selection ratio
— is a numerical value that reflects the relationship between the number of people to be hired and the number of people available to be hired.
For instance, if there are 50 positions and 100 applicants, then the — is 50/100, or .50.
base rate
— refers to the percentage of people hired under the existing system for a particular position.
If, for example, a firm employs 25 computer programmers and 20 are considered successful, the — would be .80.
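The two worked examples above reduce to simple ratios, using the same numbers given in the text:

```python
# Selection ratio: positions to be filled / applicants available.
positions, applicants = 50, 100
selection_ratio = positions / applicants   # .50

# Base rate: employees considered successful / employees hired
# under the existing system.
employed, successful = 25, 20
base_rate = successful / employed          # .80

print(selection_ratio, base_rate)
```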
Naylor-Shine tables
— entails obtaining the difference between the means of the selected and unselected groups to derive an index of what the test (or some other tool of assessment) is adding to already established procedures.
determining the increase in average score on some criterion measure
Brogden-Cronbach-Gleser formula
— , used to calculate the dollar amount of a utility gain resulting from the use of a particular selection instrument under specified conditions.
utility gain
refers to an estimate of the benefit (monetary or otherwise) of using a particular test or selection method.
productivity gain
a modification of the BCG formula exists for researchers who prefer to express their findings in terms of — rather than in financial terms.
Here, — refers to an estimated increase in work output.
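One common statement of the BCG formula is utility gain = (N)(T)(r_xy)(SD_y)(Z_m) − (N)(C). The sketch below implements that form; the dollar figures and parameter values are invented for illustration.

```python
# Sketch of the Brogden-Cronbach-Gleser (BCG) utility-gain formula:
#   utility gain = (N)(T)(r_xy)(SD_y)(Z_m) - (N)(C)
# N     = number of applicants selected
# T     = average length of time (tenure) in the position, in years
# r_xy  = validity coefficient of the selection test
# SD_y  = standard deviation of job performance, in dollars
# Z_m   = mean standardized test score of those selected
# C     = cost of testing one applicant
# All numbers below are invented for illustration.

def bcg_utility_gain(n, tenure, validity, sd_y, z_mean, cost_per_applicant):
    return n * tenure * validity * sd_y * z_mean - n * cost_per_applicant

gain = bcg_utility_gain(n=10, tenure=2.0, validity=0.40, sd_y=9_000.0,
                        z_mean=1.0, cost_per_applicant=400.0)
print(gain)  # dollar estimate of the utility gain
```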
Cronbach and Gleser’s Psychological Tests and Personnel Decisions
the most oft-cited application of statistical decision theory to the field of psychological testing is — and —
Cronbach and Gleser (1965)
— presented (1) a classification of decision problems; (2) various selection strategies ranging from single-stage processes to sequential analyses; (3) a quantitative analysis of the relationship between test utility, the selection ratio, cost of the testing program, and expected value of the outcome; and (4) a recommendation that in some instances job requirements be tailored to the applicant’s ability instead of the other way around (a concept they refer to as adaptive treatment).
pool of job applicants
complexity of the job
cut score
Some Practical Considerations when conducting utility analysis
cut score
— as a (usually numerical) reference point derived as a result of a judgment and used to divide a set of data into two or more classifications, with some action to be taken or some inference to be made on the basis of these classifications
relative cut score
A — may be defined as a reference point based on norm-related considerations rather than on the relationship of test scores to a criterion.
referred to as a norm-referenced cut score.
As an example of a —, the top 10% of all scores on each test would receive the grade of A. In other words, the cut score in use would depend on the performance of the class as a whole.
fixed cut score
— which we may define as a reference point that is typically set with reference to a judgment concerning a minimum level of proficiency required to be included in a particular classification.
also be referred to as absolute cut scores.
An example of a — might be the score achieved on the road test for a driver’s license. Here the performance of other would-be drivers has no bearing upon whether an individual testtaker is classified as “licensed” or “not licensed.”
multiple cut scores
— refers to the use of two or more cut scores with reference to one predictor for the purpose of categorizing testtakers.
So, for example, your instructor may have multiple cut scores in place every time an examination is administered, and each class member will be assigned to one category (e.g., A, B, C, D, or F) on the basis of scores on that examination. That is, meeting or exceeding one cut score will result in an A for the examination, meeting or exceeding another cut score will result in a B for the examination, and so forth.
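The grading example above can be sketched as multiple cut scores applied to a single predictor (the exam score). The particular cut scores are invented for illustration.

```python
# Multiple cut scores on one predictor: each exam score falls into exactly
# one grade category. The cut scores below are invented for illustration.

def assign_grade(score, cuts=((90, "A"), (80, "B"), (70, "C"), (60, "D"))):
    for cut, grade in cuts:      # cuts listed from highest to lowest
        if score >= cut:         # meeting or exceeding a cut earns that grade
            return grade
    return "F"                   # below the lowest cut score

print([assign_grade(s) for s in (95, 84, 71, 67, 42)])
```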
multiple hurdle
— , a cut score is in place for each predictor used.
The cut score used for each predictor will be designed to ensure that each applicant possesses some minimum level of a specific attribute or skill.
In this context, — describes a multistage selection process in which the achievement of a particular cut score on one test is necessary in order to advance to the next stage of evaluation in the selection process.
compensatory model of selection
— , an assumption is made that high scores on one attribute can, in fact, “balance out” or compensate for low scores on another attribute.
According to this model, a person strong in some areas and weak in others can perform as successfully in a position as a person with moderate abilities in all areas relevant to the position in question
William Angoff
Angoff method was devised by
Angoff method
— for setting fixed cut scores can be applied to personnel selection tasks as well as to questions regarding the presence or absence of a particular trait, attribute, or ability.
When used for purposes of personnel selection, experts in the area provide estimates of the likelihood that testtakers who have at least minimal competence for the position will answer each test item correctly; these judgments are then combined to set the cut score.
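A common way of combining the expert judgments is to average them per item and sum across items, yielding the expected number correct for a minimally competent testtaker. This is a sketch under that assumption; the probability judgments below are invented.

```python
# Sketch of an Angoff-style cut score: each expert judges, for every item,
# the probability that a minimally competent testtaker would answer it
# correctly; judgments are averaged per item and summed across items.
# The judgments below are invented for illustration.

def angoff_cut_score(judgments):
    """judgments: list of per-expert lists of item probabilities."""
    n_experts, n_items = len(judgments), len(judgments[0])
    item_means = [sum(expert[i] for expert in judgments) / n_experts
                  for i in range(n_items)]
    # Expected number of items answered correctly by a minimally
    # competent testtaker = the cut score.
    return sum(item_means)

experts = [
    [0.9, 0.7, 0.5, 0.8],  # expert 1's probabilities for items 1-4
    [0.8, 0.6, 0.6, 0.9],  # expert 2's probabilities for items 1-4
]
print(angoff_cut_score(experts))
```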
known groups method
Also referred to as the method of contrasting groups
— entails collection of data on the predictor of interest from groups known to possess, and not to possess, a trait, attribute, or ability of interest.
Based on an analysis of these data, a cut score is set on the test at the point that best discriminates between the two groups’ test performance.
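One simple way to operationalize "best discriminates" is to try every candidate cut and keep the one that classifies the most members of the two known groups correctly; the sketch below takes that approach, with invented scores.

```python
# Sketch of the known groups (contrasting groups) method: scan candidate
# cut scores and keep the one that maximizes correct classifications of
# the two known groups. The scores below are invented for illustration.

def best_cut_score(with_trait, without_trait):
    candidates = sorted(set(with_trait + without_trait))
    best_cut, best_correct = None, -1
    for cut in candidates:
        correct = (sum(s >= cut for s in with_trait) +     # correctly above cut
                   sum(s < cut for s in without_trait))    # correctly below cut
        if correct > best_correct:
            best_cut, best_correct = cut, correct
    return best_cut

with_trait = [72, 80, 85, 91, 78]      # group known to possess the attribute
without_trait = [50, 61, 66, 58, 70]   # group known not to possess it
print(best_cut_score(with_trait, without_trait))
```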
IRT-Based Methods
In this theory, cut scores are typically set based on testtakers’ performance across all the items on the test;
some portion of the total number of items on the test must be scored “correct” (or in a way that indicates the testtaker possesses the target trait or attribute) in order for the testtaker to “pass” the test (or be deemed to possess the targeted trait or attribute).
item-mapping method
For example, a technique that has found application in setting cut scores for licensing examinations is the —
It entails the arrangement of items in a histogram, with each column in the histogram containing items deemed to be of equivalent value.
Judges who have been trained regarding minimal competence required for licensure are presented with sample items from each column and are asked whether or not a minimally competent licensed individual would answer those items correctly about half the time.
bookmark method
An IRT-based method of setting cut scores that is more typically used in academic applications is the —
Use of this method begins with the training of experts with regard to the minimal knowledge, skills, and/or abilities that testtakers should possess in order to “pass.”
Subsequent to this training, the experts are given a book of items, with one item printed per page, such that items are arranged in an ascending order of difficulty.
The expert then places a “bookmark” between the two pages (or, the two items) that are deemed to separate testtakers who have acquired the minimal knowledge, skills, and/or abilities from those who have not.
R. L. Thorndike
In his book Personnel Psychology, —(1949) proposed a norm-referenced method for setting cut scores called the method of predictive yield
method of predictive yield
— was a technique for setting cut scores which took into account the number of positions to be filled, projections regarding the likelihood of offer acceptance, and the distribution of applicant scores.
discriminant analysis (also referred to as discriminant function analysis).
Another approach to setting cut scores employs a family of statistical techniques called —
These techniques are typically used to shed light on the relationship between identified variables (such as scores on a battery of tests) and two (and in some cases more) naturally occurring groups (such as persons judged to be successful at a job and persons judged unsuccessful at a job).