Validity Coefficient
Validity refers to the relationship between predictor and criterion scores. Often this relationship is assessed using a correlation (see the measurement chapter). The correlation between predictor and criterion scores is known as a validity coefficient. The usefulness of a predictor is determined on the basis of the practical significance and statistical significance of its validity coefficient. As was noted in the measurement chapter, reliability is a necessary condition for validity. Selection measures with questionable reliability will have questionable validity.
Hiring Success Gain
Hiring success refers to the proportion of new hires who turn out to be successful on the job. Hiring success gain refers to the increase in the proportion of successful new hires that is expected to occur as a result of adding a new predictor to the selection system. This is closely related to incremental prediction, except rather than just describing changes in validity, the results take the applicant pool and job difficulty into account. Thus, gain is influenced not only by the validity of the new predictor but also by the selection ratio and base rate.
Selection Ratio
The selection ratio is the number of people hired divided by the number of applicants (sr = number hired / number of applicants).
When the company has a large number of applicants for an opening, the selection ratio is low, and the company can freely pick its most preferred applicants from a large pot. On the other hand, when the selection ratio is high, there are few applicants for openings, and the company needs to hire nearly every applicant.
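A one-line sketch (with hypothetical applicant counts) makes the two extremes concrete:

```python
def selection_ratio(num_hired: int, num_applicants: int) -> float:
    """SR = number hired / number of applicants."""
    return num_hired / num_applicants

# A high SR means the company must hire nearly everyone who applies;
# a low SR means it can pick its most preferred applicants.
tight_market = selection_ratio(9, 10)   # 0.9
deep_pool = selection_ratio(5, 100)     # 0.05
print(tight_market, deep_pool)
```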
Compensatory Model
With a compensatory model, scores on one predictor are simply added to scores on another predictor to yield a total score. This means that high scores on one predictor can compensate for low scores on another. For example, if an employer is using an interview and GPA to select a person, an applicant with a low GPA who does well in the interview may still get the job.
The advantage of a compensatory model is that it recognizes that people have multiple talents and that many different constellations of talents may produce success on the job. The disadvantage of a compensatory model is that, at least for some jobs, the level of proficiency for specific talents cannot be compensated for by other proficiencies. For example, a firefighter requires a certain level of strength that cannot be compensated for by intelligence.
In terms of using the compensatory model to make decisions, four procedures may be followed: clinical prediction, unit weighting, rational weighting, and multiple regression. The four methods differ from one another in terms of the manner in which predictor scores are weighted before being added together for a total or composite score.
Exhibit 11.4 illustrates these procedures. Differences in weighting methods are shown in the bottom part of Exhibit 11.4, and a selection system consisting of interviews, application blanks, and recommendations is shown in the top part. For simplicity, assume that scores on each predictor have been standardized to fit in a range from 1 to 5. Scores on these three predictors are shown for three applicants.
Clinical Prediction.
In the clinical prediction approach in Exhibit 11.4, managers use their expert judgment to arrive at a total score for each applicant. That final score may or may not be a simple addition of the three predictor scores shown in the exhibit. Hence, applicant A may be given a higher total score than applicant B even though simple addition shows that applicant B had one more point (4 + 3 + 4 = 11) than applicant A (3 + 5 + 2 = 10).
Frequently, clinical prediction is done by initial screening interviewers or hiring managers. These decision makers may or may not
have "scores" per se, but they have multiple pieces of information on each applicant, and they make a decision on the applicant by taking everything into account. For example, when making an initial screening decision on an applicant, a manager at a fast-food restaurant might subjectively combine their impressions of various bits of information about the applicant on the application form and a quick interview.
Unit Weighting.
With unit weighting, each predictor is weighted the same at a value of 1.00. As shown in Exhibit 11.4, the predictor scores are simply added together to generate a total score. Therefore, the total scores for applicants A, B, and C are 10, 11, and 12, respectively. The advantage of unit weighting is that it is a simple and straightforward process and makes the importance of each predictor explicit to decision makers. The problem with this approach is that it assumes each predictor contributes equally to the prediction of job success, which is often not the case.
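Unit weighting can be sketched in a few lines. Applicants A and B use the scores given in the clinical prediction example; applicant C's individual scores are not given in the text, so the split below is hypothetical (it sums to the stated total of 12):

```python
# Unit weighting: every predictor gets a weight of 1.00, so the total
# score is just the sum of the predictor scores.
applicants = {
    "A": (3, 5, 2),  # (interview, application blank, recommendation)
    "B": (4, 3, 4),
    "C": (5, 4, 3),  # hypothetical split; the text gives only the total (12)
}
totals = {name: sum(scores) for name, scores in applicants.items()}
print(totals)  # A: 10, B: 11, C: 12
```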
Rational Weighting.
With rational weighting, each predictor receives a differential rather than equal weighting. Managers and other subject matter experts (SMEs) establish the weights for each predictor according to the degree to which each is believed to predict job success. These weights (w) are then multiplied by each raw score (P) to yield a total score, as shown in Exhibit 11.4.
For example, the predictors are weighted .5, .3, and .2 for the interview, application blank, and recommendation, respectively. This means managers think interviews are the most important predictors, followed by application blanks, and then recommendations. Each applicant's raw score is multiplied by the appropriate weight to yield a total score. For example, the total score for applicant A is (.5)(3) + (.3)(5) + (.2)(2) = 3.4.
The advantage of this approach is that it considers the relative importance of each predictor and makes this assessment explicit. The downside, however, is that it is an elaborate procedure that requires managers and SMEs to agree on the differential weights to be applied.
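The rational-weighting arithmetic can be sketched directly, using the weights (.5, .3, .2) and applicant A's scores (3, 5, 2) from the example:

```python
# Rational weighting: SME-derived weights multiplied by each predictor score.
weights = (0.5, 0.3, 0.2)     # interview, application blank, recommendation
applicant_a = (3, 5, 2)
total = sum(w * p for w, p in zip(weights, applicant_a))
print(round(total, 2))  # 3.4
```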
Multiple Regression.
Multiple regression is similar to rational weighting in that the predictors receive different weights. With multiple regression, however, the weights are established on the basis of statistical procedures rather than on judgments by managers or other SMEs. The statistical weights are developed from (1) the correlation of each predictor with the criterion, and (2) the correlations among the predictors. As a result, regression weights provide optimal weights in the sense that they will yield the highest total validity.
The calculations result in a multiple regression formula like the one shown in Exhibit 11.4. A total score for each applicant is obtained by multiplying the statistical weight (b) for each predictor by the predictor (P) score and summing these along with the intercept value (a). As an example, assume the statistical weights are .9, .6, and .2 for the interview, application blank, and recommendation, respectively, and that the intercept is .09. Using these values, the total score for applicant A is .09 + (.9)(3) + (.6)(5) + (.2)(2) = 6.19.
Multiple regression offers the possibility of a higher degree of precision in the prediction of criterion scores than do the other methods of weighting. Unfortunately, this level of precision is realized only under a certain set of circumstances. In particular, for multiple regression to be more precise than unit weighting, there must be a small number of predictors, low correlations between predictor variables, and a large sample that is similar to the population that will be tested. Many selection settings do not meet these criteria, so in these cases consideration should be given either to unit or rational weighting or to alternative regression-based weighting schemes that have been developed, such as general dominance weights or relative importance weights. In situations where these conditions are met, however, multiple regression weights can produce higher validity and utility than the other weighting schemes.
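Applying regression weights to an applicant's scores is the same weighted sum plus an intercept. A minimal sketch using the weights and intercept from the example above (in practice, the weights themselves would come from fitting a regression on historical predictor-criterion data, which is not shown here):

```python
def regression_score(intercept: float, weights: list, predictors: list) -> float:
    """Total score = a + sum of b_i * P_i."""
    return intercept + sum(b * p for b, p in zip(weights, predictors))

# Weights and intercept from the text; applicant A's scores are (3, 5, 2).
score_a = regression_score(0.09, [0.9, 0.6, 0.2], [3, 5, 2])
print(round(score_a, 2))  # 6.19
```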
Choosing Among Weighting Schemes.
The choice of the best weighting scheme is consequential and likely depends on answers to the most important questions about clinical, unit, rational, and multiple regression schemes:
For clinical weighting: Do selection decision makers have considerable experience and insight into selection decisions, and is managerial acceptance of the selection process important?
For unit weighting: Is there reason to believe that each predictor contributes relatively equally to job success?
For rational weighting: Are there differences in importance across predictor areas, and can these be better assessed through judgment rather than through statistical tools?
For regression weighting: Are the conditions under which multiple regression is superior (relatively small number of predictors, low correlations among predictors, and large sample) satisfied?
Answers to these questions will go a long way toward deciding which weighting scheme to use. We should also note that while statistical weighting is more valid than clinical weighting, the combination of both methods may yield the highest validity.
The Angoff Method
The Angoff method is a rigorous approach to establishing cut scores based on a consensus of SMEs. In this approach, SMEs review the content of the predictor (e.g., test items) and determine the proportion of individuals with a minimum level of competence who would answer each item correctly. For example, experts might estimate that at least 75% of minimally qualified electricians would be able to define the term "ampacity" on a quiz, but that only 25% would be able to identify a bonding jumper. These ratings can then be summed to obtain a single compensatory cut score, or evaluated separately to form a set of conjunctive cut scores. The results of this procedure depend on the SMEs. It is very difficult to get members of the organization to agree on who "the" SMEs are, and which SMEs are selected may have a bearing on the actual cut scores developed. There may also be judgmental errors and biases in how cut scores are set. If the Angoff method is used, it is important that SMEs be provided with a common definition of minimally competent test takers and encouraged to discuss their estimates; each of these steps has been found to increase the reliability of the SME ratings. Standards for minimum scores established through the Angoff method can take the form of either a compensatory or a conjunctive model.
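A minimal sketch of the Angoff arithmetic, with hypothetical items, SMEs, and estimates: each item's estimates are averaged across SMEs, and the item averages are summed into a compensatory cut score (the expected number of items a minimally competent candidate would answer correctly).

```python
# Each value is one SME's estimate of the proportion of minimally
# competent candidates who would answer the item correctly.
sme_estimates = {
    "define ampacity":         [0.75, 0.80, 0.70],
    "identify bonding jumper": [0.25, 0.30, 0.20],
    "read a wiring diagram":   [0.90, 0.85, 0.95],
}
item_means = {item: sum(r) / len(r) for item, r in sme_estimates.items()}
cut_score = sum(item_means.values())  # compensatory cut score
print(round(cut_score, 2))  # 1.9 expected items correct
```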
Random Selection
With random selection, each finalist has an equal chance of being selected. The only rationale for selecting a person is the "luck of the draw." For example, the eight names from Exhibit 11.8 could be put in a hat and a finalist drawn out and tendered a job offer. This approach has the advantage of being quick. In addition, with random selection, one cannot be accused of favoritism, because everyone has an equal chance of being selected. The disadvantage of this approach is that discretionary assessments are simply ignored.
Although hiring at random from available candidates is obviously inferior to methods that use selection measures to identify the best candidates, many organizations end up hiring somewhat randomly when they are forced to hire the first acceptable candidate. When the hiring process is continuous, there is never a final list of candidates to choose from. Instead, ongoing needs might require continuously collecting résumés from interested parties, and then when positions open up, calling in everyone who passes the minimum qualifications for open jobs for interviews. This means hiring managers never see a total pool of candidates from which to choose the finalists. Hiring based on the first acceptable candidate is also used when the organization, because of staffing shortages, needs to hire anyone who meets the minimum competency level. Jobs with very high turnover rates, like entry-level retail and food service positions, are typically staffed in this way. While hiring the first acceptable candidate may seem necessary, it is far from an ideal hiring strategy and the costs may not be revealed until it is too late.
Ranking
With ranking, finalists are ordered from the most desirable to the least desirable based on results of discretionary assessments. As shown in Exhibit 11.8, Kaluuya and Shah are tied for first in terms of preferences, whereas Patil is the least preferred. It is important to note that desirability should be viewed in the context of the entire selection process. In this case, persons with lower levels of desirability should not necessarily be viewed as unacceptable; all of the remaining applicants have passed the minimum cut score for qualifications. Job offers are extended to people on the basis of their rank ordering, with the top-ranked person receiving the first offer. Should that person turn down the job offer or withdraw from the selection process, finalist number 2 receives the offer, and so on.
The advantage of ranking is that it indicates the relative worth of each finalist for the job. Using all selection measures is the best way to obtain maximum validity. All the information available on candidates is used in the same way, so the rules are fair and the process is transparent.
Grouping and Banding
For both external hiring and internal promotions, the top-down method will yield the highest validity and utility. This method has been criticized, however, for ignoring the possibility that small differences between scores are due to measurement error. The top-down method also makes it hard to incorporate other outcomes, like diversity or a sense of cultural fit. With the grouping method, more flexibility in the use of managerial discretion is maintained, as finalists are banded together into rank-ordered categories. In Exhibit 11.8, the finalists are grouped according to whether they are top choices or on the wait list. The advantage of this method is that it permits ties among finalists, thus avoiding the need to assign a different rank to each person. In other words, grouping means applicants who score within a certain score range, or band, are considered to have scored equivalently. A simple grouping procedure is provided in Exhibit 11.8. In this case, a group of top applicants available for first-round selection was identified based on having a total score of 23 or higher. Hiring within bands could then be done based on a variety of other factors, including workforce composition, perceived fit, or likelihood of accepting a job offer. In practice, band widths are usually calculated on the basis of the standard error of measurement.
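A sketch of SEM-based banding. The finalist names echo Exhibit 11.8, but the scores, standard deviation, and reliability below are hypothetical: applicants whose scores fall within two standard errors of measurement of the top score are treated as tied.

```python
import math

def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical totals and measurement properties (not from Exhibit 11.8).
scores = {"Kaluuya": 27, "Shah": 27, "Wang": 25, "Nguyen": 24, "Patil": 18}
band_width = 2 * sem(sd=3.0, reliability=0.75)   # = 3.0
top = max(scores.values())
band = [name for name, s in scores.items() if s >= top - band_width]
print(band)  # everyone within 2 SEMs of the top is treated as equivalent
```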
Differential Weighting
A differential weighting approach to hiring is very similar to the ranking approach but allows for approaches like rational or multiple regression weights for the most important predictors. In Exhibit 11.8, this is represented in the column of "weight" scores. The organization decided that experience was relatively unimportant for performance, so it received a weight of 0.1; job knowledge was very important, so it received a weight of 0.5; and interview performance was moderately important, so it received a weight of 0.4. The results of the weighting procedure do change some of the conclusions in this case. Note that Wang, who did exceptionally well on the job knowledge test, moves from fourth in the raw score ranking to third in the weighted ranking, whereas Kaluuya, who is exceptionally experienced, moves from a tie for first to second place.
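The weighted-ranking arithmetic can be sketched as follows, using the weights from the text (0.1, 0.5, 0.4) but hypothetical candidate scores. Note how the weights can reorder candidates relative to their raw sums (here Kaluuya leads on raw total, but Wang leads once job knowledge is weighted heavily):

```python
weights = {"experience": 0.1, "job_knowledge": 0.5, "interview": 0.4}
candidates = {  # hypothetical scores, not those in Exhibit 11.8
    "Kaluuya": {"experience": 10, "job_knowledge": 7,  "interview": 9},
    "Wang":    {"experience": 5,  "job_knowledge": 10, "interview": 8},
}

def weighted_total(scores: dict) -> float:
    return sum(weights[k] * v for k, v in scores.items())

raw = {c: sum(s.values()) for c, s in candidates.items()}          # Kaluuya 26, Wang 23
ranking = sorted(candidates, key=lambda c: weighted_total(candidates[c]),
                 reverse=True)
print(raw, ranking)  # weighting flips the order
```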
Incorporating Diversity
The application of the technique of grouping scores, as described earlier, and then selecting from these groups to foster inclusion of underrepresented groups is called "banding." This should not be confused with group-based scoring (i.e., one set of standards for one demographic group and different standards for another), which is prohibited. Rather, when there are multiple candidates with similar scores, preference is given to those who belong to underrepresented groups. For example, in Exhibit 11.8, we might note that the applicant Nguyen, who is from a group that is underrepresented in the organization, is in the first round (based on qualifications) and might be preferred for selection.
Research suggests that banding procedures result in substantial decreases in the disparate impact of cognitive ability tests. The major limitation of banding is that it sacrifices validity. Obviously, taking scores on a 100-point test and lumping applicants into only two groups wastes a great deal of important information on applicants. There is also evidence that typical banding procedures overestimate the width of bands, which of course exacerbates the problem. Organizations considering the use of banding in personnel selection decisions must weigh the pros and cons carefully, including the legal issues. A review of lawsuits concerning banding found that it was generally upheld by the courts. In the end, however, a values choice may need to be made: to optimize validity (with some detriment to diversity) or to optimize diversity (with some sacrifice in validity).
Exhibit 11.9 demonstrates the level of involvement of each of these parties in the staffing decision-making process.

Organizational Leaders
Selection systems can have a huge impact on organizational capabilities and performance, so leaders of the organization, including executives and directors, will have some input into decision making. Leaders have a uniquely valuable, holistic understanding of the purpose of a selection system. Having buy-in from organizational leaders also greatly enhances the success of any policy initiative.
Uniform Guidelines on Employee Selection Procedures
To the extent that the use of cut scores does not lead to disparate impact in decision making, the UGESP are essentially silent on the issue of cut scores.
The discretion exercised by the organization as it makes its selection decisions is thus unconstrained legally. If disparate impact is occurring, the UGESP become directly applicable to decision making.
Under conditions of disparate impact, the UGESP require the organization to either eliminate its occurrence or justify it by conducting validity studies and carefully setting cut scores:
Where cutoff scores are used, they should normally be set so as to be reasonable and consistent with normal expectations of acceptable proficiency within the workforce. Where applicants are ranked on the basis of properly validated selection procedures and those applicants scoring below a higher cutoff score than appropriate in light of such expectations have little or no chance of being selected for employment, the higher cutoff score may be appropriate, but the degree of adverse impact should be considered.
This provision suggests that the organization should be cautious in general about setting cut scores that are above those necessary to achieve acceptable proficiency among those hired. In other words, even with a valid predictor, the organization should be cautious that its hiring standards are not so high that they create needless disparate impact. This is particularly true with ranking systems. Use of random methods (or, to a lesser extent, grouping methods) would help overcome this particular objection to ranking systems.
Whatever cut score procedure is used, the UGESP also require that the organization be able to document its establishment and operation.
Specifically, the UGESP say that "if the selection procedure is used with a cutoff score, the user should describe the way in which normal expectations of proficiency within the workforce were determined and the way in which the cutoff score was determined."
VALIDITY COEFFICIENT
Relationship between predictor and criterion scores
Measured through correlations
Practical significance
Statistical significance
Validity for multiple criteria
Core job tasks
Organization and goal direction
Cooperation and group facilitation
Creativity
WORKFORCE DIVERSITY
Concerns about factors not related to performance on the job:
Credential requirements may screen out low-income applicants
Leadership behaviors that are more typical of men than of women fail to recognize the value of alternative modes of leadership
Difficult issue:
One predictor has high validity and high disparate impact while another predictor has low validity and low disparate impact
Using a variety of different selection tools together
Putting greater weight on lower disparate impact sections of the test
Adding measures with lower disparate impact
CORRELATION WITH OTHER PREDICTORS
Small correlation with other existing predictors is better
Hypothetical example in setting where experience and GPA are already used
Cognitive ability and structured interviews have moderate validity and incremental prediction
Situational judgment has high validity but low incremental prediction
Means situational judgment (in this case) is very similar to what's already being used, whereas cognitive ability and structured interviews add more
Selection ratio
Number hired divided by number of applicants
High selection ratio means nearly every applicant must be hired
Low selection ratio means the organization can be more selective
Base rate
Number of successful employees divided by number of employees
Indicates how difficult the job is: a higher base rate means an easier job
Taylor-Russell Tables
Combine information on selection ratios, base rates, and validity
Conclusion is that selection tools are most valuable when the selection ratio is low, the base rate is low, and the validity coefficient is high
TAYLOR-RUSSELL TABLES
What proportion of people can do the job?
If most people can do the job successfully, then having a selection system will offer less benefit.
What proportion of the people who apply do we need to take?
If we take most people who apply, then the selection system can't do much for us.
Selection systems are best in situations where fewer people are able to be successful on the job and we are able to sort through many candidates to hire a select few (low selection ratio)
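The intuition behind the Taylor-Russell tables can be checked with a small Monte Carlo sketch (all parameter values hypothetical): generate predictor-criterion pairs with a given validity, hire the top fraction of applicants by predictor score, and see what share of hires clears the success bar.

```python
import math
import random

def simulated_success_rate(validity: float, selection_ratio: float,
                           base_rate: float, n: int = 50_000,
                           seed: int = 42) -> float:
    """Monte Carlo stand-in for a Taylor-Russell table lookup.

    Predictor x and criterion y are standard normal with correlation
    `validity`; `base_rate` of all applicants clear the success bar;
    the top `selection_ratio` of applicants (by x) are hired.
    Returns the proportion of hires who turn out to be successful.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = validity * x + math.sqrt(1 - validity ** 2) * rng.gauss(0, 1)
        pairs.append((x, y))
    y_cut = sorted(y for _, y in pairs)[int((1 - base_rate) * n)]
    x_cut = sorted(x for x, _ in pairs)[int((1 - selection_ratio) * n)]
    hired = [(x, y) for x, y in pairs if x >= x_cut]
    return sum(y >= y_cut for _, y in hired) / len(hired)

# With validity .50, selection ratio .10, and base rate .50, hires succeed
# far more often than the 50% base rate (the table value is roughly .84).
print(round(simulated_success_rate(0.50, 0.10, 0.50), 2))
```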

ECONOMIC GAIN

Utility analysis
Data on predictor validity, applicant test scores, and estimated dollar value of performance variability
Expected value of improved job performance if a new selection tool is implemented
Line manager judgments of employee value; expected financial returns for investing in selection
Predictive analytics
Historical information on performance outcomes for business units
Contribution of different characteristics of the workforce to performance outcomes
Existing "hard" data from organizational records for valued outcomes
Kano analysis
Line manager and director descriptions of strategic impact of performance across domains
Changes in economic performance from enhanced levels of different types of employee skills
A tool from marketing; manager judgments regarding critical competencies are incorporated
DETERMINING ASSESSMENT SCORES
Using multiple assessment methods can reduce the deficiency of our measurement, giving us a better overall prediction of job performance.
Compensatory model
Adds all scores together into a single number.
Can be done through:
informal "clinical" weighting
unit weighting
rational weighting
regression weighting
Multiple hurdles model
Uses selection tools in order from cheapest to most expensive
Cuts candidates at each stage
Results similar to compensatory model, but costs are much lower
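The cost logic of multiple hurdles can be sketched as follows; the tools, cut scores, and per-candidate costs are all hypothetical. Only candidates who pass each stage incur the cost of the next, more expensive stage:

```python
hurdles = [  # (tool, cut score, cost per candidate), cheapest first
    ("application screen", 3, 1),
    ("cognitive test",     4, 10),
    ("panel interview",    4, 100),
]
candidates = {  # hypothetical scores on each tool
    "A": {"application screen": 5, "cognitive test": 5, "panel interview": 5},
    "B": {"application screen": 4, "cognitive test": 3, "panel interview": 5},
    "C": {"application screen": 2, "cognitive test": 5, "panel interview": 5},
}
remaining, total_cost = set(candidates), 0
for tool, cut, cost in hurdles:
    total_cost += cost * len(remaining)          # only survivors are assessed
    remaining = {c for c in remaining if candidates[c][tool] >= cut}
print(remaining, total_cost)  # {'A'} survives; cost 123 vs 333 if everyone
                              # had taken every assessment
```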
SELECTING THE BEST WEIGHTING SCHEME
Do decision makers have considerable experience and insight into selection decisions, and is managerial acceptance of the selection process important? (Clinical Approach)
Is there reason to believe each predictor contributes relatively equally to job success? (Unit Weight)
Are there differences in importance across predictors? (Rational Weighting)
Are conditions under which multiple regression is superior satisfied? (Regression)
CONSEQUENCES OF CUT SCORES
Issue -- What is a passing score?
Score may be:
A single score from a single predictor or
The total score from multiple predictors
The standard we use should be meaningfully related to job performance or other relevant criteria in order to be legally defensible.
Cut score - Separates applicants who advance from those who are rejected
Often used with multiple hurdles
In general a "cut score" indicates a sufficient score to pass.
METHODS TO DETERMINE CUT SCORES
Minimum competency
Set on the basis of the minimum qualifications deemed necessary to perform the job
Compensatory: single aggregate score across predictors which assumes one characteristic can compensate for another... e.g. a good personality may compensate for lower cognitive ability.
Conjunctive: must pass standards for each predictor. A minimum proficiency is required in each area in order to be considered qualified.
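The difference between the two models can be sketched with hypothetical scores and cut scores; the same applicant passes a compensatory standard (a strong interview and personality offset low cognitive ability) but fails a conjunctive one:

```python
def passes_compensatory(scores: dict, total_cut: float) -> bool:
    """One aggregate cut score; strengths can offset weaknesses."""
    return sum(scores.values()) >= total_cut

def passes_conjunctive(scores: dict, cuts: dict) -> bool:
    """A minimum cut on every predictor; no compensation allowed."""
    return all(scores[k] >= c for k, c in cuts.items())

applicant = {"cognitive": 2, "personality": 5, "interview": 4}  # hypothetical
print(passes_compensatory(applicant, total_cut=10))   # True: 11 >= 10
print(passes_conjunctive(applicant,
      {"cognitive": 3, "personality": 3, "interview": 3}))  # False: 2 < 3
```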
Maximum competency
Screen for "overqualified" candidates
CONSEQUENCES OF CUT SCORES
We want to avoid a False Positive:
Hiring someone who will not be successful on the job but scored above the cut off.
We end up with a bad employee.
False Negatives are when we fail to hire a candidate who would have been successful on the job.
This is unfortunate, but typically not as bad as a False Positive.
True Positive and True Negatives mean our system is working well.

BASE RATES
We have a more limited role in determining the Base Rate.
We can define what we consider an "acceptable" level of performance on the job.
With some jobs, we may have objective performance measures, but this is often a more subjective evaluation.
For now, think of "raising the bar" as lowering the base rate, or setting a higher expected standard of performance as the minimum in order to consider an individual successful.
A higher standard means fewer people meet it.
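A tiny sketch with hypothetical performance ratings shows how raising the standard lowers the base rate:

```python
# Base rate = successful employees / all employees; "success" depends on
# where we set the performance standard. Ratings below are hypothetical.
performance = [2.1, 2.8, 3.0, 3.4, 3.9, 4.2, 4.6, 4.9]

def base_rate(ratings, standard):
    return sum(r >= standard for r in ratings) / len(ratings)

print(base_rate(performance, 3.0))  # 0.75
print(base_rate(performance, 4.0))  # 0.375: higher bar, lower base rate
```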
Ranking
Finalists are ordered from most to least desirable based on results of discretionary assessments
Grouping and banding
Finalists are banded together into rank-ordered categories
Demonstrated impact in increasing diversity in organization
Differential weighting
Incorporating weights on scores for determining final candidate eligibility
A NOTE ON BANDING
We use methods similar to banding in a variety of other situations.
Think about grade ranges A, B, C, D, F. Each of these is effectively a "band" or "grouping" where scores that are "similar enough" are treated as equivalent.
Statistically, we can calculate a "Standard Error of Measurement" which is based in part on the reliability of the measure being used.
This method acknowledges that our measurements are not perfect. Scores which are within a standard error of one another are functionally treated as equivalent.
Banding allows for consideration of other unmeasured aspects in the decision process when considering applicants who are "equally" likely to succeed in the role.
Organizational leaders
Uniquely valuable, holistic understanding of the purpose of a selection system.
Buy-in enhances the success of any policy initiative
Human resource professionals
Technical expertise needed to develop sound selection decisions
Access to quantitative information from HR information systems that can be used to quantify predictor-outcome relationships
Line managers
Accountable for the success of the people hired
Identify critical needs in the selection system that might not be addressed
Coworkers
Select members compatible with the goals of the work team
LEGAL ISSUES
Legal issue of importance in decision making
Cut scores or hiring standards
Uniform Guidelines on Employee Selection Procedures (UGESP)
If no adverse impact, guidelines are silent on cut scores
If adverse impact occurs, guidelines become applicable
Diversity and hiring decisions
Exclude issues of demography in hiring decisions
Evaluation based on KSAOs relevant to job-related diversity competence