_____________ is the extent to which a score from a selection measure is stable and free from error.
Reliability
_________ The extent to which a score from a test or from an evaluation is consistent and free from error.
Reliability
With the ___________, each of several people takes the same test twice.
test-retest reliability method
_________________ The extent to which repeated administration of the same test will achieve similar results.
Test-retest reliability
______________ The consistency of test scores across time.
Temporal stability
The scores from the first administration of the test are correlated with scores from the second to determine whether they are similar. If they are, then the test is said to have _____________
Temporal stability
__________ refers to the amount of anxiety that an individual normally has all the time.
Trait anxiety
_____________ is the amount of anxiety an individual has at any given moment.
state anxiety
With the _____________, two forms of the same test are constructed.
alternate-forms reliability method
_________ The extent to which two forms of the same test are similar.
Alternate-forms reliability
This ____________ of test-taking order is designed to eliminate any effects that taking one form of the test first may have on scores on the second form.
counterbalancing
__________ A method of controlling for order effects by giving half of a sample Test A first, followed by Test B, and giving the other half of the sample Test B first, followed by Test A.
Counterbalancing
The scores on the two forms are then correlated to determine whether they are similar. If they are, the test is said to have _____________.
form stability
____________ The extent to which the scores on two forms of a test are similar.
Form stability
___________ The extent to which responses to the same test items are consistent.
Item stability
The extent to which similar items are answered in similar ways is referred to as __________ and measures item stability.
internal consistency
The extent to which similar items are answered in similar ways is referred to as internal consistency and measures __________.
item stability
Another factor that can affect the internal reliability of a test is item __________.
homogeneity
_____________ The extent to which test items measure the same construct.
Item homogeneity
When reading information about internal consistency in a journal article or a test manual, you will encounter three terms that refer to the method used to determine internal consistency:
split-half, coefficient alpha, and Kuder-Richardson formula 20 (K-R 20).
__________ A statistic used to determine the internal reliability of tests that use items with dichotomous answers (yes/no, true/false).
Kuder-Richardson Formula 20 (K-R 20)
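For reference, the standard K-R 20 computation (the symbols k, p, q, and σ² are conventional statistical notation, not terms from these cards) is:
Formula:
K-R 20 = [k / (k − 1)] × [1 − Σ(pq) / σ²], where k = the number of items, p = the proportion of test takers answering a given item correctly (or "yes"/"true"), q = 1 − p, and σ² = the variance of the total test scores.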
The ________ is the easiest to use, as the items on a test are split into two groups. Usually, all of the odd-numbered items are in one group and all of the even-numbered items are in the other group.
split-half method
___________ A form of internal reliability in which the consistency of item responses is determined by comparing scores on half of the items with scores on the other half of the items.
Split-half method
Because the number of items in the test has been reduced, researchers have to use a formula called the _____ formula to adjust the correlation.
Spearman-Brown prophecy
__________ formula Used to correct reliability coefficients resulting from the split-half method.
Spearman-Brown prophecy
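For reference, the correction itself is simple (r here is conventional notation for the correlation between the two halves, not a term from these cards):
Formula:
Corrected reliability = (2 × r) / (1 + r), where r = the correlation between scores on the two halves of the test.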
_______ (Cronbach, 1951) and the K-R 20 (Kuder & Richardson, 1937) are more popular and accurate methods of determining internal reliability, although they are more complicated to compute.
Cronbach's coefficient alpha
___________ A statistic used to determine internal reliability of tests that use interval or ratio scales.
Coefficient alpha
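For reference, coefficient alpha generalizes K-R 20 to items scored on interval or ratio scales; the standard computation (conventional notation, not taken from the cards) is:
Formula:
α = [k / (k − 1)] × [1 − Σσ²(item) / σ²(total)], where k = the number of items, σ²(item) = the variance of each individual item, and σ²(total) = the variance of the total test scores.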
A fourth way of assessing reliability is __________. A test or inventory can have homogeneous items and yield heterogeneous scores and still not be reliable if the person scoring the test makes mistakes.
scorer reliability.
_______ The extent to which two people scoring a test agree on the test score, or the extent to which a test is scored correctly.
Scorer reliability
When human judgment of performance is involved, scorer reliability is discussed in terms of ___________
interrater reliability.
The ____________ for a test can be obtained from your own data, the test manual, journal articles using the test, or test compendia that will be discussed later in the chapter.
reliability coefficient
_______ is the degree to which inferences from scores on tests or assessments are justified by the evidence. As with reliability, a test must be ______ to be useful.
Validity and valid
___________ The degree to which inferences from test scores are justified by the evidence.
Validity
One way to determine a test's validity is to look at its degree of _______ — the extent to which test items sample the content that they are supposed to measure.
content validity
But the personality inventory is very difficult to read (e.g., containing such words as __________, _________, and _________), and most of our applicants are only high school graduates.
meticulous, extraverted, gregarious
Another measure of validity is _____________, which refers to the extent to which a test score is statistically related to some measure of job performance called a criterion.
criterion validity
___________ The extent to which a test score is related to some measure of job performance.
Criterion validity
Another measure of validity is criterion validity, which refers to the extent to which a test score is statistically related to some measure of job performance called a _____________.
criterion
________ A measure of job performance, such as attendance, productivity, or a supervisor rating.
Criterion
Criterion validity is established using one of two research designs: ________ or _______
concurrent or predictive
With a _________, a test is given to a group of employees who are already on the job. The scores on the test are then correlated with a measure of the employees' current performance.
concurrent validity design
__________ A form of criterion validity that correlates test scores with measures of job performance for employees currently working for an organization.
Concurrent validity
With a ___________, the test is administered to a group of job applicants who are going to be hired.
predictive validity design
____________ A form of criterion validity in which test scores of applicants are compared at a later date with a measure of job performance.
Predictive validity
Thus, the __________ of performance scores makes obtaining a significant validity coefficient more difficult.
restricted range
________ A narrow range of performance scores that makes it difficult to obtain a significant validity coefficient.
Restricted range
A major issue concerning the criterion validity of tests focuses on a concept known as ______________, the extent to which a test found valid for a job in one location is valid for the same job in a different location.
validity generalization (VG)
_____________ The extent to which inferences from test scores from one organization can be applied to another organization.
Validity generalization (VG)
A technique related to validity generalization is _____________, which is based on the assumption that tests that predict a particular component (e.g., customer service) of one job (e.g., a call center for a bank) should predict performance on the same job component for a different job (e.g., a receptionist at a law office).
synthetic validity
__________ A form of validity generalization in which validity is inferred on the basis of a match between job components and tests previously found valid for those job components.
Synthetic validity
_________ is the most theoretical of the validity types. Basically, it is defined as the extent to which a test actually measures the construct that it purports to measure.
Construct validity
____________ is concerned with inferences about test scores.
Construct validity
______ is concerned with inferences about test construction.
content validity
Another method of measuring construct validity is _____________ (Hattie & Cooksey, 1984). This method is not common and should be used only when other methods for measuring construct validity are not practical.
known-group validity
__________ A form of validity in which test scores from two contrasting groups "known" to differ on a construct are compared.
Known-group validity
______ is the extent to which a test appears to be job related.
Face validity
________ The extent to which a test appears to be valid.
Face validity
If you also have read a personality description based on a different astrological sign, you probably found it to be as accurate as the one based on your own sign. Why is this? Because of something called __________ (Dickson & Kelly, 1985)—statements so general that they can be true of almost everyone. For example, if I described you as "sometimes being sad, sometimes being successful, and at times not getting along with your best friend," I would probably be very accurate.
Barnum statements
____________ Statements, such as those used in astrological forecasts, that are so general that they can be true of almost anyone.
Barnum statements
Perhaps the most common source of test information is the __________ (Carlson, Geisinger, & Jonson, 2014), which contains information on over 2,700 psychological tests as well as reviews by test experts. Your library probably has online access to the MMY.
Nineteenth Mental Measurements Yearbook (MMY)
________________ A book containing information about the reliability and validity of various psychological tests.
Mental Measurements Yearbook (MMY)
___________ A type of test taken on a computer in which the computer adapts the difficulty level of questions asked to the test taker's success in answering previous questions.
Computer-adaptive testing (CAT)
An increasingly common use of computer testing is ________________
computer-adaptive testing (CAT).
__________ (Taylor & Russell, 1939) are designed to estimate the percentage of future employees who will be successful on the job if an organization uses a particular test.
Taylor-Russell tables
______________ A series of tables based on the selection ratio, base rate, and test validity that yield information about the percentage of future employees who will be successful if a particular test is used.
Taylor-Russell tables
The second piece of information that must be obtained is the _______, which is simply the percentage of people an organization must hire.
selection ratio
____________ The percentage of applicants an organization hires.
Selection ratio
Formula:
Selection ratio = number hired/number of applicants
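A quick worked example with hypothetical numbers: an organization that hires 10 people from a pool of 100 applicants has a selection ratio of 10/100 = .10.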
The final piece of information needed is the ________ of current performance - the percentage of employees currently on the job who are considered successful.
base rate
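Following the same "Formula:" convention used for the selection ratio (the counts come from the organization's own performance records):
Formula:
Base rate = number of current employees considered successful / total number of current employees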
Determining the ___________ is easier than using the Taylor-Russell tables but less accurate. The only information needed to determine the proportion of correct decisions is employee test scores and the scores on the criterion.
proportion of correct decisions
_________ A utility method that compares the percentage of times a selection decision was accurate with the percentage of successful employees.
Proportion of correct decisions
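A sketch of the computation this card implies (the phrasing "correctly predicted" is mine, not from the cards): compare each employee's test decision with his or her actual success on the criterion, and then compute
Proportion of correct decisions = (number of employees the test correctly predicted would succeed + number it correctly predicted would fail) / total number of employees.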
The _______ (Lawshe, Bolda, Brune, & Auclair, 1958) were created to do just that: estimate the probability that a particular applicant will be successful. To use these tables, three pieces of information are needed. The validity coefficient and the base rate are found in the same way as for the Taylor-Russell tables.
Lawshe tables
_________ Tables that use the base rate, test validity, and applicant percentile on a test to determine the probability of future success for that applicant.
Lawshe tables
Fortunately, I/O psychologists have devised a fairly simple ________ to estimate the monetary savings to an organization.
utility formula
___________ A method of ascertaining the extent to which an organization will benefit from the use of a particular selection system.
Utility formula
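For reference, one widely used version of this computation is the Brogden-Cronbach-Gleser model (the variable names below are standard utility-analysis notation, not taken from these cards):
Formula:
Savings = (n)(t)(r)(SDy)(m) − cost of testing, where n = the number of employees hired per year, t = their average tenure in years, r = the validity coefficient of the test, SDy = the standard deviation of job performance in dollars, and m = the mean standardized test score of the applicants who are hired.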
The number of years of ________ for each employee is then summed and divided by the total number of employees.
tenure
__________ The length of time an employee has been with an organization.
Tenure
______________ refers to technical aspects of a test. A test is considered to have measurement bias if there are group differences (e.g., sex, race, or age) in test scores that are unrelated to the construct being measured.
Measurement bias
__________ Group differences in test scores that are unrelated to the construct being measured.
Measurement bias
However, from a legal perspective, if differences in test scores result in one group (e.g., men) being selected at a significantly higher rate than another (e.g., women), _____________ is said to have occurred, and the burden is on the organization using the test to prove that the test is valid.
adverse impact
_____________ An employment practice that results in members of a protected class being negatively affected at a higher rate than members of the majority class. Adverse impact is usually determined by the four-fifths rule.
Adverse impact
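A brief illustration of the four-fifths rule mentioned above (the hiring figures are hypothetical): divide the selection rate of the protected class by the selection rate of the majority class; a ratio below .80 suggests adverse impact. For example, if 50% of male applicants and 30% of female applicants are hired, the ratio is .30/.50 = .60, which falls below .80, so adverse impact would be indicated.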
____________ refers to situations in which the predicted level of job success falsely favors one group (e.g., men) over another (e.g., women).
Predictive bias
_______________ A situation in which the predicted level of job success falsely favors one group over another.
Predictive bias
One form of predictive bias is ____________, meaning that the test will significantly predict performance for one group and not others.
single-group validity
___________ The characteristic of a test that significantly predicts a criterion for one class of people but not for another.
Single-group validity
A second form of predictive bias is ______________. With _________, a test is valid for two groups but more valid for one than for the other. Single-group validity and differential validity are easily confused, but there is a big difference between the two.
differential validity.
Another important aspect of test fairness is the ___________ held by the applicants taking the test.
perception of fairness
Usually, this is done by a statistical procedure known as _____________, with each test score weighted according to how well it predicts the criterion.
multiple regression
___________ A statistical procedure in which the scores from more than one criterion-valid test are weighted according to how well each test score predicts the criterion.
Multiple regression
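For reference, the equation such a procedure produces has the general form (standard regression notation, not terms from these cards):
Formula:
Predicted criterion score = a + b1(Test 1 score) + b2(Test 2 score) + ..., where a is a constant (the intercept) and each b is the weight given to a test based on how well that test predicts the criterion.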
Linear approaches to hiring usually take one of four forms: _________________________
unadjusted top-down selection, rule of three, passing scores, or banding.
With ____________, applicants are rank-ordered on the basis of their test scores.
top-down selection
__________ Selecting applicants in straight rank order of their test scores.
Top-down selection
In a __________ to top-down selection, the assumption is that if multiple test scores are used, the relationship between a low score on one test can be compensated for by a high score on another.
compensatory approach
___________ A method of making selection decisions in which a high score on one test can compensate for a low score on another test. For example, a high GPA might compensate for a low GRE score.
Compensatory approach
A technique often used in the public sector is the __________ (or rule of five), in which the names of the top three scorers are given to the person making the hiring decision (e.g., police chief, HR director).
rule of three
_________ are a means for reducing adverse impact and increasing flexibility. With this system, an organization determines the lowest score on a test that is associated with acceptable performance on the job.
Passing scores
_____________ The minimum test score that an applicant must achieve to be considered for hire.
Passing score
If there is more than one test for which we have passing scores, a decision must be made regarding the use of a ___________ or _______________
multiple-cutoff approach or a multiple-hurdle approach.
______________ A selection strategy in which applicants must meet or exceed the passing score on more than one selection test.
Multiple-cutoff approach