Unlike other intelligence tests, African American people were found to perform at the same level as white people on the BITCH test of intelligence.
False. African Americans did better than white people.
One group of people scores higher on a test designed to predict job performance than another group of people. Overall, the test is found to be valid. On a scatterplot (of job performance versus test score), the two groups are best modelled with a single regression line. This means the test must be biased, but can still predict job performance to the same degree within each group separately.
False. It may be that the test is not biased.
One group of people scores higher on a test designed to predict job performance than another group of people. Overall, the test is found to be valid. On a scatterplot (of job performance versus test score), the two groups are best modelled using two separate but parallel regression lines. This means the test and the measure of job performance are equivalently biased.
False. No evidence that the measure of job performance is biased
If a test is differentially valid, it yields different validity coefficients for different groups of people, indicating it is biased.
True
One group of people scores higher on a test designed to predict job performance than another group of people. Overall, the test is found to be valid. On a scatterplot (of job performance versus test score), the two groups are best modelled with two separate NON-parallel regression lines. This means the test is biased, but can still predict job performance to the same degree within each group separately.
False. The lines need to be parallel for the test to predict job performance equally well in each group.
Research on the "Pygmalion effect" (Rosenthal & Jacobson, 1966) found that if teachers were told that certain students were likely to do well academically (when in reality these students were identified at random), then the selected students showed greater IQ score gains across all age groups.
False. Only for younger kids
Steele and Aronson (1998) ran a study in which African Americans performed better than White Americans at the US Graduate Record Examination, when the test content was changed to be more culturally appropriate to them.
False. They performed better when the test was not described as measuring IQ.
In Australian employment law, it is possible to overcome a claim of discrimination on the basis of a disability, if you can demonstrate that the disability is directly relevant to some inherent requirement of the job.
True
When using aptitude and personality tests for employee recruitment in Australia, the tests need to have content validity with regards to the requirements of the job.
True
The current national Australian body dealing with unfair workplace practices is the Fair Work Commission.
True
In US court cases, intelligence tests (such as the WISC and Stanford-Binet) have been consistently judged to be racially biased when used in educational settings.
False
Inconsistencies in court decisions regarding the use of psychological tests are fortunately relatively rare (Kaplan & Saccuzzo, 2018).
False. They are quite common.
A test can have very good reliability and validity but poor utility.
True
In principle, it is possible for a test to have good utility even if it does not have decent reliability and validity.
True. e.g. lie detectors may deter dishonesty
Expectancy tables involve an analysis of the reliability scatterplot for a test.
False. Should be the validity scatterplot
Expectancy data can be used to evaluate the content validity of a test.
False. Evaluates criterion validity
Imagine I used a psychological test (where a higher score is indicative of better performance) to decide which of 30 applicants to hire for 15 potential job positions. The test has high criterion-referenced validity with respect to job performance. If I set a very high cut-off for the test to minimise false positives, then most of the applicants would fail the test and I would risk ending up with insufficient people to fill the job positions.
True
Imagine I used a psychological test (where a higher score is indicative of better performance) to decide which of 30 applicants to hire for 15 potential job positions. The test has high criterion-referenced validity with respect to job performance. If I set a very low cut-off for the test to minimise false negatives, then most of the applicants would fail the test and I would risk ending up with insufficient people to fill the job positions.
False. Most would pass but there would be a few bad apples
Imagine I used a psychological test (where a higher score is indicative of better performance) to decide which applicants to hire for a potential job. If the test had a zero criterion-referenced validity coefficient with respect to job performance, this would mean that the worst candidates would be likely to perform better at the test than the best candidates.
False. The test would not be predictive at all
Imagine I used a psychological test (where a higher score is indicative of better performance) to decide which applicants to hire for a potential job. If the test had a very high criterion-referenced validity coefficient with respect to job performance, this would mean that we should only get a very small proportion of false negative and false positive outcomes.
True
Imagine I used a psychological test (where a higher score is indicative of better performance) to decide which applicants to hire for some potential job positions. The test has high criterion-referenced validity with respect to job performance. If I set a very high cut-off for the test then this should minimise the number of false negatives, though at the expense of increased false positives.
False. A very high cut-off minimises false positives, at the expense of increased false negatives.
Using an expectancy table is considered a sophisticated state-of-the-art method of utility analysis.
False
When Schmidt et al. (1979) used utility analysis to evaluate the efficacy of the Programmer Aptitude Test for selecting programmers over traditional non-test methods such as interviews, they found the key factor behind their saving of approximately $6 million per year was that the test had much better psychometric validity than the other selection methods.
True
If we discover that one group of people score higher on a test than another group of people, what are the possible underlying reasons for this?
- Test is biased
- Group differences are real
If our test is biased then what are our options for dealing with this?
- Make accommodations
- Redevelop test
- Develop alternative test
- Avoid testing
Describe how we can use regression lines to model test bias. What psychometric property are we actually evaluating here?
We look at the criterion validity scatterplots to see whether the groups show separate patterns (i.e. different regression lines). The property being evaluated is criterion validity.
Describe the three different test bias scenarios that could account for differences between groups and the implications of each.
1. Same regression line, different group means: either the group differences are real or the performance measure is as biased as the test.
2. Intercept bias: same slope for each group but separate intercepts. The test is biased towards a group but still predictive of ability.
3. Slope bias: groups have different slopes (differentially valid). The test is biased towards a group and comparisons cannot be made.
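The three scenarios can be sketched by fitting one least-squares line per group and comparing slopes and intercepts. This is only an illustration: the data are made up, the names `fit_line` and `classify_bias` are hypothetical, and the fixed tolerance `tol` stands in for the significance tests a real analysis would use.

```python
def fit_line(scores, performance):
    """Ordinary least-squares slope and intercept for one group."""
    n = len(scores)
    mean_x = sum(scores) / n
    mean_y = sum(performance) / n
    sxx = sum((x - mean_x) ** 2 for x in scores)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, performance))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def classify_bias(group_a, group_b, tol=0.05):
    """Compare two groups' regression lines (tol is an arbitrary assumption)."""
    slope_a, intercept_a = fit_line(*group_a)
    slope_b, intercept_b = fit_line(*group_b)
    if abs(slope_a - slope_b) > tol:
        return "slope bias"          # non-parallel lines: differentially valid
    if abs(intercept_a - intercept_b) > tol:
        return "intercept bias"      # parallel lines, different intercepts
    return "single regression line"  # one line fits both groups


# Hypothetical (test score, job performance) data for the reference group.
group_a = ([1, 2, 3, 4], [3, 5, 7, 9])

# Higher test scores, but on the same line as group A.
same_line = ([5, 6, 7, 8], [11, 13, 15, 17])
# Same slope as group A, higher intercept.
parallel = ([5, 6, 7, 8], [13, 15, 17, 19])
# Flatter slope than group A.
non_parallel = ([5, 6, 7, 8], [3.5, 4.0, 4.5, 5.0])

print(classify_bias(group_a, same_line))
print(classify_bias(group_a, parallel))
print(classify_bias(group_a, non_parallel))
```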
What does "intercept bias" mean, in the context of conceptualising test bias?
Each group has parallel slopes but different intercepts
What do "differentially valid" and "slope bias" mean, in the context of conceptualising test bias?
Each group has differing slopes
Some people have argued that there are race differences in intelligence test scores and that these are due to genetics. What are the arguments for and against this idea?
IQ is partly genetically determined and so is race, therefore it is theoretically possible that the two are linked (but concrete evidence for this does not exist). Some have observed that adopted black children raised in white families still scored lower on IQ tests; however, these results could be explained by environmental factors such as expectancy effects, stereotype threat, and self-stereotyping.
What is the Pygmalion Effect? Describe the experiment that Rosenthal and Jacobson carried out to demonstrate it.
When teachers were told that certain (randomly selected) children were "bloomers", teachers were biased to give these students more attention, which improved their IQ (but only in 1st and 2nd grade)
Describe the experiment that Steele and Aronson carried out to demonstrate the effect of self-stereotyping.
When African Americans were told that a test measured intellectual ability they did worse than African Americans that were told that the test was about problem-solving abilities.
Describe the experiment that Shih et al. carried out to demonstrate the effect of self-stereotyping.
When Asian Americans were primed to think about their race, they did better on a math test than controls.
When Asian Americans were primed to think about their gender, they did worse on a math test than controls.
Discuss the issues that have arisen when the use of psychological testing has been taken to court on the grounds of bias (no need to give details of particular cases).
- Issues of discrimination and test validity/bias
Describe an Australian court case that involved personality tests used for employment.
Coms21 hired a consulting firm to identify who to terminate based on personality measures that were not relevant to the job. Payouts were given to those that were terminated.
What is test utility?
The practical usefulness of a test.
What is utility analysis?
A family of techniques that entail a cost-benefit analysis designed to yield information relevant to a decision about the usefulness and/or practical value of a tool of assessment.
What is an expectancy table and how do you calculate one? What psychometric property is it analysing?
You generate the table from a criterion validity scatterplot (e.g. test score vs job performance). You choose a cut-off score on the test and a cut-off on the criterion that defines "good" performers. This divides the scatterplot into four quadrants, from which you can count correct hits, correct misses, false negatives, and false positives.
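As a rough sketch of that counting step, the function below sorts each applicant into one of the four quadrants. The scores, cut-offs, and the name `expectancy_counts` are all made up for illustration.

```python
def expectancy_counts(test_scores, performance, test_cutoff, good_cutoff):
    """Count the four quadrants of a criterion validity scatterplot."""
    counts = {
        "correct_hits": 0,     # passed the test and performed well
        "false_positives": 0,  # passed the test but performed poorly
        "false_negatives": 0,  # failed the test but performed well
        "correct_misses": 0,   # failed the test and performed poorly
    }
    for score, perf in zip(test_scores, performance):
        passed = score >= test_cutoff
        good = perf >= good_cutoff
        if passed and good:
            counts["correct_hits"] += 1
        elif passed:
            counts["false_positives"] += 1
        elif good:
            counts["false_negatives"] += 1
        else:
            counts["correct_misses"] += 1
    return counts


# Hypothetical data: six applicants, test score and later job performance.
test_scores = [30, 40, 50, 60, 70, 80]
performance = [2, 6, 3, 7, 4, 9]

table = expectancy_counts(test_scores, performance, test_cutoff=55, good_cutoff=5)
print(table)
```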
What are selection ratios in the context of expectancy tables?
job positions/job applicants
Imagine you are an HR manager. Describe how you could use expectancy table techniques to deal with (1) a high selection ratio and (2) a low selection ratio. What would be the risks you would be running in either scenario?
High: Fewer people to choose from for the jobs, so you'd need to set a low cut-off point to get enough people. This decreases false negatives but increases false positives. Risks hiring too many bad performers.
Low: Lots of people to choose from for the jobs. You could set a high cut-off point, which would decrease false positives but increase false negatives. Risks not hiring enough people.
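The trade-off can be illustrated with the 30-applicants, 15-positions scenario used in the earlier cards. The applicant scores and the two cut-offs below are invented for the example, and `selection_ratio` and `passing_count` are hypothetical helper names.

```python
def selection_ratio(positions, applicants):
    """Selection ratio = job positions / job applicants."""
    return positions / applicants


def passing_count(test_scores, cutoff):
    """Number of applicants scoring at or above the cut-off."""
    return sum(1 for score in test_scores if score >= cutoff)


# Hypothetical: 30 applicants with test scores 41..70, and 15 positions.
test_scores = list(range(41, 71))
positions = 15

ratio = selection_ratio(positions, len(test_scores))
high_cutoff_passes = passing_count(test_scores, cutoff=66)  # strict cut-off
low_cutoff_passes = passing_count(test_scores, cutoff=45)   # lenient cut-off

print(f"selection ratio: {ratio}")
print(f"high cut-off: {high_cutoff_passes} pass, enough to fill jobs: {high_cutoff_passes >= positions}")
print(f"low cut-off: {low_cutoff_passes} pass, enough to fill jobs: {low_cutoff_passes >= positions}")
```

With the strict cut-off too few applicants pass to fill the positions, while the lenient cut-off passes nearly everyone, which is where the extra false positives come from.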
What influence does the criterion validity of a particular selection test have on its usefulness in recruiting people?
The higher the validity, the fewer false negatives and false positives (and vice versa).
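A small simulation can make this concrete. It is a sketch under simplifying assumptions that are not from the course material: test scores and job performance are standard normal, they correlate at the chosen validity coefficient, and both cut-offs sit at the mean. The function name `error_rate` is hypothetical.

```python
import math
import random


def error_rate(validity, n=5000, seed=0):
    """Proportion of simulated applicants who are false positives or false negatives."""
    rng = random.Random(seed)
    noise_weight = math.sqrt(1 - validity ** 2)
    errors = 0
    for _ in range(n):
        test_score = rng.gauss(0, 1)
        # Performance correlates with the test score at roughly `validity`.
        performance = validity * test_score + noise_weight * rng.gauss(0, 1)
        passed = test_score >= 0   # test cut-off at the mean (assumption)
        good = performance >= 0    # "good performer" cut-off at the mean (assumption)
        if passed != good:         # false positive or false negative
            errors += 1
    return errors / n


for r in (0.0, 0.5, 0.9):
    print(f"validity {r}: error proportion {error_rate(r):.2f}")
```

With zero validity the test is no better than a coin flip at these cut-offs, and the proportion of selection errors shrinks as validity rises.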
What is wrong with selecting people for a job based on an unstructured interview?
Little validity. May select people for the wrong reasons e.g. who are more charming but waste time talking and not doing work.
Why might criterion validity be important to people working in human resources?
Hiring poor performers can cost businesses money. Hiring good performers can save money. The higher the validity of the test, the more accurate it will be at discriminating between the two (assuming high reliability too)
What are the disadvantages of using the expectancy table technique?
- Assumes linear relationships
- Does not take into account other factors e.g. minority status, health, etc.
Give an example of how utility analysis has been used to save employers millions of dollars (no need to give figures - just explain the concepts).
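The Programmer Aptitude Test has high criterion validity (.76). When it was used instead of previous non-test methods such as interviews, utility analysis found that it saved employers about $6M/year, mainly because its validity was much better than that of the other selection methods.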