Which statement is TRUE concerning a coefficient of correlation?
A. A correlation coefficient is an index of the causal relationship between two variables.
B. A correlation coefficient may be useful in prediction.
C. It covaries with the standard deviation.
D. It came about as a result of someone asking Francis Galton what his "sign" was.
B. A correlation coefficient may be useful in prediction.
To calculate a Pearson r using one of the formulas presented in the text, it is necessary to know
A. the standard scores for both variables.
B. the standard score for only one variable.
C. percentiles for both variables.
D. raw scores for each variable.
D. raw scores for each variable.
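Study note: a minimal sketch of the raw-score (computational) formula for Pearson r, r = (NΣXY - ΣXΣY) / sqrt([NΣX² - (ΣX)²][NΣY² - (ΣY)²]). The function name and the sample scores are hypothetical, made up for illustration. The textbook's own formula may be laid out differently.

```python
import math


def pearson_r_raw(x, y):
    """Pearson r computed directly from paired raw scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired lists with at least two scores")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    # Note: this is zero (and the division fails) if either variable has no variance.
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical raw scores for five test takers (illustration only).
hours_studied = [1, 2, 3, 4, 5]
quiz_score = [2, 4, 5, 4, 5]
r = pearson_r_raw(hours_studied, quiz_score)
print(round(r, 4))
```

Only the raw scores are needed as input. Standard scores and percentiles are never computed.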
A correlation coefficient that is significant at the p < .01 level
A. has a 99% chance of being accurate.
B. could have been expected to occur by chance alone one time or less in 100.
C. could have been expected to occur by chance alone 99 times or more in 100.
D. accounts for about 1% of the variance.
B. could have been expected to occur by chance alone one time or less in 100.
Which of the following is most directly associated with the process of predicting scores using regression techniques?
A. a standard error of the estimate
B. a standard error of the mean
C. a standard error of measurement
D. a standard error of the difference
A. a standard error of the estimate
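Study note: the standard error of the estimate is commonly written as SD_Y × sqrt(1 - r²), where SD_Y is the standard deviation of the criterion and r is the predictor-criterion correlation. The sketch below assumes that form. The function name and the numbers are hypothetical.

```python
import math


def standard_error_of_estimate(sd_y, r):
    """Typical size of prediction errors when Y is predicted from X by regression."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and 1")
    return sd_y * math.sqrt(1.0 - r ** 2)


# Hypothetical values: criterion SD of 10 and a validity coefficient of .60.
see = standard_error_of_estimate(10.0, 0.60)
print(round(see, 2))
```

As r approaches ±1 the error of prediction shrinks toward zero. With r = 0 it equals the criterion's standard deviation.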
According to the code of ethics, which of the following is true when it comes to disclosing information?
A. we disclose information only when the client provides permission to do so
B. we disclose information to the source of referral even without the consent of the client
C. when people need to be protected from harm
D. all of the above
C. when people need to be protected from harm
Jovie is a newly hired psychometrician in a company. Just before her scheduled employee testing, her boss asked if she could finish the assessment, which normally takes 2 hours, within 30 minutes, justifying the request by saying that the testing procedures are just a formality and the employee would be accepted no matter what. What should Jovie do?
a. Finish the assessment within 30 minutes, as apparently the test would not be used as a basis for hiring selection
b. Compromise with the boss to give her at least 30 minutes more
c. Agree, but inform the applicant about the change
d. Do not agree and explain the testing procedures to the boss
d. Do not agree and explain the testing procedures to the boss
During an emergency in which an immediate mental health service is required that is beyond our competence, we should
a. Give the necessary service even when we lack proper training
b. Do not give the service as it is beyond our competence
c. Address the situation but avoid actually handling the case
d. Wait until the competent services are available before helping with the case
d. Wait until the competent services are available before helping with the case
Which of the following is NOT an acceptable way to divide a test when using the split-half reliability method?
A. Randomly assign items to each half of the test.
B. Assign odd-numbered items to one half and even-numbered items to the other half of the test.
C. Assign the first-half of the items to one half of the test and the second half of the items to the other half of the test.
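D. Assign easy items to one half and difficult items to the other half of the test.
D. Assign easy items to one half and difficult items to the other half of the test.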
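Study note: a rough sketch of an odd-even split followed by the Spearman-Brown correction, r_full = 2·r_half / (1 + r_half), which estimates full-length reliability from the half-test correlation. The function names and the 0/1 item data are hypothetical, made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson r from deviation scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def odd_even_half_scores(item_scores):
    """Total each test taker's odd-numbered and even-numbered items separately."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return odd, even


def spearman_brown(r_half):
    """Estimate full-test reliability from the correlation between two halves."""
    return 2 * r_half / (1 + r_half)


# Hypothetical data: 5 test takers x 6 items, 1 = correct, 0 = incorrect.
item_scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
odd, even = odd_even_half_scores(item_scores)
r_half = pearson_r(odd, even)
r_full = spearman_brown(r_half)
print(round(r_half, 3), round(r_full, 3))
```

An odd-even split spreads items of differing difficulty across both halves, which is why it is preferred to a split that puts the easy items in one half.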
A guidance counselor wishes to determine if a student scored higher on a mathematics test than on a reading test. What statistic(s) would be MOST useful?
A. the standard error of measurement for each test score
B. the standard error of the difference between two scores
C. the raw score on each test as well as the mean of each distribution
D. the mean of each distribution and index of test difficulty for each test
B. the standard error of the difference between two scores
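Study note: a small sketch of the standard error of the difference, assuming both scores are on the same scale, with SEM = SD × sqrt(1 - r_xx) for each test and SED = sqrt(SEM₁² + SEM₂²). The function names, the reliabilities, and the scores are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM for one test: SD * sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1.0 - reliability)


def standard_error_of_difference(sem_1, sem_2):
    """SED for comparing two scores expressed on the same scale."""
    return math.sqrt(sem_1 ** 2 + sem_2 ** 2)


# Hypothetical T scores (SD = 10) on a mathematics test and a reading test.
sem_math = standard_error_of_measurement(10.0, 0.91)
sem_reading = standard_error_of_measurement(10.0, 0.84)
sed = standard_error_of_difference(sem_math, sem_reading)

math_score, reading_score = 62.0, 50.0
# Rough rule of thumb (an assumption here): a gap beyond about 2 SEDs is unlikely to be chance.
is_reliable_difference = abs(math_score - reading_score) > 2 * sed
print(round(sed, 2), is_reliable_difference)
```

Because the SED combines the measurement error of both tests, it is larger than either SEM on its own.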
Which assessment technique is the BEST example of a face valid method?
A. a personality test in which testtakers are asked to describe what they see in inkblots
B. administering a word processing test to a person applying to be a word processor
C. asking testtakers to draw a picture of their family to assess family relationships
D. measuring the height of applicants applying for a semi-pro basketball team
B. administering a word processing test to a person applying to be a word processor
An investigation of a test's construct validity may yield evidence that
A. the test is measuring a single construct.
B. the test does not correlate significantly with another test purporting to measure the same construct.
C. test scores increase as a function of age.
D. All of these
D. All of these
If you were a psychologist working in the field of human resources, which claim for a new personnel selection test by a test publisher would be MOST compelling and persuasive?
A. The test identifies a large number of false positives.
B. The test improves the hit rate.
C. The test identifies a large base rate.
D. The test improves the selection ratio.
B. The test improves the hit rate.
All validity evidence can be interpreted as ________ validity.
A. content
B. criterion-related
C. predictive
D. construct
D. construct
If a test is a valid measure of a particular construct, we would expect that
A. groups of people who differ with respect to the construct will obtain different test scores.
B. groups of people who differ with respect to the construct will obtain similar test scores.
C. groups of people who obtain similar scores will have similar personalities.
D. None of these
A. groups of people who differ with respect to the construct will obtain different test scores.
A statistically insignificant correlation exists between scores on a new test of depression and a well-established measure of satisfaction with life. These data may be construed as which type of validity evidence with regard to the test of depression?
A. criterion-related validity
B. convergent evidence of construct validity
C. discriminant evidence of construct validity
D. None of these because there was an insignificant relationship.
C. discriminant evidence of construct validity
The names attributed to different factor loadings in a factor analysis are
A. dictated by the factors themselves.
B. subject to change as new analyses occur.
C. thoroughly validated against dictionary definitions.
D. typically dependent on the researcher's judgment.
D. typically dependent on the researcher's judgment.
In the context of test bias, a biased test
A. may be used fairly.
B. may be used unfairly.
C. may be used either fairly or unfairly.
D. is only used by biased test users.
C. may be used either fairly or unfairly.
Which of the following is TRUE of test bias as compared to test fairness?
A. Test bias is dependent on statistical analyses while test fairness relates to values.
B. Test bias is dependent on values while test fairness relates to statistical analyses.
C. Whether a test is fair can be answered with certainty while whether a test is biased cannot.
D. None of these statements are true.
A. Test bias is dependent on statistical analyses while test fairness relates to values.
In psychological testing and assessment, bias BEST refers to
A. random variation in test performance attributable to covert prejudice on the part of the test developer.
B. systematic variation in test performance that is unrelated to the construct that the test is intended to measure.
C. a test or testing practice that systematically favors the performance of one group of testtakers over another.
D. All of these
D. All of these
Which of the following is the BEST way to minimize test bias?
A. create separate norm groups for different groups so that any potential bias is reduced.
B. have a panel of experts review the test items at various stages during the test's development.
C. pre-screen examiners to be used in the test administration for any signs of bias or prejudice.
D. employ the multitrait-multimethod matrix to screen items for bias.
B. have a panel of experts review the test items at various stages during the test's development.
The results of a predictive validity study of a test will likely be affected most by
A. the characteristics of the sample tested, such as attrition and self-selection.
B. the number of items on the test, with longer tests demonstrating higher predictive validity.
C. the correlation coefficient chosen to measure the validity.
D. the administration time required for the test compared with that of the criterion test chosen.
A. the characteristics of the sample tested, such as attrition and self-selection.
Which is an example of convergent evidence for the construct validity of a test measuring fear of cats?
A. a high correlation between the test and an existing validated test measuring fear of cats
B. a high correlation with an existing validated test measuring more-generalized fear
C. a low correlation between the test and a test to measure fear of dogs
D. Both a high correlation between the test and an existing validated test measuring fear of cats and a high correlation with an existing validated test measuring more-generalized fear
D. Both a high correlation between the test and an existing validated test measuring fear of cats and a high correlation with an existing validated test measuring more-generalized fear
The extent to which a particular factor contributes to a test score is referred to as a
A. true score.
B. base rate.
C. factor loading.
D. hit rate.
C. factor loading.
As the term is applied to a test, validity is a judgment or estimate of how well a test
A. measures what it purports to measure.
B. measures what it purports to measure in a particular context.
C. satisfies the deductions that could logically be made from inferences about it.
D. a test result can be duplicated under the same or similar circumstances.
A. measures what it purports to measure.
Each of the three approaches to validity assessment in the trinitarian model should BEST be thought of as
A. mutually exclusive as evidence of a test's validity with any one source necessary and sufficient for demonstrating a test's validity.
B. one type of evidence that, with others, contributes to a judgment concerning the validity of a test.
C. insufficient, either by themselves or together with the other two, to demonstrate the validity of a test.
D. None of these
B. one type of evidence that, with others, contributes to a judgment concerning the validity of a test.
Which statement best describes the relationship between item difficulty and a "good" item?
A. The difficulty level is not a factor in determining a "good" item.
B. An item with a high difficulty level is likely to be "good."
C. An item with a mid-range difficulty level is likely to be "good."
D. An item with a low difficulty level is likely to be "good."
C. An item with a mid-range difficulty level is likely to be "good."
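Study note: a short sketch of the item-difficulty index p (the proportion of test takers answering correctly, so a higher p means an easier item). It also shows a common textbook rule of thumb that places the optimal difficulty midway between the chance success rate and 1.00. The function names and the responses are hypothetical.

```python
def item_difficulty(responses):
    """Proportion of test takers who answered the item correctly (p)."""
    if not responses:
        raise ValueError("responses must not be empty")
    return sum(responses) / len(responses)


def optimal_difficulty(n_options):
    """Midpoint between the chance success rate (1 / n_options) and 1.00."""
    chance = 1.0 / n_options
    return (chance + 1.0) / 2.0


# Hypothetical responses of ten test takers to one item (1 = correct).
responses = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
p = item_difficulty(responses)
print(p, optimal_difficulty(4), optimal_difficulty(2))
```

Under this rule a four-option multiple-choice item is best near p = .625 and a true-false item near p = .75, both in the middle of the range rather than at either extreme.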
The item-validity index is key in determining
A. construct validity.
B. criterion-related validity.
C. content validity.
D. All of these
B. criterion-related validity.
The greater the magnitude of the item-discrimination index, the more testtakers in the higher-scoring group answered the item correctly, as compared to testtakers
A. who served as the non-test-taking control group.
B. in the lower-scoring group.
C. who participated in the test standardization.
D. None of these
B. in the lower-scoring group.
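Study note: a minimal sketch of the item-discrimination index, d = (U - L) / n, where U and L are the numbers of test takers in the upper-scoring and lower-scoring groups who answered the item correctly and n is the size of each group (assumed equal). The function name and the counts are hypothetical.

```python
def item_discrimination(upper_correct, lower_correct, group_size):
    """d = (U - L) / n for equal-sized upper- and lower-scoring groups."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return (upper_correct - lower_correct) / group_size


# Hypothetical counts with 32 test takers in each group.
weak_item = item_discrimination(20, 16, 32)      # barely separates the groups
strong_item = item_discrimination(30, 5, 32)     # separates the groups well
reversed_item = item_discrimination(10, 20, 32)  # low scorers do better: a red flag
print(weak_item, strong_item, reversed_item)
```

The index runs from -1 to +1. A negative value means more low scorers than high scorers got the item right, which usually signals a flawed item.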
An item-characteristic curve includes all of the following EXCEPT
A. information that can be used to judge item bias.
B. information that can be used to judge item fairness.
C. item-discrimination information.
D. item-difficulty information.
B. information that can be used to judge item fairness.
Which is true of cross-validation of a test after standardization has occurred?
A) Cross-validation creates confusion regarding the meaning of the original standardization data.
B) The cross-validation sample is composed of the same test takers that participated in the original test standardization.
C) Cross-validation often results in validity shrinkage.
D) All of the answers are correct.
C) Cross-validation often results in validity shrinkage.
Which is NOT a typical question that is raised and answered during the test conceptualization stage of test development?
A. What is the objective of the test?
B. Is there a need for the test?
C. How valid are the items on the test?
D. What types of responses will be required of the testtaker?
C. How valid are the items on the test?
Which is a major difference between comparative scaling and categorical scaling?
A) Comparative scaling involves sorting stimuli; categorical scaling does not.
B) Comparative scaling involves making quantitative judgments; categorical scaling does not.
C) Comparative scaling involves putting stimulus cards in a set number of different piles assigned a certain meaning; categorical scaling does not.
D) Comparative scaling involves rank-ordering each stimulus individually against every other stimulus; categorical scaling does not.
D) Comparative scaling involves rank-ordering each stimulus individually against every other stimulus; categorical scaling does not.
An individually administered test designed for use with elementary-school-age students is in the test tryout stage of test development. For the purposes of the tryout, this test should be administered
A. as a group test to as many classes as possible in an elementary school.
B. individually to high school students for exploratory purposes.
C. individually to elementary-school-age students in an environment that simulates the way that the final version of the test will be administered.
D. to experts in elementary school education to ensure that the items are appropriate for elementary school-aged children.
C. individually to elementary-school-age students in an environment that simulates the way that the final version of the test will be administered.
A "good" test item on an ability test is one
A) to which almost all test takers respond correctly.
B) that distinguishes high scorers from low scorers.
C) to which almost all test takers respond incorrectly.
D) in which it is absolutely impossible to guess the correct answer.
B) that distinguishes high scorers from low scorers.
The higher an item-validity index, the greater the __________ validity.
A. construct
B. content
C. criterion
D. face
C. criterion
A sensitivity review typically focuses on which of the following?
A. individual test items
B. the standardization sample
C. statistics used as part of validity and reliability studies
D. the extent to which latent traits are latent
A. individual test items
On an item characteristic curve, the steeper the curve,
A) the more latent the trait is presumed to be.
B) the greater the item reliability.
C) the less the item discrimination.
D) the greater the item discrimination.
D) the greater the item discrimination.
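Study note: one way to see why a steeper curve means greater discrimination is to assume a two-parameter logistic (2PL) model, P(θ) = 1 / (1 + e^(-a(θ - b))), where a is the discrimination parameter and b is the difficulty. The 2PL model, the function names, and the parameter values are assumptions for illustration and are not taken from the flashcard.

```python
import math


def icc_2pl(theta, a, b):
    """Probability of a correct response at ability theta under a 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def rise_around_difficulty(a, b, half_width=0.5):
    """How much P(correct) rises from just below to just above the difficulty b."""
    return icc_2pl(b + half_width, a, b) - icc_2pl(b - half_width, a, b)


# Two hypothetical items with the same difficulty but different slopes.
flat_item = rise_around_difficulty(a=0.5, b=0.0)
steep_item = rise_around_difficulty(a=2.0, b=0.0)
print(round(flat_item, 3), round(steep_item, 3))
```

The steep item's probability of success changes far more over the same ability interval, so it separates test takers just below b from those just above b more sharply.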
Possible applications of IRT were discussed in your textbook. Which of the following is not one of those possible applications?
A) determining measurement equivalence across test taker populations
B) identifying a common metric among several tests measuring the same construct
C) evaluating existing tests for the purpose of mapping test revisions
D) developing item banks
C) evaluating existing tests for the purpose of mapping test revisions
When a test is translated from one language in one culture to another language in another culture, ______ can help ensure that the original test and the translated test are reasonably equivalent and tapping the same construct.
A. a translator
B. IRT
C. bi-lingual people who are experts on the two cultures
D. All of these
D. All of these
The development of a criterion-referenced test usually entails
A) exploratory work with a group of test takers who have mastered the material.
B) exploratory work with a group of test takers who have not mastered the material.
C) Both exploratory work with a group of test takers who have mastered the material and exploratory work with a group of test takers who have not mastered the material are correct.
D) None of the answers is correct.
C) Both exploratory work with a group of test takers who have mastered the material and exploratory work with a group of test takers who have not mastered the material are correct.
Ariel et al. (2015) found that body cameras worn by police have utility in reducing use-of-force incidents, as well as use-of-force complaints by citizens. However, given the procedures used in their study, questions remain regarding whether changes in the participants' behavior were more a function of the camera or
A. the police officer's verbal warning.
B. the ten directives in the experimental protocol.
C. officers attempting to give citizens two or more chances to comply with commands.
D. All of these
B. the ten directives in the experimental protocol.
A problem with using the known group method of setting cut scores is that
A. there is no consistent method of obtaining contrasting groups.
B. strong deterrents to test user acceptance of the data are in place.
C. no standards are in place for choosing contrasting groups.
D. test users must be personally familiar with each member in the known group.
A. there is no consistent method of obtaining contrasting groups.
When a cut score is set based on norm-related considerations rather than on the relationship of test scores to a criterion, the cut score is referred to as
A. a relative cut score.
B. a fixed cut score.
C. an absolute cut score.
D. a referential cut score.
A. a relative cut score.
The term item-mapping refers to an IRT-based method of
A. setting cut scores that entails expert judgments based, in part, upon how culturally fair items are deemed to be.
B. setting cut scores that entails the use of experts rearranging items placed on maps by level of difficulty.
C. setting cut scores that entails a histographic representation of test items.
D. test construction that was first used for a high school geography achievement test.
B. setting cut scores that entails the use of experts rearranging items placed on maps by level of difficulty.
Which of the following is NOT an assumption of utility analysis?
A. the value of people and their performance can be estimated.
B. psychological tests are always preferred over other means of assessment.
C. the performance of people in organizations can affect organizational viability.
D. large amounts of information can be integrated to make good decisions.
B. psychological tests are always preferred over other means of assessment.
"An empirical standard used to divide a group of data into two or more distinct categories" is a formal description of a
A. cut score.
B. predictive yield.
C. norm-referenced test.
D. hit rate.
A. cut score.
Typically, speed tests
A. contain items of a uniform difficulty level.
B. are completed by fewer than 1% of all test-takers.
C. have low validity coefficients.
D. yield high rates of false positives.
A. contain items of a uniform difficulty level.