purposes of psychological testing
classification, diagnosis/treatment planning, research, program evaluation, coaching/training, legal application
4 major types of tests used to assess cognitive ability
intelligence tests, personality tests, interest tests, aptitude tests
3 types of intelligence tests
individually administered tests, group administered tests, neuropsychological assessments
individually administered intelligence test examples + target population
wechsler scales, stanford binet V for children, woodcock-johnson IV for diagnosis, kaufman scale for rapport
group administered intelligence tests examples
ASVAB, TOEFL (Test of english as a foreign language), GAMSAT, UMAT (undergrad medical admissions test), various tests for job selection - SHL assessments of verbal, numerical, and inductive reasoning
Interests in Holland’s vocational interest model
realistic (hands-on), investigative (explorative), artistic (creative), social (cooperative), enterprising (leadership), conventional (detail-oriented)
testing applications for selection, matching…
selection criteria with job requirements (job analysis, write job description, test candidate pool, select candidate)
testing applications for neuropsychology
checklists for frontal lobe dysfunction, ex. Luria-nebraska neuropsychological battery, mini-mental state exam
testing applications for health psychology examples
Mcgill pain questionnaire (rating pain on sensory, affective, evaluative), TWEAK- alcoholism, beck depression inventory, HADS (hospital…)
testing applications for forensic assessment examples
assessment for insanity plea, competency to stand trial, prediction of violence and risk assessment, child custody, personal injury
testing applications for score feedback in training, coaching, and insight principles
tied to everyday activities, profile focus, developmental planning through compensatory strategies (re-shape & externalize) and developmental strategies (practice, coaching, goal, eval)
O*net is a resource for
human resource professionals
individually administered tests pros
personability, assurance of comprehension and focus, opportunity to intervene and rephrase questions
individually administered tests cons
experimenter bias, nerves under time constraint, report bias to prove themselves, high cost, time-consuming
woodcock johnson test consists of both ____ test batteries
achievement and cognitive ability
the NEO-PI-R personality model 5 domains
OCEAN
NEO-PI-R has _ domains, with _ aspects, and _ facets
5, 10, 30
reliability measures…
one and only one thing
reliability (precision) looks at consistency across..
items, time, other sources, and generalizability
validity means the test…
measures what it is supposed to measure
validity gathers evidence from…
item content, response process, internal structure, relationship to other variables (discriminant, convergent, criterion), and consequences of test use
what are the test standards (2014)
recommendations for using and interpreting test scores, developed and distributed by APA, AERA, and NCME
4 components of validity
evidence (empirical observations), theory (meaning of observations within frameworks), interpretation (meaning test users derive from scores), use of tests (test’s purpose and outcome)
validation is the joint responsibility of the…
test developer and test user
changes in validity from 1985 to 1999 to 2014
tripartite and outcomes (1985) to unitary form (1999) to unchanged (2014)
1954 view of validity was….meaning…
criterion view; validity = correlation with criteria with static properties (valid or not valid)
problems with criterion view of validity
not always one obvious criterion variable, some tests are used for different purposes in different groups, validity is dependent on test taker characteristics and test purpose/use
1966 + 1985 tripartite view of validity 3 components
criterion validity, content validity, construct validity
2 aspects of criterion validity
concurrent (criterion measured at the same time as test administered) and predictive (criterion measured at some time after test administered)
2 aspects of construct validity
convergent (theoretically related concepts show empirical relationships) and discriminant (theoretically unrelated concepts show no empirical relationships)
construct validity (cronbach and meehl 1955) is the idea of…
nomological network (the interlocking system of laws which constitute a theory)
problems with construct validity
when validation doesn't find the expected outcomes, is the theory misspecified or is the test invalid? no clear specification of how to test this
problems with tripartite view of validity (1966)
too much emphasis on validity in diff forms (distinction btwn convergent and concurrent not always clear), over-emphasis on correlations as proof, no explicit mention of the test use and consequences
validity in 1999/2014 standards is a property of the…
interpretation of test scores, not the test scores themselves
test content for validity refers to
relevance and representativeness
response process of validity refers to
if test is intended to capture a particular process, evidence should show that test does measure this process (ex. eye-tracking)
internal structure of validity means
# of subcomponents found empirically matches # subcomponents theoretically expected
relationship to other variables in validity consists of
convergent and discriminant evidence, test criterion relationships, validity generalization (rep for diff populations, conditions, and purposes)
intended and unintended consequences of testing def
consider consequences of testing (which can be unforeseen by the test developer and test user)
4 factors affecting reliability
people taking the test, item characteristics, test characteristics, method used to estimate reliability
people taking the test factor of reliability
reliability as variability btwn people, match btwn person-level and test-level (floor and ceiling effect)
item characteristics factor of reliability
2 item characteristics affect internal consistency reliability (number of items and inter-item correlation): a reliable test must have many items with small correlations or few items with strong correlations
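
The trade-off between many weakly correlated items and few strongly correlated items can be illustrated with the standardized form of Cronbach's alpha, alpha = k * r / (1 + (k - 1) * r), where k is the number of items and r is the mean inter-item correlation. This is a minimal sketch; the function name and the example values are hypothetical and not taken from the course material.

```python
def standardized_alpha(k, mean_r):
    """Standardized Cronbach's alpha for k items with mean inter-item correlation mean_r."""
    return (k * mean_r) / (1 + (k - 1) * mean_r)


# Hypothetical values: many items with small correlations ...
many_weak = standardized_alpha(k=20, mean_r=0.2)
# ... versus few items with strong correlations
few_strong = standardized_alpha(k=5, mean_r=0.5)

print(round(many_weak, 2), round(few_strong, 2))  # 0.83 0.83
```

Under these assumed values both tests reach about the same internal consistency, which is the point of the card.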
test characteristics factor of reliability
bandwidth (amount of info) vs. fidelity (accuracy of info) - more specific = higher reliability but don’t sacrifice content coverage to get reliability
method used to estimate reliability factor of reliability characteristics
internal consistency, alternate forms, test-retest
reliability being good enough depends on the…
purpose of testing (research 0.6-0.7, screening 0.8, diagnosis 0.9)
the max correlation btwn 2 variables determined by their…
reliabilities
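
One common way to express this ceiling comes from classical test theory: the observed correlation between two measures cannot exceed the square root of the product of their reliabilities. A small sketch under that assumption (the function name and the reliabilities 0.8 and 0.5 are made up for illustration):

```python
import math


def max_observed_correlation(rel_x, rel_y):
    """Upper bound on the observed correlation between X and Y given their reliabilities."""
    return math.sqrt(rel_x * rel_y)


# Hypothetical reliabilities for a test (0.8) and a criterion (0.5)
print(round(max_observed_correlation(0.8, 0.5), 2))  # 0.63
```

So even if the underlying constructs were perfectly related, an unreliable test or criterion would cap the validity coefficient that can be observed.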
a test must be reliable in order to interpret..
any evidence of validity
reliability increases as..
# of items increases (assuming similar quality across items)
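
The link between test length and reliability is usually described with the Spearman-Brown prophecy formula, which assumes the added items are of similar quality to the existing ones. The sketch below uses hypothetical function names and example values; it shows both the predicted reliability after lengthening and the length factor needed to reach a target.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when the test is made length_factor times as long."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


def length_factor_needed(reliability, target):
    """How many times longer the test must be to reach the target reliability."""
    return (target * (1 - reliability)) / (reliability * (1 - target))


# Hypothetical test with reliability 0.6
print(round(spearman_brown(0.6, 2), 2))          # doubling the length -> 0.75
print(round(length_factor_needed(0.6, 0.9), 1))  # 6.0 times as long to reach 0.9
```

The second result illustrates the diminishing returns: reaching a diagnostic-level reliability by length alone can require a much longer test.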
why not to keep increasing test length?
boredom, exhaustion, low motivation
for best reliability, make test ___ as possible…
short; within acceptable reliability heuristics
2 solutions to short, reliable tests
adaptive testing (test adapts to person ability level as they go on) and computerized adaptive testing
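
The idea of a test that adapts to the person's ability level can be sketched as a loop that always gives the unused item closest in difficulty to the current ability estimate, then moves the estimate up or down depending on the answer. This is a deliberately simplified illustration, not how operational CAT systems estimate ability (those typically rely on item response theory); the function, the item pool, and the answer rule are all hypothetical.

```python
def run_adaptive_test(item_difficulties, answer_fn, n_items=3, start=0.0, step=1.0):
    """Give n_items adaptively; return (ability_estimate, administered_items)."""
    remaining = list(item_difficulties)
    theta = start
    administered = []
    for _ in range(min(n_items, len(remaining))):
        # Pick the unused item whose difficulty is closest to the current estimate
        item = min(remaining, key=lambda b: abs(b - theta))
        remaining.remove(item)
        correct = answer_fn(item)
        # Move the estimate up after a correct answer, down after an error
        theta += step if correct else -step
        # Use smaller adjustments as more answers come in
        step /= 2
        administered.append((item, correct))
    return theta, administered


# Hypothetical item pool and a test taker who solves every item up to difficulty 1.2
pool = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
estimate, log = run_adaptive_test(pool, answer_fn=lambda b: b <= 1.2)
print(estimate, log)
```

With only three items the estimate already lands near the assumed ability, because no items are spent on questions that are far too easy or far too hard.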
advantages of computerized adaptive testing (CAT)
shorter tests with strong reliability (cheap/no issues with motivation, etc), test security maintained easier, motivation factors
CAT is most appropriate for …
large-scale testing where test security is an issue
disadvantages of CAT
substantial prep and outlay needed (large item pool development and analysis, automated programming algorithm), requires computerized administration
people may distort their responses in…
high stakes situations
3 forms of response distortion
self-deceptive enhancement, self-deceptive denial, impression management (conscious)
egoistic bias (value=agency) linked to
self-deceptive enhancement
moralistic bias (value=communion) linked to
self-deceptive denial
faking good often occurs in
employment selection, education selection, dating evals
faking bad often occurs in
legal context, edu context, military
4 methods to detect faking
lie scales, response time rubrics, over-claiming technique, bayesian truth serum
5 methods to reduce faking
forced choice format, verifiable statements, other-reports, warnings, implicit measurement techniques
lie scales meaning and examples
statements everyone must’ve done once, if not you’re lying; Marlowe-Crowne social desirability scale + MMPI scale
lie scale problems
relate to substantive personality traits (could measure actual aspects of personality)
in over-claiming, it compares
real and foil terms to see if people are overclaiming
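
A rough way to picture the comparison: count how often a person claims familiarity with real terms (hits) and with foils that do not exist (false alarms). The sketch below is a simplified illustration with placeholder term names; published over-claiming indices are usually built from these two rates with signal detection formulas, which are not shown here.

```python
def overclaiming_rates(claimed, real_terms, foil_terms):
    """Return (hit_rate, false_alarm_rate) for one test taker."""
    claimed = set(claimed)
    # Share of real terms the person says they know
    hit_rate = len(claimed & set(real_terms)) / len(real_terms)
    # Share of non-existent foil terms the person says they know
    false_alarm_rate = len(claimed & set(foil_terms)) / len(foil_terms)
    return hit_rate, false_alarm_rate


# Placeholder terms, purely for illustration
real = ["real_1", "real_2", "real_3", "real_4"]
foils = ["foil_1", "foil_2"]
print(overclaiming_rates(["real_1", "real_2", "foil_1"], real, foils))  # (0.5, 0.5)
```

A high false-alarm rate suggests over-claiming, since the person reports knowing things that cannot be known.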
bayesian truth serum is
test takers estimate the proportion of people who would answer the same way, often over-estimate
5 types of warning to reduce faking; most effective one
detection, consequences, reasoning, educational, moral; consequences
implicit measurement technique to reduce faking ex
implicit association test (unconscious associations)
3 paradigms in faking research
group comparison, instructed faking, incentive manipulation
group comparison of faking meaning
compare job applicants to others, must measure lower limit of faking bc could be real group differences
instructed faking
comparing scores under “answer honestly” and “max your score”
incentive manipulation in faking
compare scores under no stakes and stakes to do well
5 reasons for measuring job performance
decision making about individuals, organization planning, legal requirements in jobs, feedback, eval procedures or changes
5 subjective measures of job performance
graphic rating scales, behaviorally anchored rating scales (BARS), behavioral observation scale, checklists, narratives
graphic rating scales pros and cons
pros: simple, easy, time efficient, quant comparisons, flexible (many jobs)
cons: lack of context, easy to give bias, prone to rating errors, limited behavioral specificity, not a lot of info for development/feedback
behaviorally anchored rating scales (BARS) pros and cons
pros: reduced ambiguity (observation basis), high content validity, reduced likelihood of common rating errors, more info with clearer benchmarks (internal consistency)
cons: time consuming, costly, lacks generalizability, can’t capture internal processes, oversimplify complex performance
behavioral observation scale pros and cons
pros: improved objectivity, high content validity, reduced rating biases, useful for feedback and development
cons: frequency doesn’t equal effectiveness, requires frequent observation, can’t eval internal processes, time consuming, costly
checklists pros and cons
pros: reduced rating biases, improved objectivity, high content validity
cons: too little differentiation, ignores quality, inconsistent checking bc long, limited usefulness for feedback, no context
2 types of objective measures of job performance
production counts and biodata
problems with objective data (for job perform)
production counts not always possible, not counting quality, production dependent on situational variables
6 sources of error in rating scale data
social desirability, leniency/severity errors, “halo/horns” effect, recency effects, causal attribution errors, personal biases
2 parts of performance appraisal
assessment and feedback
8 performance feedback principles
descriptive, specific, appropriate, directed to changeable behaviors, well timed, honest, understood, proactive
2 measures of job satisfaction with examples
global measures and specific measures (minnesota satisfaction questionnaire & job descriptive index)
job satisfaction shows a ___ relationship with job performance
small (0.3)
burnout common assessment
maslach burnout inventory
3 types of factors to increase job satisfaction
work (rotation, enlargement, enrichment), pay (fairness, skill-based, merit-based, profit sharing), hours/flexibility (compressed work week, flextime)
Schmidt & Hunter (1998) equation
savings/employee/year = r (validity, according to table in reading for test) * SD (as a percent of salary for the job) * Z (z-score for the top percent of ppl hired, from the p to z calc)
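
A worked example of the equation, as a minimal sketch. All numbers are hypothetical: a validity of 0.5, a salary of 50,000 with SD assumed to be 40% of salary, and a Z of 1.0 for those hired. The function name is made up for illustration.

```python
def savings_per_employee_per_year(r, sd, z):
    """Utility estimate: validity (r) times SD of performance in dollars times Z of those hired."""
    return r * sd * z


salary = 50_000        # hypothetical annual salary
sd = 0.40 * salary     # assumed SD of performance as a percent of salary
print(round(savings_per_employee_per_year(r=0.5, sd=sd, z=1.0)))  # 10000
```

Because the three terms multiply, a more valid method (higher r) or a more selective hiring ratio (higher Z) raises the estimated savings proportionally.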
Kunter (2013) two hypotheses abt aptitude for teaching
individual aptitude for teaching (born to teach), qualification hypothesis (trained to teach)
3 steps/factors used in initial teacher education (ITE) selection
review of background, eval of cognitive factors, eval of social and emotional characteristics
how much is known about the fairness and predictive validity of teacher selection methods
very little (effect size .12)
6 stages of research supported selection process in teaching
identify and prioritize selection criteria, eligibility checks, screening, intensive selection, selection into ITE program, monitoring outcomes
likert scales can differ in
target construct, number of points, description of points, source of report — easy to administer and fake
2 sources of assessment
other and self ratings
2 frames of reference (in assessment)
general and context
using multiple sources/theories/info to increase credibility and validity of findings is often through
triangulation (self report, other report, objective data)
2 alternative assessment tools
situational judgement tests (SJT), multiple mini interviews
SJT def and focus
measurement method designed to assess judgement in work-relevant situations written or in video; focus on social and emotional attributes
pros and cons of SJT
pros: high predictive validity, high fairness, lower fakeability, preferred by candidates, low cost
cons: construct validity, initially labor intensive to create
multiple mini interviews def and assumption
interviews rotating through 8-12 stations for 7-10 mins each on various tasks (SJT, behavioral interview, unstructured interview); assumes greater sampling of behaviors provides more info about the suitability of candidate and reliability of test