Psychological Assessment Exam 3

Validity: measuring what you think you are measuring

Reliability: how close to the true score are we likely to be 

Practicality: Does it make sense to apply it to this setting? Is it worth it? Related to utility

Cross-Sectional Fairness: Is it accurate for someone in this group? Across other tests, do you get the same answer? Absence of test or assessor biases

  • Correlation: the degree of the relationship between 2 variables

  • Separate from causation 

  • (relationship could reflect causation, but need experiment to know for sure)

  • Negative, Positive, Strong and Weak correlations 

  • No Correlation: no relationship; no pattern, random, no meaningful predictors

  • Info about scatter plots:

  • Some graphs don’t have a line of best fit because the data has no correlation 

  • The closer to -1 or +1, the STRONGER the relationship and the better your prediction, because it’s more accurate

  • The closer to 0 the weaker the relationship!

Correlation does not equal causation!!
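
A minimal sketch of how the correlation coefficient (r) behaves, assuming the usual Pearson formula. The function name `pearson_r` and the sample numbers are made up for illustration, not taken from the course handout.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much x and y move together, and how much each varies on its own
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    if variation_x == 0 or variation_y == 0:
        raise ValueError("r is undefined when one variable never changes")
    return co_variation / math.sqrt(variation_x * variation_y)

hours_studied = [1, 2, 3, 4, 5]           # hypothetical data
print(pearson_r(hours_studied, [2, 4, 6, 8, 10]))   # perfect positive: r near +1
print(pearson_r(hours_studied, [10, 8, 6, 4, 2]))   # perfect negative: r near -1
print(pearson_r(hours_studied, [2, 1, 3, 1, 2]))    # no linear pattern: r near 0
```

The sign gives the direction and the distance from 0 gives the strength. None of this says which variable (if either) causes the other.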


3 types of validity:

2. Criterion related validity: relationships between scores and other measures. Do the scores predict the performance on a criterion?

  • Crit= outcome measure of interest

  • Crit pre-requisites: must be VALID and noncontaminated (meaning it’s independent of the test; they can’t share any items)

  • Two Types of Criterion Related Validity: 

Concurrent Validity: can predict score now (very quickly). Ex: 100 clients take the Beck Depression Inventory (BDI). 500 people take the alcohol test.

Predictive Validity: can predict scores/performance in the future. Ex: using your current college GPA to set your insurance rate, based on the potential risk of you crashing given how low or high your GPA is. SAT scores correlated w/ college GPA. GRE correlated w/ graduation rate.

  • Both are about predicting

  • Predictive validity related stats: expectancy tables and standard error of estimate (check the handout)

SEest: average distance from the regression line 

High r: strong relationship between measure (test score) and what you are predicting. No one is that far off; predict the score on the line and you will be close

Low r: weak relationship between measure (test score) and what you are predicting

No r: no relationship between measures 
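
A rough sketch of why a high r means predictions land close to the line. It assumes the common textbook formula SEest = SD of the criterion × sqrt(1 − r²); the handout may use a slightly different (sample-corrected) version, and the function name and numbers here are hypothetical.

```python
import math

def standard_error_of_estimate(sd_criterion, r):
    """Typical size of prediction errors around the regression line.

    Assumes SEest = SD_criterion * sqrt(1 - r^2) (textbook formula).
    """
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and +1")
    if sd_criterion < 0:
        raise ValueError("a standard deviation cannot be negative")
    return sd_criterion * math.sqrt(1 - r ** 2)

# Hypothetical criterion with SD = 15
for r in (0.0, 0.5, 0.8, 1.0):
    print(r, round(standard_error_of_estimate(15, r), 2))
```

With no relationship (r = 0) the error is as large as the criterion's own SD, so the test adds nothing. As r approaches 1 the error shrinks toward 0.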


  • Standards for prediction vary

  • Set cut off scores that optimize ‘hit rate’ for situation 

Hit rate: hits / (hits + misses)

  • Hit: accurate predicted classification 

  • Miss: false negatives and false positives

  • False positives: predict high/pass/trait present, but it’s actually not

  • False negatives: predict the absence of something but it’s actually there
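
The hit-rate idea above can be sketched as follows. This is only an illustration: the function names, the "score at or above the cutoff means predicted success" rule, and the data are all assumptions, not the handout's method.

```python
def hit_rate(hits, misses):
    """Hit rate = hits / (hits + misses)."""
    total = hits + misses
    if total == 0:
        raise ValueError("need at least one prediction")
    return hits / total

def count_outcomes(scores, outcomes, cutoff):
    """Classify each person as a hit, false positive, or false negative."""
    counts = {"hit": 0, "false_positive": 0, "false_negative": 0}
    for score, actual in zip(scores, outcomes):
        predicted = score >= cutoff          # assumed decision rule
        if predicted == actual:
            counts["hit"] += 1
        elif predicted:
            counts["false_positive"] += 1    # predicted success, did not succeed
        else:
            counts["false_negative"] += 1    # predicted failure, actually succeeded
    return counts

def best_cutoff(scores, outcomes):
    """Try each observed score as the cutoff; keep the one with the top hit rate."""
    def rate(cutoff):
        c = count_outcomes(scores, outcomes, cutoff)
        return hit_rate(c["hit"], c["false_positive"] + c["false_negative"])
    return max(sorted(set(scores)), key=rate)

# Hypothetical test scores and whether each person later succeeded
scores = [40, 45, 50, 55, 58, 60, 65, 70, 75]
succeeded = [False, False, False, True, True, False, True, True, True]

cutoff = best_cutoff(scores, succeeded)
print(cutoff, count_outcomes(scores, succeeded, cutoff))
```

This treats both kinds of miss as equally bad. In a real setting one type of miss may cost more than the other, which is why standards for prediction vary.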


Construct Validity  

Construct: theoretical, intangible quality people vary on (ex: intelligence, leadership, psychopathy, anxiety, hostility, and self-esteem)

  • We infer that these qualities are real and that they exist. We try to group together predictable patterns of behavioral characteristics over related items  

  • (construct is the assumed reason for the pattern)

  • Construct validity asks…is your quality measurable? and is this an accurate measure of it? 

  • This is broader in comparison to content or criterion validity 

  • Convergent Validity: scores correlate highly, as expected, with other tests (positively or negatively)

  • Ex: older, established tests of the construct or related measures

  • Discriminant Validity: scores show little or no relationship to measures that the theory predicts they should not be related to


Reliability is about consistency: do you get the same results after each test?

  • It implies that there’s very little error and it’s near to the true score


Reliability Coefficient: a stat that quantifies reliability. Ranges from 0 (not reliable) to 1 (reliable)

  • Classical Test Theory: Spearman, 1904. Item Response Theory: probability of getting an item correct should be related to item difficulty and overall skill level

*** Variance of T / (variance of T + error variance)

  • Reliability is a measure of the variability of true scores divided by the variability of the observed scores. (True + Error)

  • Closer to a value of zero 

  • Random error can be unpredictable: environmental problems (temperature), examinee state (sleepy, not feeling well), administration error, rapport issues, test scoring errors, judgment errors

  • Standardization (define it)
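
The variance ratio in the reliability notes above can be sketched in a few lines. In practice true-score variance is never observed directly, so the inputs here are hypothetical numbers used only to show how the ratio behaves.

```python
def reliability_coefficient(true_variance, error_variance):
    """Reliability = true-score variance / (true-score variance + error variance)."""
    if true_variance < 0 or error_variance < 0:
        raise ValueError("variances cannot be negative")
    observed_variance = true_variance + error_variance
    if observed_variance == 0:
        raise ValueError("scores must vary for reliability to be defined")
    return true_variance / observed_variance

print(reliability_coefficient(90, 10))   # little error, reliability is high
print(reliability_coefficient(50, 50))   # half of the observed variance is error
print(reliability_coefficient(90, 0))    # no error at all
```

The less error variance there is, the closer the coefficient gets to 1.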


Different Ways to Measure Reliability:

  • Test Retest (The gold standard): correlates scores from the same test given at different times

  • Possible issues…practice effects, look up answers 


Alternate Forms:

  • Same content tapped differently but equally 

  • correlate people’s scores on both versions

  • harder than you might guess 


Interscorer (interjudge/interrater)

  • Same people different administrators

  • *Use correlation coefficient 


Split Half (odd-even):

  • try to divide into 2 equally difficult halves, correlate the scores

  • useful if cost or practice effects would impact test-retest (especially if they impact some people more than others)

  • tends to be lower b/c shorter; can correct for that, but still issues w/ how to pick your halves…led to


Coefficient Alpha:

  •  mean of all possible split halves 
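
A sketch of the two internal-consistency ideas above. It assumes the length correction for split-half is the Spearman-Brown formula and that coefficient alpha is computed with the usual Cronbach formula; neither formula is spelled out in these notes, and the function names and item data are hypothetical.

```python
from statistics import pvariance

def spearman_brown(r_half):
    """Estimate full-length reliability from the correlation between two halves."""
    if r_half <= -1:
        raise ValueError("correction is undefined for r_half <= -1")
    return 2 * r_half / (1 + r_half)

def coefficient_alpha(responses):
    """Cronbach's alpha; `responses` has one list of item scores per person."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each item across people, and variance of people's total scores
    item_variances = [pvariance([person[i] for person in responses]) for i in range(k)]
    total_variance = pvariance([sum(person) for person in responses])
    if total_variance == 0:
        raise ValueError("total scores must vary")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical right/wrong (1/0) answers from five people on four items
responses = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(spearman_brown(0.6))            # halves correlate .6, full test estimated higher
print(coefficient_alpha(responses))
```

Alpha avoids the "which halves?" problem because it does not depend on any one way of splitting the test.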


Inter-Item Consistency:

  • degree of correlation among all items


.9 or .95 is the goal for most tests

as low as .7 can be accepted 

  • research sometimes accepts even lower values 

  • should find stats in the test manuals

Are the items homogeneous or heterogeneous in nature?




UNIT 3: CHAPTERS 8 AND 9

The Assessment of Ability

  • Aptitude: estimating your potential for learning

  • Achievement Tests: what you already know 

  • Intelligence…

  • to learn from experience 

  • to acquire knowledge 

  • to recognize and solve problems/adapt to environment 

  • to think abstractly, to reason, to understand complexity 

  • and the speed you learn and gather data

  • Others include additional concepts: interpersonal skills, morality, compassion, loyalty 

  • most agree nature and nurture contribute (modifiable within LARGE genetically based limits)

  • Eastern Cultures Emphasize: benevolence, humility, doing what’s right 

  • African Cultures Emphasize: maintaining harmonious and stable intergroup relations 

  • Western Cultures Emphasize: Generally expanding

1. Factor analytic theories (psychometric approach): identify the ability or groups of abilities that constitute intelligence (if more than one, are they related?)

  • Factor analysis: correlational technique allowing us to identify clusters of items that are related, possibly indicating a meaningful concept

  • Many exist; some are over 100 years old

  • Single trait v.s. Independent trait v.s. Hierarchical 

2. Information Processing theory/model: identifies specific mental processes that are applied during problem solving (how we process the info; these models are not interested in what we process)

  • Metacognition: thinking about thinking 


  • Spearman’s Two Factor Theory (early 1900’s)

  • little “g”, “g factor” or “g”

  • “g” is influencing everything

  • little g is the ability to reason and solve problems: impacts everything, explains high correlations among skills

  • General intelligence: electrochemical mental energy or power (CS), physiological integrity of NS (SMS) (the efficiency and effectiveness of the NS)

  • The best measure of little g is abstract reasoning problems

  • Best measure: nature and nurture, experience, trauma, SES, being socially isolated 

  • S Factors: the ability to excel in certain areas; specific intelligences (art, music, business, etc). The two factor model was too simple

  • Group Factor: an intermediate construct that impacts most, but not all, skills. Not general g or specific s

  • Spearman v.s. Thurstone

  • Thurstone believed in Primary Mental Abilities; believed that there were seven independent factors…later found out that these traits are connected

  • 1. Verbal Comprehension, 2. Word Fluency, 3. Number, 4. Space, 5. Associative Memory, 6. Perceptual Speed, 7. Inductive Reasoning 

  • Most people agree with the hierarchical model

Cattell-Horn-Carroll (CHC) Theory

  • Started with Cattell (40’s)

  • Revised by Horn (68)

  • Expanded by Carroll (1993)

  • Factor analytic theory is dominant method 

  • Crystallized Intelligence: reflects ability to accumulate knowledge, use verbal skills, and use strategies (like school info); a repository; culture/schooling dependent. Somebody else taught you how to do this

  • Fluid Intelligence: used when reasoning, ability to solve problems, see relationships, to reason abstractly, and spatial. Nonverbal, free of instruction, independent of culture 

  • People argue that fluid intelligence is best practiced through little g

  • Synthesizing most other factors results in one model. Good overall but still refining details 

  • Planning:  selection, use and monitoring of effective problem solving strategies. Ex: using feedback 

  • Attention: receptivity to info Ex: ability to keep attention and block out distractions 

  • Simultaneous: parallel processing; think and perceive at the same time. Ex: drawing, order doesn't matter

  • Successive: sequential processing; thinking that requires a specific order. Ex: language and calculations

  • Many things take simultaneous and successive learning. Can be difficult to do a task that tackles just one 

  • Cognitive Assessment System attacks all 4^^


New Theories:

  • Gardner: Theory of multiple intelligences. Howard Gardner thought intelligence was too vast and too complex to measure as we do. Initially came up with 7 forms of intelligence and then added more throughout the years. (7, 8, 9, 10)

  • Sternberg: Triarchic Theory of Successful Intelligence 

  • Both felt traditional IQ tests were too limited 

  • Expanded to include things like creativity, music, people skills, body intelligence 

  • The information processing approach is more fair

  •  Gardner and Sternberg thought intelligence tests were too limited 

  • Existential: sensitivity and capacity to tackle deep questions about the meaning of life, human existence and why we die.

  • Pedagogical: understanding how learning happens and knowing how to shape one's own learning to help others learn 


Evaluation:

  • people like this method because it seems fair and optimistic 

  • Most psychologists recognize the value of expanding "intelligence" along the lines of the theory of multiple intelligences

  • Also look at the broader


Sternberg’s Triarchic Theory of Successful Intelligence:

  1. Analytical: ability to acquire knowledge and break problems into parts. Problem solving 

  2. Creative: ability to think outside the box. Also tracks problem solving. The speed of learning new things 

  3. Practical: also known as emotional intelligence; social skills, common sense, can you “read the room,” adapting and shifting to your environment

  • Now he talks about wisdom. Adaptive Intelligence includes all 4** 


Intelligence Quotient (IQ): how you compare to others your age on a specific intelligence test. Note: IQ’s from most accepted tests are correlated

  • Does it = intelligence 

  • It’s your current best estimate, it is useful but it’s not w/o problems. (need to know test limits and possible administration limits)  

  • Taking the test and seeing how they compared to other people in their age group

  • We expect people to gain more knowledge as they age so that’s why we compare intelligence based on age 

  • Does it = intelligence? This is our current best estimate but it is abstract. Know the limits.  Be aware of the strength and weaknesses 


Why are we testing intelligence?

  • To assist people, from those who are struggling to those who are excelling

  • Used to identify intellectual disabilities/learning disorders 

  • Guide admission to private schools

  •  Helps predict academic success

  • Helps predict occupational success 

  • It DOES NOT predict mental health 

  • Assess intellectual ability after brain injury 

  • To assess/document impairment and/or legal competence

  • To diagnose disability: access to benefits

  • Personality testing as a part of full assessment for diagnosis and/or therapeutic intervention 

  • For guidance in career


  1. Test people on items assumed to tap into intelligence (items vary per test based on theory; typically a broad range)

  2. Raw scores are converted to IQ scores based on raw score distribution for age. (Scores fall along normal curve; can manipulate test if not)

  • arbitrarily but consistently use 100 as the mean with an SD of 15. Called a deviation IQ (new way)


Old way…

(ratio)    IQ = (MA / CA) * 100


MA = Mental Age (what you score like; could be lower, higher or average) 

CA =Chronological Age (Actual Age)


WE ONLY USE DEVIATION NOW**
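
The old and new ways of scoring can be compared with a small sketch. The ratio formula is the one given above; the deviation version assumes the raw score is first turned into a z-score against the person's own age group and then rescaled to mean 100, SD 15. Function names and example numbers are hypothetical.

```python
def ratio_iq(mental_age, chronological_age):
    """Old way: IQ = (MA / CA) * 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100

def deviation_iq(raw_score, age_group_mean, age_group_sd, mean=100, sd=15):
    """New way: where the raw score falls in the age group's distribution."""
    if age_group_sd <= 0:
        raise ValueError("age-group SD must be positive")
    z = (raw_score - age_group_mean) / age_group_sd   # SDs above/below age peers
    return mean + sd * z

# An 8-year-old who scores like a typical 10-year-old
print(ratio_iq(10, 8))            # 125.0

# Raw score of 60 in an age group with raw mean 50 and raw SD 10 (made-up norms)
print(deviation_iq(60, 50, 10))   # 115.0
```

The ratio breaks down for adults because mental age stops rising with chronological age, which is one reason only the deviation IQ is used now.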


  • We’re moving away from archaic labels like “superior”, “borderline”, “idiot”, “imbecile”, “moron”, “feeble minded” and “mental retardation” and using less harsh identifiers.

Verbal Comprehension: ability to access and find words, to retrieve info

Perceptual Reasoning: 

Working Memory:

Processing Speed:

  • Standardization (2200 individuals and cooperative healthy; used race, gender, age, etc); Reliability, Validity 

  • SB5: Stanford Binet 5: Average 100 SD: 15. 5 factors, 2 domains 


The Cognitive Assessment System II:

  • Ages 5-18

  • 12 subtests, full scale and 4 process scores (based on PASS)

  • Stratified US sample 2200

  • Acceptable reliability and Validity 

  • White/Black race differences smaller than traditional tests: controlling for variables like SES. Correlates highly with grades for both groups. Use may help decrease overrepresentation of Black children in special ed

  • There’s typically a 15 point difference between black and white Americans. 


Non-verbal Tests of Intelligence:

  • Comprehensive Test of Nonverbal Intelligence, Peabody Picture Vocabulary Test

  • Raven’s Progressive Matrices (they say this is the purest measure of “g”). This test is supposed to be culture free. For ages 5+. You have to choose the missing piece 

  • Flynn Effect (Group) apparently “we are smarter than our parents”

  • Stability over age (Individual)

  • Flynn Effect: Increase in average IQ score; about 3pts/decade (then they retake the test); biggest jump - prob solve not general knowledge

  • Start off as measured raw scores

  • Environmental impact on IQ

  • Better nutrition, parental care, medical care 

  • Overall environ changes, teaching styles, overall ed experiences, tech advances, complexity of life, smaller families, more time in cognitive based leisure activities

  • On older tests you seem smarter. On a new test you seem less smart. As time progresses society learns new things. Old tests get outdated and we learn from historical mistakes
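
A back-of-the-envelope sketch of why outdated norms make people look smarter, using the "about 3 pts/decade" figure from the notes above. It assumes the rise is steady and linear, which is a simplification; the function name is hypothetical.

```python
def expected_norm_inflation(years_since_norming, points_per_decade=3.0):
    """Rough number of IQ points a score is inflated by when norms are old.

    Assumes a steady, linear Flynn effect (a simplification).
    """
    if years_since_norming < 0:
        raise ValueError("years since norming cannot be negative")
    return points_per_decade * years_since_norming / 10

# A test normed 20 years ago
print(expected_norm_inflation(20))   # 6.0
```

So the same performance that earns 100 on freshly normed tests could look like roughly 106 on a test normed two decades earlier.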


Infant/Childhood trend. 

  • Early intervention is useful to identify development delays . Measure multiple domains: cognitive, social and motor

  • We don’t identify if a young child is gifted because infants change so rapidly so they may easily fall behind as fast as they excel

  • By age 4 you can predict adolescent scores; from middle childhood through the teenage years you can better predict their intellectual level as an adult

  • Crystallized skills are maintained much longer than fluid skills. Fluid maxes around 20’s and 30’s, then declines slowly, and faster around 75+

  • If good health, older = wiser. see multiple perspectives; respect — know their limits, less distorted by negative emotions 


Contribution to Intelligence:

  • Strong evidence that both nature (genes) and environment matter

  • Comparing Twin research, Comparing siblings and comparing non-related siblings. They test these groups while they’re really young and close in age

  • Adopted kids’ scores correlate less with adoptive family and more with bio family over time (IQ and verbal measures)

  • Similarity between identical twins increases over time 

  • 2+3 are NOT saying environment doesn’t matter; the correlations do not become 1.0, but these shifts over time are certainly thought provoking

  • Genetic hypothesis for group differences >> No real support 

  • Standard IQ for races: Asian, White, Native/Latin, Black

  • A large part of that is due to SES^

  • Social Identity threat: if you are reminded of your group membership, you are more likely to score along the lines of the stereotypes attached to your identity. (Ex: whites scoring higher if they’re reminded they have higher IQ test scores, and the opposite for black Americans)

  • Within groups there are generation differences (ex:Flynn and average IQ for blacks is increasing more than whites)

  • Given the same info Blacks and Whites show similar info processing skills (PASS-CAS) (FAGAN)

  • In different eras, different ethnic groups have experienced  of remarkable