2.8b: Psychometrics + Intelligence
Psychometrics: scientific study of the measurement of human abilities, attitudes, and traits
an approach to defining intelligence that attempts to measure intelligence w/ carefully constructed psychological tests
Intelligence Tests
Intelligence tests: a method for assessing an individual’s mental aptitudes and comparing them w/ those of others, using numerical scores
intelligence tests are designed by psychometricians
⬇ Modern Tests
Achievement tests: a test designed to assess what a person has LEARNED
Ex: AP exam
Aptitude tests: a test designed to predict a person’s future performance; aptitude is the capacity to learn
aptitude supports achievement
measures POTENTIAL ABILITY
Ex: SAT
Francis Galton - Eugenics
Eugenics: the discriminatory 19th and 20th century movement that proposed measuring human traits and encouraging only those deemed “fit” to reproduce
early attempt to measure natural ability
Alfred Binet & Théodore Simon
→ Alfred Binet leaned toward an environmental explanation of intelligence differences
developed questions to predict children’s future progress
meant to help children w/ special needs
Mental age: a measure of intelligence test performance devised by Binet; the chronological age that typically corresponds to a child’s level of performance (ex: a child who performs like an average 8-year-old has a mental age of 8)
→ Binet’s Mental Ability Test: the first practical, standardized intelligence test designed to identify French school children needing special education support
Lewis Terman
→ Revised Binet’s work for use in the U.S — assumed intelligence tests revealed a mental capacity present from birth
⬇
Stanford-Binet: the widely used American revision (by Terman at Stanford) of Binet’s original intelligence test

Intelligence Quotient (IQ): defined originally as the ratio of mental age to chronological age multiplied by 100
does not work well w/ adults
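A minimal sketch of the original ratio formula above. The function name `ratio_iq` and the sample ages are made up for illustration; the last line shows one common explanation for why the ratio fails w/ adults (test performance levels off while chronological age keeps rising), which is an added assumption, not something stated in these notes.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Original ratio IQ: (mental age / chronological age) x 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    # Multiply first so whole-number ages give clean results.
    return mental_age * 100 / chronological_age


# An 8-year-old who performs like a typical 10-year-old (hypothetical child).
print(ratio_iq(10, 8))   # 125.0

# A child performing exactly at age level scores 100.
print(ratio_iq(10, 10))  # 100.0

# Hypothetical adult: a 40-year-old performing like an average 20-year-old
# would get 50, which is why the ratio is misleading for adults.
print(ratio_iq(20, 40))  # 50.0
```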

David Wechsler
Wechsler Adult Intelligence Scale (WAIS): contains 15 verbal and performance (nonverbal) subtests that yield scores for: verbal comprehension, perceptual organization, working memory, & processing speed
WISC (Wechsler Intelligence Scale for Children)
reliable intelligence test

Test Construction
A good test is:
standardized
valid (predictive, content, construct, or face)
reliable
→ A test that is valid must be reliable
→ A test can be reliable but not valid
Standardization: defining uniform testing procedures and meaningful scores by comparison w/ the performance of a pretested group
→ 2 steps:
Uniform procedures:
same instructions
same time limits
same test questions
Test norms
provide info about where a score ranks in relation to other scores
how? pretest a rep sample
norms need to be updated sometimes
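One way to picture test norms: once a representative sample has been pretested, a new score can be ranked against it. The sketch below uses percentile rank as the ranking method, which is an assumption (these notes do not say how norms are reported); `percentile_rank` and the sample scores are made up.

```python
from typing import Sequence


def percentile_rank(score: float, norm_sample: Sequence[float]) -> float:
    """Percent of the pretested (norm) sample scoring at or below `score`."""
    if len(norm_sample) == 0:
        raise ValueError("norm_sample must not be empty")
    at_or_below = sum(1 for s in norm_sample if s <= score)
    return 100 * at_or_below / len(norm_sample)


# Made-up raw scores from a pretested representative sample.
norm_sample = [12, 15, 18, 20, 21, 23, 25, 27, 30, 34]

# A new test-taker's raw score of 25 is at or above 7 of the 10 norm scores.
print(percentile_rank(25, norm_sample))  # 70.0
```

If the norm sample gets outdated, the same raw score would land at a different rank, which is why norms need updating.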
Validity: the extent to which a test measures/predicts what it claims to measure
accuracy
⬇
→ Types of Validity:
Construct Validity: used to ensure the test is actually measuring what it is intended to measure and not extraneous factors
Construct in research: a concept/abstract idea that cannot be directly observed but is inferred from observable behaviors (eg. happiness, motivation)
Ex: a survey measuring “social anxiety” — it asks abt fear of social situations
Content Validity: how well a measure reflects the entire range of material it is supposed to be testing
Ex: a personality test measuring the Big Five Traits — if all 5 traits are included, the test has high content validity for assessing broad personality dimensions
Face Validity: whether a test appears to measure what it’s supposed to measure — concerned w/ whether a measure seems relevant and appropriate for what it’s assessing only on the surface
→ Criterion-Related Validity:
⭐️ Predictive Validity: the success w/ which a test predicts the behavior it’s designed to predict
When the criterion measures are obtained at a time after the test
Ex: does the AP exam predict how well an AP student would do in an introductory college course in Psychology?
Concurrent Validity: a method that involves comparing a new test w/ an already existing test, or an already established criterion
When criterion measures are obtained at the same time as test scores, indicating the ability of test scores to estimate an individual’s current state
Ex: aptitude test for pilots — do pilots’ scores on test correlate w/ performance ratings of their supervisors?
Reliability: the degree to which a test produces the same scores over time
measured using correlation
consistency — a test is reliable if it yields the same results over time
⬇
→ Methods for Measuring Reliability
Split-half testing: the test is split into 2 parts and an individual’s scores on both halves are compared
agreement of odd-numbered question scores and even-numbered question scores
Test-retest
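A rough sketch of the split-half method described above: total each person's odd-numbered and even-numbered questions, then correlate the two sets of totals. The helper names (`pearson_r`, `split_half_reliability`) and the answer data are made up for illustration; real tests usually apply further corrections that are not covered here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return cov / sqrt(var_x * var_y)


def split_half_reliability(responses):
    """responses: one list per test-taker, 1 = correct, 0 = incorrect."""
    odd_totals = [sum(person[0::2]) for person in responses]   # questions 1, 3, 5, ...
    even_totals = [sum(person[1::2]) for person in responses]  # questions 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)


# Made-up answers from 5 test-takers on a 6-question test.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(round(split_half_reliability(responses), 2))
```

A result close to 1 means the two halves agree (high reliability). The same `pearson_r` helper could be used for test-retest by passing each person's scores from two separate sessions.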