2.8b: Psychometrics + Intelligence

Psychometrics: scientific study of the measurement of human abilities, attitudes, and traits

an approach to defining intelligence that attempts to measure intelligence w/ carefully constructed psychological tests

Intelligence Tests

Intelligence tests: a method for assessing an individual’s mental aptitudes and comparing them w/ those of others, using numerical scores

intelligence tests are designed by psychometricians

⬇ Modern Tests

Achievement tests: a test designed to assess what a person has LEARNED

Ex: AP exam

Aptitude tests: a test designed to predict a person’s future performance; aptitude is the capacity to learn

aptitude supports achievement
measures POTENTIAL ABILITY
Ex: SAT

Francis Galton - Eugenics

Eugenics: the discriminatory 19th and 20th century movement that proposed measuring human traits and encouraging only those deemed “fit” to reproduce

early attempt to measure natural ability

Alfred Binet & Simon

→ Alfred Binet leaned toward an environmental explanation of intelligence differences

developed questions to predict children’s future progress
meant to help children w/ special needs

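Mental age: a measure of intelligence test performance devised by Binet; the chronological age at which children typically perform at the same level as the child being tested (e.g., an 8-year-old who performs like a typical 10-year-old has a mental age of 10)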

→ Binet’s Mental Ability Test: the first practical, standardized intelligence test designed to identify French school children needing special education support

Lewis Terman

→ Revised Binet’s work for use in the U.S.; assumed intelligence tests revealed a mental capacity present from birth

⬇

Stanford-Binet: the widely used American revision (by Terman at Stanford) of Binet’s original intelligence test

Intelligence Quotient (IQ): defined originally as the ratio of mental age to chronological age multiplied by 100

does not work well w/ adults
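
A minimal sketch of the original ratio formula, IQ = (mental age ÷ chronological age) × 100. The function name `ratio_iq` and the ages used are hypothetical, chosen only to illustrate the arithmetic:

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Original ratio IQ: (mental age / chronological age) * 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100


# Hypothetical children, ages in years
print(ratio_iq(10, 8))  # 125.0 (performs above age level)
print(ratio_iq(8, 8))   # 100.0 (performs at age level)
print(ratio_iq(6, 8))   # 75.0  (performs below age level)

# Hypothetical adult: a 40-year-old who performs like a typical 20-year-old
print(ratio_iq(20, 40))  # 50.0
```

The last line suggests why the ratio does not work well w/ adults: chronological age keeps rising while mental age levels off, so the ratio falls even when ability has not changed.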

David Wechsler

Wechsler Adult Intelligence Scale (WAIS): contains 15 verbal and performance (nonverbal) subtests; yields scores for verbal comprehension, perceptual organization, working memory, & processing speed

WISC (Wechsler Intelligence Scale for Children)
reliable intelligence test

Test Construction

A good test is:

standardized
valid (predictive, content, construct, or face)
reliable

→ A test that is valid must be reliable

→ A test can be reliable but not valid

Standardization: defining uniform testing procedures and meaningful scores by comparison w/ the performance of a pretested group

→ 2 steps:

Uniform procedures:
- same instructions
- same time limits
- same test questions
Test norms
- provide info about where a score ranks in relation to other scores
- how? pretest a rep sample
- norms need to be updated sometimes
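
One way to picture test norms is as a percentile rank against the pretested sample. The sketch below is only an illustration: the function name `percentile_rank` and the sample scores are hypothetical, and real norm groups are far larger and chosen to be representative.

```python
def percentile_rank(score, norm_sample):
    """Percent of scores in the pretested norm sample that fall below `score`."""
    below = sum(1 for s in norm_sample if s < score)
    return 100 * below / len(norm_sample)


# Hypothetical scores from a pretested representative sample
norm_sample = [85, 90, 95, 100, 100, 105, 110, 115, 120, 130]

print(percentile_rank(110, norm_sample))  # 60.0
```

A score of 110 is higher than 6 of the 10 sample scores, so it sits at the 60th percentile of this sample. If the norm sample goes out of date, the same raw score would land at a different rank, which is why norms need updating.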

Validity: the extent to which a test measures/predicts what it claims to measure

accuracy

⬇

→ Types of Validity:

Construct Validity: used to ensure the test is actually measuring what it is intended to measure and not extraneous factors
- Construct in research: a concept/abstract idea that cannot be directly observed but is inferred from observable behaviors (e.g., happiness, motivation)
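- Ex: a survey measuring “social anxiety” has construct validity if its questions abt fear of social situations capture social anxiety itself rather than an extraneous factor (e.g., general shyness)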
Content Validity: how well a measure reflects the entire range of material it is supposed to be testing
- Ex: a personality test measuring the Big Five Traits — if all 5 traits are included, the test has high content validity for assessing broad personality dimensions
Face Validity: whether a test appears to measure what it’s supposed to measure — concerned w/ whether a measure seems relevant and appropriate for what it’s assessing only on the surface

→ Criterion-Related Validity:

⭐️ Predictive Validity: the success w/ which a test predicts the behavior it’s designed to predict
- When the criterion measures are obtained at a time after the test
- Ex: does the AP exam predict how well an AP student would do in an introductory college course in Psychology?
Concurrent Validity: a method that involves comparing a new test w/ an already existing test, or an already established criterion
- When criterion measures are obtained at the same time as test scores, indicating the ability of test scores to estimate an individual’s current state
- Ex: aptitude test for pilots — do pilots’ scores on test correlate w/ performance ratings of their supervisors?

Reliability: the degree to which a test produces consistent scores, e.g., over time or across the two halves of the test

measured using correlation
consistency — a test is reliable if it yields the same results over time

⬇

→ Methods for Measuring Reliability

Split-half testing: test split into 2 parts and an individual’s scores on both halves compared
- agreement of odd-numbered question scores and even-numbered question scores
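Test-retest: the same test is given to the same people on two occasions and the two sets of scores are compared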
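
Since reliability is measured using correlation, the split-half method above can be sketched as a correlation between odd-item and even-item scores. This is a rough illustration: the helper names (`pearson_r`, `split_half_reliability`) and the response data are hypothetical, and the Pearson correlation is written out by hand so the block needs nothing beyond the standard library.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores.

    Assumes both lists have some variation (otherwise the denominator is 0).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def split_half_reliability(responses):
    """Correlate each person's odd-numbered-item score w/ their even-numbered-item score."""
    odd_scores = [sum(person[0::2]) for person in responses]   # questions 1, 3, 5, ...
    even_scores = [sum(person[1::2]) for person in responses]  # questions 2, 4, 6, ...
    return pearson_r(odd_scores, even_scores)


# Hypothetical data: 5 test-takers, 6 questions each (1 = correct, 0 = incorrect)
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

r = split_half_reliability(responses)
print(round(r, 2))  # closer to +1 means the two halves agree more
```

The same `pearson_r` helper could be applied to test-retest data by passing each person's first-session score and second-session score instead of the two half scores.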