AP PSYCH 5.10 Psychometric Principles and Intelligence Testing
Real intelligence tests need to have at least three traits
Standardization, or what an individual’s score is compared against
Reliability, the stability of the score over time
Validity, the ability of a test to measure what is intended
Standardization starts with giving many sample pretests to many people before the test is administered officially
From those pretests, we can see the number of answers people get right
A pattern will hopefully start to develop
Few people get most questions right
Most people get some questions right
Few people get most questions wrong
Tests can be designed with different intentions in mind so this may look a little different
A test designed to be extremely hard will expect most people to do poorly
A test designed to be very easy will want most people to get many answers right
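One way to picture standardization is to treat the pretest results as a norm group and ask where a single score falls within it. The sketch below is only an illustration: the `pretest_scores` data are made up, and `percentile_rank` is a hypothetical helper, not part of any official scoring procedure.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(norm_scores, raw_score):
    """Percent of the norm group scoring below raw_score (ties count as half)."""
    ordered = sorted(norm_scores)
    below = bisect_left(ordered, raw_score)            # scores strictly lower
    ties = bisect_right(ordered, raw_score) - below    # scores exactly equal
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Made-up pretest results: number of correct answers out of 10.
# Few people are at either extreme; most are in the middle.
pretest_scores = [2, 3, 4, 4, 5, 5, 5, 5, 6, 6, 7, 8]

print(percentile_rank(pretest_scores, 5))  # a middle-of-the-pack score
print(percentile_rank(pretest_scores, 8))  # the top score in this sample
```

The raw score alone says little; the comparison against the pretested group is what gives it meaning.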
If you take an intelligence test, then take a different version of the same test, you should get a similar score
If the same person gets wildly different scores on a test meant to be the same difficulty, it is not reliable
A good test must correlate with another version of the test to be reliable
The split-half method checks whether a test correlates with itself
If test takers do much better on one half of a test than on the other half, the test is not correlated with itself
The higher the self-correlation, the higher the reliability
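The split-half idea can be sketched in a few lines, assuming each test taker's answers are recorded as 1 (right) or 0 (wrong). The odd/even split, the `responses` data, and the helper names `pearson_r` and `split_half_r` are assumptions made for illustration; real test publishers may split and score items differently.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def split_half_r(item_matrix):
    """Correlate each person's total on odd-numbered items with their total on even-numbered items."""
    odd_totals = [sum(row[0::2]) for row in item_matrix]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_matrix]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)


# Made-up answers: one row per test taker, one column per question.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
]

print(round(split_half_r(responses), 2))
```

A value near 1 means the two halves rank people the same way; a value near 0 means the halves disagree.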
Validity is the most important issue in the formation of a test
Just because a test is reliable does not mean that it is valid
A broken scale might be reliable (it gives the same weight every time), but it is not valid
Validity means the test measures what it is intended to measure
Content validity, the extent to which a test samples the behavior that is of interest
This is what most people picture when they think of validity
Construct validity, similar to operationalization
It describes how well an abstract idea is translated into something measurable
Criterion validity, how well test results correlate with an outside measure
If a test claims someone is a genius, but they can’t tell left from right, there might not be criterion validity
Predictive validity, the success with which a test predicts the behavior it is designed to predict
Assessed by computing the correlation between test scores and the criterion behavior
The SAT has had many problems with predictive validity
For a long time, it was a poor measure of whether you would actually do well in college
Some colleges have at times dropped or de-emphasized it because of concerns about its predictive validity
The SAT has undergone many edits to improve this metric
Remember that the normal curve is a bell-shaped curve, an ideal distribution of scores
IQ scores are scaled to follow a normal curve
Few people get most questions right
Most people get some questions right
Few people get most questions wrong
This pattern is standard
Not all scores are average; some deviate from the mean
100 is the average score
A score of 85 or 115 is one standard deviation away from the average
A standard deviation is 15 points
About 68% of people fall within one standard deviation of 100 (85-115)
About 95% of people fall within two standard deviations (70-130)
Beyond two deviations is unusual
An IQ score below about 70 is one criterion for an intellectual disability
An IQ score above about 130 is often considered gifted
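The 68% and 95% figures can be checked against an idealized normal curve with a mean of 100 and a standard deviation of 15. This is a sketch using Python's standard library; `share_between` and `z_score` are hypothetical helper names, and real IQ data only approximate this ideal curve.

```python
from statistics import NormalDist

# Idealized IQ distribution: mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)


def share_between(low, high):
    """Fraction of the population expected to score between low and high."""
    return iq.cdf(high) - iq.cdf(low)


def z_score(score):
    """Number of standard deviations a score sits above or below the mean."""
    return (score - iq.mean) / iq.stdev


print(round(share_between(85, 115), 3))  # 0.683, within one standard deviation
print(round(share_between(70, 130), 3))  # 0.954, within two standard deviations
print(z_score(130))                      # 2.0
```

Under this ideal curve, only a little over 2% of people are expected to score above 130, and about the same share below 70.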