Study Notes on Intelligence Chapter 9 (Part 2)
Standardization of IQ Tests
Definition of Standardization
Standardization is defined as the process of establishing rules for both administering a test and interpreting the scores.
Determination of Norms
A crucial step in standardizing a test is the determination of the norms.
Norms are descriptions of how frequently various scores occur within a population.
Representative Population
Standardization must be grounded in a large and representative population, closely mirroring the population that will ultimately be tested.
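To make the idea of norms concrete, here is a minimal sketch that builds a frequency table from a norming sample and looks up the percentile rank of one raw score. The sample values and the function names are hypothetical, chosen only for illustration; a real norming sample would be far larger.

```python
from collections import Counter

# Hypothetical raw scores from a very small norming sample (illustration only).
norming_sample = [12, 15, 15, 17, 18, 18, 18, 20, 21, 21, 23, 26]

def frequency_table(scores):
    """Map each raw score to the number of people who obtained it."""
    return dict(sorted(Counter(scores).items()))

def sample_percentile_rank(raw_score, scores):
    """Percentage of the norming sample scoring at or below raw_score."""
    at_or_below = sum(1 for s in scores if s <= raw_score)
    return 100.0 * at_or_below / len(scores)

norms = frequency_table(norming_sample)
rank_of_18 = sample_percentile_rank(18, norming_sample)
```

Here 7 of the 12 people scored 18 or lower, so a raw score of 18 sits at roughly the 58th percentile of this sample. If the sample does not resemble the people who will later take the test, that percentile is misleading, which is why representativeness matters.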
Distribution of Scores
Scores on an IQ test typically form a bell-shaped curve, known as a normal distribution.
For the Wechsler IQ test:
Mean score: 100
Standard deviation: 15 (about 68% of scores fall within 15 points of the mean, that is, between 85 and 115)
For the Stanford-Binet test:
Mean score: 100
Standard deviation: 16 in older editions (a slightly wider spread); the fifth edition uses 15.
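The effect of the two standard deviations can be sketched with Python's standard library. This assumes scores follow an exact normal distribution with the means and standard deviations listed above (15 and 16); the function name is hypothetical.

```python
from statistics import NormalDist

def percentile_rank(score, mean=100.0, sd=15.0):
    """Percentage of the norm population expected to score at or below score."""
    return 100.0 * NormalDist(mu=mean, sigma=sd).cdf(score)

# The same score of 130 on scales with different standard deviations.
rank_sd15 = percentile_rank(130, sd=15)  # 2 SDs above the mean
rank_sd16 = percentile_rank(130, sd=16)  # 1.875 SDs above the mean

# Share of scores expected within one SD of the mean (85 to 115 when SD = 15).
within_one_sd = percentile_rank(115) - percentile_rank(85)
```

A score of 130 lands near the 98th percentile when the standard deviation is 15 and near the 97th when it is 16, so the same number means slightly different things on the two scales.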
Restandardization and the Flynn Effect
Need for Periodic Recalculation
Every test requires periodic recalculation of its norms and revision of its items to stay relevant and accurate.
The Flynn Effect
Periodic restandardization of IQ tests has revealed the Flynn Effect: average performance on IQ tests has risen steadily from one generation to the next, so norms must be reset to keep the mean at 100.
The Flynn Effect has been observed across the globe and within all ethnic and racial groups.
Possible Explanations for the Flynn Effect
Several hypotheses have been proposed to explain the rise in IQ scores associated with the Flynn Effect:
Improved test-taking skills among the population
Better educational access and quality
Enhanced health and nutrition
Increased opportunities for visual-spatial stimulation (such as through TV and video games)
Heterosis (the increased vigor and survival of heterozygous individuals).
It is crucial to understand that the rise in IQ scores does not definitively indicate an increase in actual intelligence levels.
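The practical consequence for scoring can be sketched as follows. The drift rate of 0.3 points per year (about 3 points per decade) is an assumption, a commonly cited approximation rather than a figure from these notes, and the function name is hypothetical.

```python
# Assumption (not from these notes): average scores drift upward by roughly
# 0.3 IQ points per year, i.e. about 3 points per decade.
ASSUMED_DRIFT_PER_YEAR = 0.3

def score_on_outdated_norms(score_on_current_norms, years_since_norming,
                            drift_per_year=ASSUMED_DRIFT_PER_YEAR):
    """Estimate the score the same performance would earn against old norms."""
    return score_on_current_norms + drift_per_year * years_since_norming

# An average performer today, scored against norms set 20 years ago.
inflated = score_on_outdated_norms(100, years_since_norming=20)
```

Under this assumption, a person who is exactly average today would score about 106 on norms that are 20 years old. The inflation comes from the outdated comparison group, not from any change in the person.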
Reliability of Tests
Definition of Test Reliability
A test’s reliability is defined as its freedom from random error and is commonly assessed based on the repeatability of its scores.
Correlation Coefficient
Psychologists quantify a test’s reliability with a correlation coefficient; values closer to 1.0 indicate more consistent scores.
Methods for Assessing Reliability
Common methods to assess reliability include:
Test-retest: Measuring consistency of test scores over time.
Parallel forms: Comparing scores from different versions of the same test.
Split-halves: Dividing the test into two halves and correlating the scores from each half.
Coefficient alpha: A measure of reliability based on the correlations between items in a test.
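The methods above can be sketched in a few lines. All data here are hypothetical, and the helper names are made up for illustration. The split-half estimate applies the Spearman-Brown correction, a standard adjustment for the fact that each half is only half as long as the full test.

```python
from statistics import mean, pvariance

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half_reliability(item_scores):
    """Correlate odd-item and even-item totals, then apply Spearman-Brown."""
    odd = [sum(person[0::2]) for person in item_scores]
    even = [sum(person[1::2]) for person in item_scores]
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)

def coefficient_alpha(item_scores):
    """Coefficient alpha from item variances and total-score variance."""
    k = len(item_scores[0])
    item_vars = [pvariance([p[i] for p in item_scores]) for i in range(k)]
    total_var = pvariance([sum(p) for p in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Test-retest: the same five people tested twice (hypothetical scores).
# Parallel forms uses the same calculation on scores from form A and form B.
first_testing = [98, 105, 112, 120, 91]
second_testing = [101, 103, 115, 118, 95]
test_retest_r = pearson_r(first_testing, second_testing)

# Item-level data: one row per person, one column per item (1 = correct).
items = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
split_half = split_half_reliability(items)
alpha = coefficient_alpha(items)
```

With these made-up data the split-half estimate works out to 0.88 and coefficient alpha to 0.80. The two differ because split-half depends on which items fall into each half, while alpha draws on every item at once.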
Validity of Tests
Definition of Test Validity
Validity refers to the extent to which a test measures what it claims to measure, thus determining its suitability for a particular purpose.
Relationship between Reliability and Validity
Reliability sets the upper limit of validity, meaning that a test cannot be valid unless it is reliable.
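In classical test theory this ceiling is commonly written as the inequality below. The symbols are not from these notes and are introduced here as assumptions: r_xy is the validity coefficient (the correlation between test x and criterion y), and r_xx and r_yy are the reliabilities of the test and of the criterion.

```latex
r_{xy} \le \sqrt{r_{xx}\, r_{yy}}
```

For example, if a test's reliability is 0.81 and the criterion is measured without error (r_yy = 1), the validity coefficient cannot exceed 0.90. The inequality sets only a ceiling: a highly reliable test can still have low validity if it consistently measures the wrong thing.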
Types of Validity Evidence
Various types of validity evidence include:
Content validity: The extent to which a test covers the representative content of the subject it aims to measure.
Construct validity: The degree to which a test accurately measures a theoretical construct or trait.
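Predictive validity: The extent to which a test predicts future performance or outcomes. A common problem is range restriction: when outcomes are observed only for a narrow band of scorers (for example, only applicants who were admitted), the observed correlation understates the test's predictive power.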
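Range restriction can be illustrated with a small simulation. Everything here is an assumption made for the sketch: the simulated population, the strength of the relationship between test score and outcome, the cutoff of 115, and the helper name.

```python
import random
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: the outcome depends partly on the test score
# and partly on unrelated noise.
test_scores = [rng.gauss(100, 15) for _ in range(5000)]
outcomes = [0.5 * (s - 100) / 15 + rng.gauss(0, 1) for s in test_scores]

# Correlation across the full range of scorers.
full_r = pearson_r(test_scores, outcomes)

# Correlation when outcomes are observed only for high scorers (115 or above),
# as happens when only selected applicants are followed up.
selected = [(s, o) for s, o in zip(test_scores, outcomes) if s >= 115]
restricted_r = pearson_r([s for s, _ in selected], [o for _, o in selected])
```

In this setup the correlation among selected high scorers comes out noticeably lower than the correlation in the full population, even though the underlying relationship is identical. A validity study run only on admitted or hired people can therefore make a useful test look weak.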
A test can be both reliable and valid and still lack practical utility.
Group Differences and Test Bias
Differences observed between groups do not necessarily indicate the presence of test bias.