Study Notes on Intelligence Chapter 9 (Part 2)

Definition of Standardization
- Standardization is defined as the process of establishing rules for both administering a test and interpreting the scores.
Determination of Norms
- A crucial step in standardizing a test is the determination of the norms.
- Norms are descriptions of how frequently various scores occur within a population.
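One way to make "how frequently various scores occur" concrete is a percentile rank computed against a norming sample. The sketch below is illustrative only: the function name `percentile_rank` and the ten raw scores are made up for the example, and real norm tables are built from far larger samples.

```python
from bisect import bisect_right

def percentile_rank(norm_scores, raw_score):
    """Percent of the norming sample scoring at or below raw_score."""
    ordered = sorted(norm_scores)
    at_or_below = bisect_right(ordered, raw_score)  # count of scores <= raw_score
    return 100.0 * at_or_below / len(ordered)

# Hypothetical raw scores from a tiny norming sample (not real data).
norm_sample = [12, 15, 15, 18, 20, 21, 21, 23, 25, 30]

print(percentile_rank(norm_sample, 21))  # 70.0
```

A raw score of 21 is matched or exceeded downward by 7 of the 10 people in this sample, so it sits at the 70th percentile of these hypothetical norms.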
Representative Population
- Standardization must be grounded in a large and representative population, closely mirroring the population that will ultimately be tested.
Distribution of Scores
- Scores on an IQ test typically form a bell-shaped curve, known as a normal distribution.
- For the Wechsler IQ test:
  - Mean score: 100
  - Standard deviation: 15 (about 68% of scores fall within one standard deviation of the mean, that is, between 85 and 115)
- For the Stanford-Binet test:
  - Mean score: 100
  - Standard deviation: 16 (resulting in a slightly wider spread).
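The effect of the two standard deviations can be checked with a short sketch, assuming scores follow an ideal normal curve with the means and standard deviations listed above. The helper names `percentile` and `share_between` are made up for this example.

```python
from statistics import NormalDist

# Idealized score distributions, using the means and SDs given in the notes.
wechsler = NormalDist(mu=100, sigma=15)
stanford_binet = NormalDist(mu=100, sigma=16)

def percentile(dist, score):
    """Percent of the population expected to score at or below `score`."""
    return 100 * dist.cdf(score)

def share_between(dist, low, high):
    """Proportion of the population expected to score between low and high."""
    return dist.cdf(high) - dist.cdf(low)

# Within one standard deviation of the mean on the Wechsler scale.
print(round(share_between(wechsler, 85, 115), 3))   # 0.683

# The same score of 130 is slightly less extreme when the SD is 16.
print(round(percentile(wechsler, 130), 1))          # 97.7
print(round(percentile(stanford_binet, 130), 1))    # 97.0
```

Because the spread is wider with a standard deviation of 16, a score of 130 lies 1.875 standard deviations above the mean rather than 2, so it corresponds to a slightly lower percentile.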

Need for Periodic Recalculation
- Every test needs its norms recalculated periodically and its items revised so that it stays relevant and accurate.
The Flynn Effect
- Periodic restandardization of IQ tests has revealed the Flynn Effect: the tendency for average performance on IQ tests to rise from one generation to the next.
- The Flynn Effect has been observed across the globe and within all ethnic and racial groups.
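The link between rising scores and the need for new norms can be shown with simple arithmetic. The sketch below assumes a gain of about 3 IQ points per decade, a commonly cited rough average that is not stated in these notes and that varies by country and era. The function name and the constant are hypothetical.

```python
# Assumed average gain; a rough, commonly cited figure, not a fixed constant.
POINTS_PER_DECADE = 3.0

def score_on_current_norms(score_on_old_norms, years_since_norming,
                           rate=POINTS_PER_DECADE):
    """Estimate what a score earned on outdated norms would be on current norms."""
    drift = rate * years_since_norming / 10  # points the population mean has risen
    return score_on_old_norms - drift

# A score of 100 on norms that are 20 years old.
print(score_on_current_norms(100, 20))  # 94.0
```

Under that assumption, norms left unrevised for 20 years would overstate a person's standing by about 6 points, which is why the mean has to be reset to 100 at each restandardization.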
Possible Explanations for the Flynn Effect
- Several hypotheses have been proposed to explain the rise in IQ scores associated with the Flynn Effect:
  - Improved test-taking skills among the population
  - Better educational access and quality
  - Enhanced health and nutrition
  - Increased opportunities for visual-spatial stimulation (such as through TV and video games)
  - Heterosis (the increased vigor and survival of heterozygous individuals)
- A rise in IQ scores does not necessarily mean that intelligence itself has increased.

Definition of Test Reliability
- A test’s reliability is defined as its freedom from random error and is commonly assessed based on the repeatability of its scores.
Correlation Coefficient
- Psychologists use the correlation coefficient to estimate a test’s reliability quantitatively.
Methods for Assessing Reliability
- Common methods to assess reliability include:
  - Test-retest: Measuring consistency of test scores over time.
  - Parallel forms: Comparing scores from different versions of the same test.
  - Split-halves: Dividing the test into two halves and correlating the scores from each half.
  - Coefficient alpha: A measure of reliability based on the correlations between items in a test.
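Two of the methods above, split-halves and coefficient alpha, can be sketched in a few lines. Everything here is illustrative: the response matrix is invented, the odd/even split is one of several possible ways to halve a test, and the Spearman-Brown step (a standard correction for the shortened length of each half) is not mentioned in the notes.

```python
from math import sqrt
from statistics import mean, pvariance

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ssx * ssy)

# Hypothetical data: rows are test takers, columns are items (1 = correct).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

def split_half_reliability(rows):
    """Correlate odd-item and even-item half scores, then apply Spearman-Brown."""
    odd_half = [sum(r[0::2]) for r in rows]
    even_half = [sum(r[1::2]) for r in rows]
    r_half = pearson_r(odd_half, even_half)
    return 2 * r_half / (1 + r_half)  # estimate for the full-length test

def coefficient_alpha(rows):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(rows[0])
    item_vars = [pvariance(col) for col in zip(*rows)]
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

print(round(split_half_reliability(responses), 2))  # 0.79
print(round(coefficient_alpha(responses), 2))       # 0.84
```

Test-retest and parallel-forms reliability would use the same `pearson_r` helper, applied to two administrations or two versions instead of two halves.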

Definition of Test Validity
- Validity refers to the extent to which a test measures what it claims to measure, thus determining its suitability for a particular purpose.
Relationship between Reliability and Validity
- Reliability sets the upper limit of validity, meaning that a test cannot be valid unless it is reliable.
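In classical test theory this upper limit is usually written as an inequality. The symbols are introduced here for illustration and do not appear in the notes: r_xy is the validity coefficient (the correlation between test and criterion), r_xx is the reliability of the test, and r_yy is the reliability of the criterion.

```latex
r_{xy} \;\le\; \sqrt{r_{xx}\, r_{yy}} \;\le\; \sqrt{r_{xx}}
```

For example, a test with a reliability of 0.64 cannot correlate more than 0.80 with any criterion, however well designed it is otherwise.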
Types of Validity Evidence
- Various types of validity evidence include:
  - Content validity: The extent to which a test covers the representative content of the subject it aims to measure.
  - Construct validity: The degree to which a test accurately measures a theoretical construct or trait.
  - Predictive validity: The extent to which a test can predict future performance or outcomes. A common problem is range restriction: when the people studied cover only a narrow range of scores (for example, only applicants who were admitted), the observed correlation understates the test's predictive power.
- Tests may possess both reliability and validity but could still lack practical utility.
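The range restriction problem noted under predictive validity can be illustrated with a small simulation. This is a sketch under made-up assumptions: the outcome is modeled as the test score plus random noise, the cutoff of 115 is arbitrary, and none of the numbers come from real data.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ssx * ssy)

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: later outcome = test score + unrelated noise.
test_scores = [random.gauss(100, 15) for _ in range(5000)]
outcomes = [s + random.gauss(0, 15) for s in test_scores]

# Correlation across the full range of test scores.
full_r = pearson_r(test_scores, outcomes)

# Correlation among only those who scored 115 or higher (e.g., those admitted).
selected = [(s, o) for s, o in zip(test_scores, outcomes) if s >= 115]
restricted_r = pearson_r([s for s, _ in selected], [o for _, o in selected])

print(f"full range r:  {full_r:.2f}")
print(f"restricted r:  {restricted_r:.2f}")
```

The test predicts the outcome equally well for everyone in this simulation, yet the correlation in the selected group is clearly lower, simply because the scores in that group vary less.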
Group Differences and Test Bias
- Differences observed between groups do not necessarily indicate the presence of test bias.