Intelligence Testing: History, Formulas, and Distribution

Historical Roots and Early Theories of Intelligence Testing

The measurement and comparison of human intelligence have roots in the late 19th century, spearheaded by figures such as Francis Galton and Wilhelm Wundt. Their initial approaches were grounded in the study of sensory and perceptual processes. According to their theories, reaction time was viewed as a primary indicator of mental efficiency, while the products of perception and thought were seen as indicators of effectiveness. By the early 20th century, the focus shifted toward "higher mental functions" through the work of Alfred Binet and Theodore Simon. This period introduced a developmental perspective on intelligence, exemplified by Jean Piaget's observations during the first half of the 20th century. Piaget noted that as children age, they are capable of performing increasingly complex tasks, which reflects an increase in cognitive sophistication. Intelligence testing was thus driven by a desire to define intelligence through empirical measurement and to understand why societies are invested in comparing individuals' mental capacities.

The Binet-Simon Scales and the Calculation of Mental Age

In 1905, Alfred Binet and Theodore Simon developed the Binet-Simon Scale with the specific objective of identifying children who faced learning difficulties. Unlike some later interpretations, Binet and Simon did not believe intelligence was fixed; rather, they viewed it as a trait with a constant rate of development. Their testing methodology focused on age-related abilities and problem-solving. By assigning specific test items to certain ages, they could determine a child's Mental Age (MA). This Mental Age was then compared to the child's Chronological Age (CA). A central challenge with this developmental model was that Mental Age typically stops increasing in adolescence, making the comparison to chronological age problematic as an individual enters adulthood. Despite this, the Binet-Simon Scale established the foundational concept of measuring developmental progress through age-normed tasks.

The Stanford-Binet IQ Test and the Quotient Formula

At Stanford University in the United States, Lewis Terman translated and adapted the Binet-Simon Scale to create what is known as the Stanford-Binet IQ Test. This version incorporated both verbal and non-verbal subtests that assessed five specific areas: knowledge, quantitative reasoning, visual-spatial processing, working memory, and fluid reasoning. Terman scored the test as an Intelligence Quotient (IQ), a ratio first proposed by William Stern, using the formula $\text{IQ} = \frac{\text{MA}}{\text{CA}} \times 100$. For instance, if an eleven-year-old child had a mental age of eleven, the calculation would be $\frac{11}{11} \times 100 = 100$. Under this system, the average IQ is set at $100$. However, this model faced criticism because there is no proportional relationship between scores across different ages, and because the formula breaks down in adulthood, when the denominator (CA) continues to rise while the numerator (MA) plateaus. Furthermore, the history of the Stanford-Binet is tied to controversial applications, including its use in eugenics and racial profiling.
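The ratio formula, and the way it breaks down once mental age plateaus, can be sketched in a few lines of Python. The function name `ratio_iq` and the adult example (a mental age fixed at 16) are illustrative assumptions, not part of any published scoring manual.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100


# The worked example from the text: MA 11, CA 11.
print(ratio_iq(11, 11))  # 100.0

# Hypothetical adult whose mental age has plateaued at 16:
# the ratio falls as chronological age keeps rising.
for age in (16, 32, 48):
    print(age, round(ratio_iq(16, age), 1))  # 100.0, then 50.0, then 33.3
```

The falling values in the loop are purely an artifact of the arithmetic: nothing about the person's measured ability has changed, only the denominator.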

David Wechsler and the Development of Deviation IQ

David Wechsler addressed the inherent flaws in the ratio-based IQ score by developing the Deviation IQ. Wechsler recognized that measured ability does not continue to increase after the age of approximately $16$ years. Under the old Stanford-Binet formula, if an individual's Mental Age stayed constant while their Chronological Age increased, their IQ score would appear to decline mathematically. Wechsler also argued that the Stanford-Binet ignored the critical roles of learning and environmental factors. His solution was to test large, representative samples of different age groups to determine the average performance (norms) for each specific group. Instead of being computed as a ratio, an individual's score is compared against the norms for their group to determine how far it deviates from the average. This approach led to the creation of the most widely used intelligence assessments in modern psychology, including the Wechsler Adult Intelligence Scale (WAIS-IV), the Wechsler Intelligence Scale for Children (WISC-V), and the Wechsler Preschool and Primary Scale of Intelligence (WPPSI).
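A minimal sketch of deviation scoring follows, assuming the conventional scale with a mean of 100 and a standard deviation of 15. The function `deviation_iq` and the tiny `NORM_SAMPLE` of raw scores are hypothetical; real norm groups contain hundreds or thousands of people, and published tests use more elaborate conversion tables.

```python
from statistics import mean, stdev

# Hypothetical raw scores from one age group's norm sample (toy data).
NORM_SAMPLE = [38, 42, 45, 50, 50, 55, 58, 62]


def deviation_iq(raw_score, norm_scores, iq_mean=100.0, iq_sd=15.0):
    """Place a raw score on the IQ scale relative to its norm group."""
    group_mean = mean(norm_scores)
    group_sd = stdev(norm_scores)  # sample standard deviation
    z = (raw_score - group_mean) / group_sd  # distance from the mean in SD units
    return iq_mean + iq_sd * z


# A raw score equal to the group mean maps to exactly 100,
# regardless of the person's chronological age.
print(deviation_iq(50, NORM_SAMPLE))  # 100.0
print(round(deviation_iq(62, NORM_SAMPLE), 1))
```

Because the score depends only on distance from the same-age group's mean, an adult whose raw performance stays level no longer appears to decline as they get older.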

The Normal Distribution and Statistical Interpretation of IQ

In modern intelligence testing, scores are interpreted using a normal distribution, often referred to as a bell curve. This system accounts for the fact that different demographic groups may have different average scores based on variables such as age, sex, home language, socio-economic status, and the level or quality of schooling. To standardize comparison, the group average is reset to $100$ with a standard deviation (SD) of $15$. Consequently, an IQ score is not simply a raw count of correct answers on a test, but a measure of where an individual sits in relation to their group average. Statistically, the distribution of scores follows specific thresholds: about $68.3\%$ of the population scores between $85$ and $115$ (within one SD), $95.4\%$ scores between $70$ and $130$ (within two SDs), and $99.7\%$ scores between $55$ and $145$ (within three SDs).
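The band percentages can be checked directly against an idealized normal curve with mean 100 and SD 15. This sketch uses `NormalDist` from Python's standard `statistics` module; the helper name `share_between` is an assumption for illustration, and real score distributions only approximate this curve.

```python
from statistics import NormalDist

# Idealized IQ distribution: mean 100, standard deviation 15.
IQ_CURVE = NormalDist(mu=100, sigma=15)


def share_between(low, high):
    """Fraction of the idealized population scoring between low and high."""
    return IQ_CURVE.cdf(high) - IQ_CURVE.cdf(low)


# Bands of one, two, and three standard deviations around the mean.
for k in (1, 2, 3):
    low, high = 100 - 15 * k, 100 + 15 * k
    print(f"{low}-{high}: {share_between(low, high):.1%}")
```

Rounded to one decimal place, the shares come out at about 68.3%, 95.4%, and 99.7%. The one-SD figure is often quoted as 68.2%, which results from summing two half-bands each rounded to 34.1%.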

Percentiles and Score Distribution Ranges

The relationship between IQ scores and percentiles provides a clear picture of an individual's relative standing. A score of $100$ represents the $50\text{th}$ percentile, meaning the individual performed better than $50\%$ of the population. A score of $115$ corresponds to the $84\text{th}$ percentile, while a score of $130$ places an individual in the $98\text{th}$ percentile. At the extreme high end, an IQ of $145$ sits at roughly the $99.9\text{th}$ percentile. Conversely, a score of $85$ is at the $16\text{th}$ percentile, a score of $70$ is at the $2\text{nd}$ percentile, and a score of $55$ falls at roughly the $0.1$ percentile, meaning only about one person in a thousand scores lower. This statistical framework ensures that intelligence is measured as a relative rank within a specific population rather than an absolute, unchanging quantity.
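The score-to-percentile mapping above follows from the cumulative distribution of a normal curve with mean 100 and SD 15. The sketch below assumes that idealized curve; `iq_to_percentile` and `percentile_to_iq` are hypothetical helper names, and published tests rely on their own norm tables rather than this formula.

```python
from statistics import NormalDist

# Idealized IQ distribution: mean 100, standard deviation 15.
IQ_CURVE = NormalDist(mu=100, sigma=15)


def iq_to_percentile(iq):
    """Percentage of the idealized population scoring below the given IQ."""
    return IQ_CURVE.cdf(iq) * 100


def percentile_to_iq(percentile):
    """IQ score at a given percentile (strictly between 0 and 100)."""
    return IQ_CURVE.inv_cdf(percentile / 100)


# Reproduce the table of scores quoted in the text.
for score in (55, 70, 85, 100, 115, 130, 145):
    print(score, round(iq_to_percentile(score), 1))
```

The unrounded values (for example, about 97.7 for a score of 130 and about 2.3 for a score of 70) show that the whole-number percentiles quoted in the text are rounded.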