Intelligence Testing Notes

Lecture Overview

  • Prevailing models:
    • Cattell-Horn-Carroll (CHC) theory
    • Multiple Intelligences
    • Triarchic theory
    • PASS model
  • Cultural bias
  • High stakes decisions
  • Ethical issues
  • History of IQ testing, development, reliability, measurement.
  • Clinical and beneficial uses of IQ tests.
  • Definition of intelligence.
  • Risks of IQ tests.

What is Intelligence?

  • David Wechsler (1944): "The aggregate or global capacity of the individual to act purposefully, to think rationally and deal effectively with his environment."
  • Robert Sternberg (1985): "Mental activity directed toward purposive adaptation to, and selection and shaping of, real-world environments relevant to one’s life."
  • Jagannath Prasad Das (1984): "The ability to plan and structure one’s behavior with an end in view."
  • John Wasserman (2018): "In spite of over a century of research, the study of intelligence remains controversial for its social applications and implications."

Early Concepts

  • "g" factor (Charles Spearman, 1904): Intelligence is a singular construct.
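  • Intelligence quotient (based on Alfred Binet’s 1905 concept of mental age; the ratio itself is usually credited to William Stern, 1912): a child’s mental age divided by their chronological age, multiplied by 100: $IQ = \frac{\text{mental age}}{\text{chronological age}} \times 100$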
  • Fluid intelligence (Raymond Cattell, 1963): Abstract reasoning on novel tasks.
  • Crystallized intelligence (Raymond Cattell, 1963): Learned procedures and knowledge.
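
As a quick illustration of the ratio IQ formula in the list above, here is a minimal Python sketch. The function name `ratio_iq` and the example ages are hypothetical, chosen only for illustration; modern tests use deviation-based scores rather than this ratio.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age divided by chronological age, multiplied by 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    # Multiply first so that simple whole-number ages give exact results.
    return 100 * mental_age / chronological_age


# Hypothetical example: a 10-year-old performing like a typical 12-year-old.
print(ratio_iq(12, 10))  # 120.0

# Hypothetical example: a 10-year-old performing like a typical 8-year-old.
print(ratio_iq(8, 10))  # 80.0
```

A child whose mental age equals their chronological age scores exactly 100 under this definition.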

Major Theories of Intelligence

  • Cattell-Horn-Carroll (CHC) Theory
  • Sternberg’s Triarchic Theory of Successful Intelligence
  • Gardner’s Multiple Intelligences
  • Cognitive Processing Theories (PASS model)

Cattell-Horn-Carroll (CHC) Theory

  • Incorporates Cattell and Horn’s theory of fluid and crystallized intelligence (without g).
  • Based on John Carroll’s (1993) "three-stratum theory" (with g).
  • Developed and confirmed through factor analysis.
  • Currently the most widely accepted theory of cognitive abilities.
  • Strengths:
    • Evidence-based/data-based.
    • Widespread application.
    • Useful in guiding assessment for specific learning disabilities or comprehensive cognitive ability assessments.
    • Ever-evolving.
  • Weaknesses:
    • Complicated and confusing for clinicians and clients.
    • Challenging to view some abilities as relevant to "intelligence."
    • Ever-evolving, making it difficult to know the current version.

Multiple Intelligences

  • Howard Gardner (1983) recognized that there are many ways people show “intelligences” beyond those valued by Western societies.
  • These areas were distinct from each other – not theoretically related.
  • Gardner’s multiple intelligences include:
    • Linguistic intelligence
    • Logical-mathematical intelligence
    • Musical intelligence
    • Visual-spatial intelligence
    • Bodily-kinaesthetic intelligence
    • Naturalist intelligence
    • Interpersonal intelligence
    • Intrapersonal intelligence
  • Strengths:
    • Recognizes a wide range of capabilities/talents, not just "book smarts."
    • Has informed different teaching approaches.
  • Weaknesses:
    • Are these "intelligences" better described as "talents" or "skills"?
    • Are these intelligences truly independent of each other?
    • Limited supporting research evidence (though difficult to measure).
    • Difficult to quantify performance.
    • Often conflated with the prevailing myth of "learning styles."
  • Sternberg (1991): "It is very difficult, if not impossible, to quantify performance [on these measures]; assessments take place over extremely long periods of time, and it is questionable whether anything approaching objective scoring is even possible."

Sternberg’s Triarchic Theory of Successful Intelligence

  • Sternberg believed schools focus too much on analytical and memory abilities and not enough on creative and practical abilities; analytical, creative, and practical abilities all need to function together for someone to use their intelligence successfully.
  • 3 dimensions of success:
    • Componential (internal processes): metacognition, planning/organizing, memory retrieval, knowledge acquisition.
    • Experiential: how well people connect their internal world to external reality – applying insights, synthesizing, dealing with novel problems, automatizing.
    • Contextual: how well people adapt to, select, and shape their environments; "street smarts."
  • Three areas associated with successful intelligence:
    • Analytical abilities: Useful in analyzing and evaluating options; problem-solving skills.
    • Creative abilities: Use of experience in ways that foster insight and new ideas.
    • Practical abilities: Use of tacit knowledge to adapt to changing contexts in everyday life.
  • Strengths:
    • Combines internal aspects of intelligence (e.g., problem-solving and reasoning skills) with external aspects (e.g., experience, practice).
    • Focus on real-world success.
    • Traits can more easily be measured.
    • Suggests that many real-life intelligent decisions are not measured by current standardized tests.
  • Weaknesses:
    • Limited information about how the componential, experiential and contextual dimensions relate to one another.
    • Are these dimensions distinct, or interrelated?
    • Is there a mixing of personality traits (e.g., confidence, sociability) with intelligence?

PASS Theory (Das, Naglieri & Kirby, 1994)

  • An alternative to the idea of g, emphasizing psychological processes.
  • PASS = Planning-Attention-Simultaneous-Successive Processing theory
    • Planning = cognitive control, knowledge, intentionality, self-regulation
    • Attention = focused cognitive activity
    • Simultaneous processing = perception of stimuli as a whole, including the ability to integrate words into a meaningful idea
    • Successive processing = making a decision based on stimuli in a sequence
  • These work together when doing intellectual tasks – some stronger than others, depending on the task
  • Strengths:
    • Based on neuropsychological theory about information processing.
    • Some IQ tests are based on this model (Naglieri’s Cognitive Assessment System – CAS2, and Kaufman’s KABC-II).
    • Factor-analytic studies of performance on the Cognitive Assessment System (CAS) show reasonable support for the PASS model (according to the authors).
  • Weaknesses:
    • Independent authors have claimed that the PASS model is not supported by factor analysis of CAS results.
    • The Planning and Attention factors are highly correlated (r = .99).
    • Are the components identified in the PASS model actually what is being measured in the CAS2?

Important Note on IQ Tests

  • Access to, and use of, standardized IQ tests is restricted to registered psychologists.
  • Only registered psychologists can purchase the tests, directly from the publisher. They agree to a contract of use, including keeping the test materials secure and confidential.
  • In addition to being protected by copyright law, the content of IQ tests is considered a trade secret.

History of IQ Tests

  • Stanford-Binet intelligence scale developed during the era of emerging interest in developmental theory and intelligence theory (early 1900s).
  • World War I (1914–1918) created a need to identify which soldiers were suited to different roles, hence the Army Mental Tests (Army Alpha and Beta) were created (developed from 1917; published 1920).
  • David Wechsler developed his first test (the Wechsler-Bellevue Intelligence Scale, 1939; the Wechsler Intelligence Scale for Children followed in 1949) by ‘cherry-picking’ the most statistically and clinically useful subtests from several existing tests; it included a Verbal IQ and a Performance IQ, and used a deviation-based IQ.
  • Use of IQ tests contributed to forced sterilization, institutionalization, racial segregation, dehumanization, and genocide in the decades before and after World War II.
  • We must always remember not to use IQ tests as a tool for discrimination and harm, and must speak up when they are used this way.

The Flynn Effect

  • Population average IQ scores gradually increase by around 0.33 IQ points per year.
  • First observed when the Raven’s Progressive Matrices were routinely given to all 18-year-old army recruits over a long period of time.
  • Some differences between countries.
  • The increase is stronger for fluid than for crystallized intelligence.
  • There is some evidence that the Flynn Effect could plateau.
  • Potential causes: education, familiarity with testing conditions generally, changes to family life (technology, smaller families, more learning opportunities).
  • Life expectancy, infant mortality, and height have also improved over the same period.
  • Has some unintended consequences (e.g., accuracy of diagnosis, criminal culpability).
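
To make the arithmetic behind the diagnostic-accuracy concern concrete, the sketch below estimates how far scores might drift when a test with old norms is used. It assumes a constant, linear rate of 0.33 points per year, which is a simplification (the notes above mention country differences and a possible plateau). The function name and the 20-year example are hypothetical.

```python
FLYNN_POINTS_PER_YEAR = 0.33  # approximate rate quoted in the notes; an assumption here


def expected_norm_inflation(years_since_norming: float,
                            rate: float = FLYNN_POINTS_PER_YEAR) -> float:
    """Rough estimate of IQ points by which outdated norms may inflate a score.

    Assumes a constant linear rate, which is a simplification.
    """
    if years_since_norming < 0:
        raise ValueError("years_since_norming must be non-negative")
    return rate * years_since_norming


# Hypothetical example: a test whose norms were collected 20 years ago.
inflation = expected_norm_inflation(20)
print(f"Estimated inflation: {inflation:.1f} IQ points")  # Estimated inflation: 6.6 IQ points
```

A shift of this size can matter when a score sits close to a diagnostic or legal cut-off.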

Measurement Terminology

  • Norm-referenced standardized tests
    • Constructed by professional test makers.
    • Normed on a representative sample from the population for which the test is intended.
    • Involve fixed (standardized) procedures for administration and scoring.

Test Norms: How are they Developed?

  • Test items with good psychometric properties are chosen so that, together, they produce a range of performance (i.e., easier versus more difficult items).
  • The finalized test is administered to an appropriately-sized, representative sample.
  • Individuals’ raw scores are converted into standardized scaled or composite scores.
  • Individual subtest scores usually have a mean of 10 and a standard deviation of 3.
  • Composite scores (e.g. IQ) usually have a mean of 100 with standard deviation of 15.
  • These scores reflect position above or below the mean.
  • Standardized scores are normally distributed.
  • We can compare how an individual is positioned in relation to the representative, standardized sample.
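
The two score metrics described above can be illustrated with a simple linear conversion from a z-score (distance from the mean in standard deviation units). This is only a sketch: published tests convert raw scores using age-based look-up tables from the publisher, not a formula like this. The function names and the raw-score mean and standard deviation in the example are hypothetical.

```python
def z_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Distance of a raw score from the norm-group mean, in SD units."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return (raw - norm_mean) / norm_sd


def to_scaled_score(z: float) -> float:
    """Subtest metric: mean 10, standard deviation 3."""
    return 10 + 3 * z


def to_composite_score(z: float) -> float:
    """Composite metric (e.g., IQ): mean 100, standard deviation 15."""
    return 100 + 15 * z


# Hypothetical norm group: raw-score mean of 30 and SD of 6 for this age band.
z = z_score(raw=36, norm_mean=30, norm_sd=6)
print(z)                      # 1.0
print(to_scaled_score(z))     # 13.0
print(to_composite_score(z))  # 115.0
```

The same relative position (one standard deviation above the mean) is expressed as 13 on the subtest metric and 115 on the composite metric.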

Percentile Ranks

  • Illustrate where an individual falls relative to the rest of the standardization sample.
  • Are derived from the score’s distance from the mean in standard deviation units.
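
As a rough sketch of how a percentile rank follows from a standard score, the example below assumes scores are normally distributed with a mean of 100 and a standard deviation of 15, and uses Python's standard-library `statistics.NormalDist`. The function name `percentile_rank` is hypothetical; real test manuals provide percentile tables directly.

```python
from statistics import NormalDist


def percentile_rank(score: float, mean: float = 100, sd: float = 15) -> float:
    """Percentage of the (assumed normal) norm group scoring at or below `score`."""
    return NormalDist(mu=mean, sigma=sd).cdf(score) * 100


print(round(percentile_rank(100), 1))  # 50.0
print(round(percentile_rank(115), 1))  # 84.1
print(round(percentile_rank(70), 1))   # 2.3
```

A score two standard deviations below the mean (70) therefore sits at roughly the 2nd percentile under this assumption.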

Standardized Testing

  • The results of a test are reliable and valid only to the extent that the test was administered according to standardized procedures.
  • What is standardized:
    • Environmental factors
    • Test instructions
    • Acceptable responses
    • Scoring procedures
    • Test procedures
    • Using norms to convert scores

Test Score, “True Score” and Error

  • Sources of error:
    • Within the test (items don’t perfectly tap into and consistently measure the construct, reliability and validity aren’t perfect)
    • The examiner (deviations from standardized administration and scoring, mistakes)
    • The test-taker (moments of distraction/inattention, poor sleep, feeling stressed about a significant life issue, feeling anxious about test-taking, forgot glasses)
    • The testing environment (fire alarm goes off, noisy children using adjacent corridor causing distraction)
  • We try to reduce error by establishing rapport, following standardized processes, and trying to ensure an appropriate testing environment – but we can never entirely eliminate error.
  • A person’s score on a psychometric test represents a “snapshot” of their performance at that time.
  • Purely theoretically (i.e., not taking into account practice effects), if they were to take the test again and again, their score would fluctuate.
  • This pattern of scores is assumed to follow a normal distribution, with the “true” score at the peak (the mean).
  • So, a person’s test score = their hypothetical “true score” plus error.
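
The idea that an observed score is a hypothetical true score plus error can be sketched with a small simulation. The true score of 100, the error standard deviation of 5, and the function name are all hypothetical values chosen for illustration, and the simulation ignores practice effects, as the thought experiment above does.

```python
import random
import statistics


def simulate_observed_scores(true_score: float, error_sd: float,
                             n: int, seed: int = 0) -> list[float]:
    """Simulate n repeated testings: observed score = true score + random error."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [true_score + rng.gauss(0, error_sd) for _ in range(n)]


# Hypothetical values: true score of 100, error SD of 5 points.
scores = simulate_observed_scores(true_score=100, error_sd=5, n=10_000)

# Individual scores fluctuate, but their mean settles near the true score.
print(f"lowest: {min(scores):.1f}, highest: {max(scores):.1f}")
print(f"mean of observed scores: {statistics.mean(scores):.1f}")
```

Any single testing gives only one draw from this distribution, which is why a score is treated as a snapshot rather than an exact value.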

Structure of Common IQ Tests

  • Each question is called an item – points are awarded for correct/appropriate responses.
  • Individual tasks are called subtests (e.g., Digit Span, Vocabulary), each made up of many items.
  • Subtests that are designed to measure aspects of the same broad area (e.g. fluid reasoning) are clustered together into composite scores.
  • A set number of subtests from across the composites are combined into a summary score, often called the Full Scale IQ (FSIQ).
  • Conceptually, this structure can be understood in alignment with CHC theory (though tests vary in how well they map onto CHC).

What Results Might Look Like

  • Points earned on individual items are summed to give a raw score for each subtest.
  • Raw scores are converted to standard scores by comparing them with normative data for the examinee’s age.
  • Subtest standard scores that measure the same area are added together to make a “sum of scaled scores”, and then this sum is converted to composite scores.
  • The FSIQ isn’t always the “full scale” – on the WISC-V, it is derived from only 7 of the 10 core subtests.
  • When interpreting test scores, we focus on the broadest measure, as it includes a wider range of tasks and is therefore more robust against error.
  • The FSIQ is the most reliable summary of overall ability; however, if the composite scores show marked highs and lows, interpreting the composites may be more meaningful.
  • Subtest scores should be interpreted with caution, and item-level results are too narrow to interpret, though they may sometimes offer interesting qualitative observations.

Use of IQ Tests

  • Part of the diagnostic criteria for Intellectual Developmental Disorder (Intellectual Disability).
  • Can give us an idea of an individual’s cognitive strengths and weaknesses.
  • Identifying giftedness, or even average ability, can indicate when somebody is underachieving.
  • Can be useful in differential diagnosis.
  • In bureaucratic systems, they can enable people to qualify for support where resources are limited.
  • Findings can bring insight to clients and those who support them, and can inform suitable intervention.

Important Note

  • Standardized tests (including IQ tests) should only ever be one small part of an assessment process.
  • Results should never be interpreted without the context you get from interviews, observations, and other collateral data sources.
  • Tests can give us new insights and can confirm our hypotheses with more “objective” data.
  • But diagnoses, interpretations, and recommendations are made based on a combination of professional skills and clinical judgment, not on test results alone, and test results do not have the final say on clinical decision making.

Nonverbal IQ Tests

  • The Army Mental Tests (Beta) consisted of nonverbal tasks to assess recruits with low literacy or English skills (e.g., mazes, picture sequences, puzzles).
  • Truly nonverbal tests were designed to provide a fairer and more valid measure of cognitive ability for people who are disadvantaged on language-loaded tests (e.g. d/Deaf, non-English speakers, speech or language difficulties).
  • Have minimal to no spoken instructions or required responses.
  • However, they measure a narrower set of cognitive abilities than language-loaded tests can.

Examples of Nonverbal IQ Tests

  • Multi-dimensional tests:
    • UNIT2 – Universal Nonverbal Intelligence Test – 2nd Edition
    • Leiter-3 – Leiter International Performance Scale – 3rd Edition
    • CTONI-2 – Comprehensive Test of Nonverbal Intelligence – 2nd Edition
  • Unidimensional tests:
    • WNV – Wechsler Nonverbal Scale of Ability
    • TONI-4 – Test of Nonverbal Intelligence – 4th Edition
  • None of these tests have Australian norms.

UNIT2 Example

  • For ages 5:0 to 21:11
  • In Queensland, most commonly used in schools as it is easier than the Leiter-3, and more comprehensive than some others
  • Administration is completely nonverbal, although you use words to help the examinee settle in to the session and to explain the gestures you will be using.
  • Devised based on the concept of fairness: it is language-free, it measures multiple indexes rather than one, there is minimal need for previously acquired knowledge, it has minimal emphasis on timed tasks, and it contains varied response modes.
  • High reliability and validity, including with several populations (cultural, language, Deaf/HoH)
  • Developed with input from representatives of many cultures (expert bias panels)
  • Six subtests, across three domains of cognitive ability (Memory, Reasoning, Quantitative)
  • Each domain has two subtests: one symbolic, and one non-symbolic. Examinees may draw on their existing knowledge or language for symbolic subtests.

Language and Cultural Bias

  • C-LTC (Culture-Language Test Classification) Framework and C-LIM (Culture-Language Interpretative Matrices) by Flanagan et al. (2007)