PSYC CHAPTER 6

Chapter 6: Item Response Theory: The “New” Kid on the Block

  • All content from Salkind, Tests and Measurement 3e. SAGE Publishing (2018) unless otherwise noted.

The Beginnings of Item Response Theory

  • Description:
    • Item Response Theory (IRT) is a newer and popular approach to the design, administration, and evaluation of tests.
    • IRT is a tool development strategy aimed at maximizing the understanding and measurement of an individual’s “true” ability.
    • It examines how well individual test items discriminate between test takers who know and do not know the material.

Understanding Item Response Theory

  • Core Focus:

    • IRT investigates the relationship between performance on each individual item of a test and the underlying ability of the test taker.
    • This underlying ability is referred to as a latent trait.
  • Definition of Latent Trait:

    • A latent trait is an ability or characteristic that exists but is not immediately observable or measurable.

Role of the Psychometrician

  • Responsibilities of a Modern Psychometrician:

    • The fundamental task is to accurately estimate an individual’s true underlying latent trait score.
  • Essential Concept:

    • A fundamental aspect of understanding IRT is the Item Characteristic Curve (ICC).

The Item Characteristic Curve (ICC)

  • Definition & Importance:
    • The ICC visually represents the relationship between the ability of the test taker and their probability of answering an item correctly.

Axes of the ICC

  • X-Axis (Horizontal):

    • Represents the latent or underlying trait/ability that the individual brings to the test item.
    • This “amount” of trait is expressed along this axis.
    • Average ability is indicated as “0” on the X-axis, with:
      • Above-average ability represented to the right.
      • Below-average ability represented to the left.
  • Y-Axis (Vertical):

    • Represents the probability of answering the item correctly (P) at a specific level of ability, theta (θ).
    • The probability ranges from 0 to 1 or 0% to 100%.
  • Performance Expectations:

    • At the lowest levels of ability, the probability of answering correctly should be very low.
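The axes described above can be illustrated with a minimal sketch. It assumes the ICC follows a simple logistic (S-shaped) curve, which is a common choice but is not specified in these notes; the function name `icc_probability` and its `difficulty` parameter are hypothetical.

```python
import math

def icc_probability(theta, difficulty=0.0):
    """P(correct) at ability theta, assuming a simple logistic ICC (an assumption)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

# Ability levels along the X-axis: below average (negative), average (0), above average (positive).
for theta in (-3, -1, 0, 1, 3):
    print(f"theta = {theta:+d}  P(correct) = {icc_probability(theta):.2f}")
```

Under this assumed curve, the probability is close to 0 at the lowest ability levels, exactly 0.5 where ability equals the item's difficulty, and close to 1 at the highest ability levels.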

Characteristics of the ICC

  • Determining Quality of Test Items:
    • The two primary characteristics that help determine whether a test item is effective in discrimination are:
    1. Difficulty Level:
      • Indicated by the location of the curve on the X-axis.
      • Items that are further to the right on the X-axis are harder.
    2. Discrimination Level:
      • Indicated by the steepness of the ICC.
      • A steeper curve indicates better discrimination between various levels of ability.
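The two characteristics above can be sketched in code. This example assumes a two-parameter logistic curve in which `a` controls steepness and `b` controls location on the X-axis; the four items and their values are hypothetical, chosen only for illustration.

```python
import math

def icc(theta, a, b):
    """Assumed two-parameter logistic ICC: a = discrimination (slope), b = difficulty (location)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def gap(a, b, low=-0.5, high=0.5):
    """Difference in P(correct) between two ability levels just above and below the item's difficulty."""
    return icc(b + high, a, b) - icc(b + low, a, b)

# Hypothetical items (values invented for illustration).
easy_item = {"a": 1.0, "b": -1.0}
hard_item = {"a": 1.0, "b": 1.5}
flat_item = {"a": 0.4, "b": 0.0}
steep_item = {"a": 2.5, "b": 0.0}

# A curve further to the right is harder: an average test taker (theta = 0) is less likely to succeed.
print(icc(0.0, **easy_item), icc(0.0, **hard_item))

# A steeper curve separates nearby ability levels more sharply than a flatter one.
print(gap(**flat_item), gap(**steep_item))
```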

Evaluating Test Items Using the ICC

  • Critical Questions:

    • Which item is the most difficult?
    • Which item is the easiest?
    • Comparison of items in terms of discrimination and difficulty.
  • References:

    • Urbina (2014) Essentials of Psychological Testing.
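Once values for difficulty and discrimination have been read off a set of ICCs, the critical questions above reduce to simple comparisons. The sketch below uses invented values for three hypothetical items; `a` stands for discrimination and `b` for difficulty.

```python
# Hypothetical item parameters (invented for illustration only).
items = {
    "Item 1": {"a": 0.6, "b": -1.2},
    "Item 2": {"a": 1.8, "b": 0.3},
    "Item 3": {"a": 1.1, "b": 1.6},
}

# Difficulty: the curve furthest to the right (largest b) is the hardest item.
most_difficult = max(items, key=lambda name: items[name]["b"])
easiest = min(items, key=lambda name: items[name]["b"])

# Discrimination: the steepest curve (largest a) separates ability levels best.
most_discriminating = max(items, key=lambda name: items[name]["a"])

print(most_difficult, easiest, most_discriminating)
```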

Understanding the Curve: Key Characteristics

  • The curve can be better understood through three defined characteristics:
    • Discrimination Level (a):
      • Represented by the slope of the curve.
      • A steeper curve indicates a larger difference in the probability of a correct response between test takers with differing theta values.
      • A flatter curve indicates that the item distinguishes little between test takers of differing ability.
    • Difficulty Level (b):
      • Identified by the location of the curve along the X-axis.
      • A high b value indicates a more difficult item.
      • If b < 0, the likelihood of a correct response for a test taker of average ability (theta = 0) is greater than 0.5.
      • If b > 0, the likelihood of a correct response for a test taker of average ability (theta = 0) is less than 0.5, setting guessing aside.
    • Probability of Correct Response (c):
      • This is the probability that test takers with very low ability will answer correctly, for example by guessing (often called the guessing parameter).
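The three characteristics are commonly combined in a single curve. The notes do not give a formula, so the following is offered as an assumption: the widely used three-parameter logistic form, with a, b, c, and theta as defined above.

```latex
P(\theta) = c + (1 - c)\,\frac{1}{1 + e^{-a(\theta - b)}}
```

Under this form, P(theta) approaches c as ability becomes very low, approaches 1 as ability becomes very high, and equals (1 + c)/2 when theta = b. With c = 0 this reduces to a curve that passes through 0.5 exactly at theta = b, which is why b < 0 implies a probability above 0.5 for a test taker of average ability.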

Integration of Characteristics for Test Creation

  • IRT in Practice:
    • IRT is primarily used in the assessment of test items using the following steps:
    1. Items are created by the test developer.
    2. Items are included in the test administration process.
    3. IRT evaluates the usefulness of each item individually, assessing how well each item functions.
    4. Test items are refined, rewritten, and redrafted based on the analysis.
    5. Items are revised as necessary according to updated values of a, b, and c until they fit the predetermined difficulty and discrimination levels.
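The review step in this cycle can be sketched as a simple screening function. Everything specific here is hypothetical: the function name `flag_items`, the target thresholds, and the estimated values are invented for illustration, and real targets would be set by the test developer.

```python
def flag_items(items, b_range=(-2.0, 2.0), min_a=0.8, max_c=0.25):
    """Return items whose estimated a, b, c miss the (hypothetical) predetermined targets."""
    flagged = {}
    for name, p in items.items():
        reasons = []
        if not b_range[0] <= p["b"] <= b_range[1]:
            reasons.append("difficulty outside target range")
        if p["a"] < min_a:
            reasons.append("discrimination too low")
        if p["c"] > max_c:
            reasons.append("guessing too high")
        if reasons:
            flagged[name] = reasons
    return flagged

# Invented estimates for three items after one round of analysis.
estimates = {
    "Item 1": {"a": 1.4, "b": 0.2, "c": 0.15},
    "Item 2": {"a": 0.3, "b": -0.5, "c": 0.20},
    "Item 3": {"a": 1.1, "b": 2.8, "c": 0.40},
}

flagged = flag_items(estimates)
for name, reasons in flagged.items():
    print(name, "needs revision:", ", ".join(reasons))
```

Flagged items would be rewritten and re-administered, and the check repeated on the updated estimates.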

IRT and Computerized Adaptive Testing (CAT)

  • Functionality:
    • In a CAT, the computer possesses information about the difficulty and discrimination levels of test items.
    • If a test taker answers an item incorrectly, suggesting the item was too difficult for them, the computer can adjust the subsequent item, possibly selecting an easier one.
    • The system adjusts dynamically to the individual's performance so that their underlying (latent) ability is estimated more accurately.
    • This individualization means that each test taker effectively takes a unique assessment.
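The adaptive behavior described above can be sketched as a toy loop. This is a deliberately crude illustration, not how an operational CAT works: the item bank, the function names, and the step-halving update rule are all assumptions, and real systems estimate ability with statistical methods.

```python
def next_item(ability_estimate, item_bank, used):
    """Pick the unused item whose difficulty (b) is closest to the current ability estimate."""
    candidates = [name for name in item_bank if name not in used]
    return min(candidates, key=lambda name: abs(item_bank[name] - ability_estimate))

def run_cat(item_bank, answers_correctly, n_items=4, step=1.0):
    """Administer n_items adaptively; answers_correctly(name) returns True or False."""
    estimate, used = 0.0, []  # start at average ability
    for _ in range(n_items):
        name = next_item(estimate, item_bank, used)
        used.append(name)
        # Crude update (assumption): move up after a correct answer, down after an error,
        # halving the step each time so the estimate settles.
        estimate += step if answers_correctly(name) else -step
        step /= 2
    return used, estimate

# Hypothetical item bank mapping item name to difficulty (b).
bank = {"A": -2.0, "B": -1.0, "C": 0.0, "D": 0.4, "E": 1.0, "F": 2.0}

# Hypothetical test taker who answers correctly only items with difficulty below 0.7.
used, estimate = run_cat(bank, lambda name: bank[name] < 0.7)
print(used, estimate)
```

In this run the test taker misses item E and is then given the easier item D, which mirrors the adjustment described above. A different test taker would see a different sequence of items.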

Analyzing Test Data Using IRTPRO

  • Overview:
    • Various computer programs are available for conducting IRT analyses, including IRTPRO.
  • Example Scenario:
    • Consider a 20-item test taken by 500 participants, where each item score is recorded as either correct (1) or incorrect (0).
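The data layout in this scenario can be sketched by simulating it. This is not IRTPRO and does not reproduce its input format or output; it only builds a 500 by 20 matrix of 0/1 scores under an assumed logistic response model, with invented abilities and item difficulties.

```python
import math
import random

random.seed(0)  # fixed seed so the simulated data are reproducible

N_PEOPLE, N_ITEMS = 500, 20

# Hypothetical abilities (theta) and evenly spaced item difficulties from -2 to +2.
abilities = [random.gauss(0.0, 1.0) for _ in range(N_PEOPLE)]
difficulties = [-2.0 + 4.0 * i / (N_ITEMS - 1) for i in range(N_ITEMS)]

def p_correct(theta, b):
    """Assumed logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# responses[p][i] is 1 if person p answered item i correctly, otherwise 0.
responses = [
    [1 if random.random() < p_correct(theta, b) else 0 for b in difficulties]
    for theta in abilities
]

# Proportion of the 500 participants answering each item correctly.
proportion_correct = [
    sum(row[i] for row in responses) / N_PEOPLE for i in range(N_ITEMS)
]
print([round(p, 2) for p in proportion_correct])
```

A matrix of this shape, one row per participant and one column per item, is the kind of scored data an IRT program takes as its starting point.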

Probability Outputs from IRT Analysis

  • Sample Output Description:
    • The IRT output presents probabilities associated with items and test takers’ abilities, denoted as theta (θ).
  • Graphical Representation:
    • Plots of real data show how the probability of a correct response relates to the level of ability (theta).
    • Visualize where a “Good” item lies based on these characteristics.

Conclusion

  • Summary of Key Learnings:
    • Understanding IRT and the ICC is crucial for effective test design and evaluation.
    • Mastery of the nuances of difficulty, discrimination, and probability is imperative for modern psychometricians.