PSYC CHAPTER 6
Chapter 6: Item Response Theory: The “New” Kid on the Block
- All content from Salkind, Tests and Measurement 3e. SAGE Publishing (2018) unless otherwise noted.
The Beginnings of Item Response Theory
- Description:
- Item Response Theory (IRT) is a newer and popular approach to the design, administration, and evaluation of tests.
- IRT is a tool development strategy aimed at maximizing the understanding and measurement of an individual’s “true” ability.
- It examines how well individual test items discriminate between test takers who know the material and those who do not.
Understanding Item Response Theory
Core Focus:
- IRT investigates the relationship between performance on each individual item of a test and the underlying ability of the test taker.
- This underlying ability is referred to as a latent trait.
Definition of Latent Trait:
- A latent trait is an ability or characteristic that exists but is not immediately observable or measurable.
Role of the Psychometrician
Responsibilities of a Modern Psychometrician:
- The fundamental task is to accurately estimate an individual’s true underlying latent trait score.
Essential Concept:
- A fundamental aspect of understanding IRT is the Item Characteristic Curve (ICC).
The Item Characteristic Curve (ICC)
- Definition & Importance:
- The ICC visually represents the relationship between the ability of the test taker and their probability of answering an item correctly.
Axes of the ICC
X-Axis (Horizontal):
- Represents the latent or underlying trait/ability that the individual brings to the test item.
- This “amount” of trait is expressed along this axis.
- Average ability is indicated as “0” on the X-axis, with:
- Above-average ability represented to the right.
- Below-average ability represented to the left.
Y-Axis (Vertical):
- Represents the probability of answering the item correctly (P) based on a specific level of ability (θ, theta).
- The probability ranges from 0 to 1 or 0% to 100%.
Performance Expectations:
- At the lowest levels of ability, the probability of answering correctly should be very low.
Characteristics of the ICC
- Determining Quality of Test Items:
- The two primary characteristics that help determine whether a test item is effective in discrimination are:
- Difficulty Level:
- Indicated by the location of the curve on the X-axis.
- Items that are further to the right on the X-axis are harder.
- Discrimination Level:
- Indicated by the steepness of the ICC.
- A steeper curve indicates better discrimination between various levels of ability.
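The effect of location and steepness can be sketched numerically. The notes do not give a formula for the ICC, so the logistic form below is an assumption (a commonly used two-parameter form), and the three items and their values are hypothetical.

```python
import math

def icc(theta, a, b):
    """Probability of a correct response at ability theta.

    Assumed two-parameter logistic form: a = discrimination, b = difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical items given as (a, b) = (discrimination, difficulty)
items = {
    "steep": (2.0, 0.0),   # discriminates sharply around average ability
    "flat": (0.5, 0.0),    # same difficulty, weaker discrimination
    "harder": (2.0, 1.5),  # same steepness, curve shifted to the right
}

# Print each item's probability of a correct answer at several ability levels
for theta in (-2, -1, 0, 1, 2):
    row = "  ".join(f"{name}={icc(theta, a, b):.2f}" for name, (a, b) in items.items())
    print(f"theta={theta:+d}  {row}")
```

Reading across a row, the "steep" item separates below-average from above-average test takers far more than the "flat" item, while the "harder" item gives lower probabilities at every ability level shown.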
Evaluating Test Items Using the ICC
Critical Questions:
- Which item is the most difficult?
- Which item is the easiest?
- Comparison of items in terms of discrimination and difficulty.
References:
- Urbina (2014) Essentials of Psychological Testing.
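Once each item has a difficulty and a discrimination value, the critical questions above reduce to simple comparisons. The sketch below uses hypothetical estimates (they are not taken from the lecture figure or from Urbina) and assumes larger values mean a harder or more discriminating item.

```python
# Hypothetical estimates: a = discrimination, b = difficulty
items = {
    "Item 1": {"a": 1.2, "b": -1.0},
    "Item 2": {"a": 0.6, "b": 0.0},
    "Item 3": {"a": 1.8, "b": 1.5},
}

# The hardest item sits furthest right on the X-axis (largest b)
most_difficult = max(items, key=lambda name: items[name]["b"])
# The easiest item sits furthest left (smallest b)
easiest = min(items, key=lambda name: items[name]["b"])
# The most discriminating item has the steepest curve (largest a)
most_discriminating = max(items, key=lambda name: items[name]["a"])

print("Most difficult:", most_difficult)
print("Easiest:", easiest)
print("Most discriminating:", most_discriminating)
```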
Understanding the Curve: Key Characteristics
- The curve can be better understood through three defined characteristics:
- Discrimination Level (a):
- Represented by the slope of the curve.
- A steeper curve indicates a larger difference between the probability of a correct response for test takers with differing theta values.
- A flatter curve indicates little distinction between test takers regarding ability.
- Difficulty Level (b):
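- Identified by the location of the curve along the X-axis. When guessing is ignored, b is the ability level (θ) at which the probability of a correct response is 0.5.
- A high b value indicates a more difficult item.
- If b < 0, a test taker of average ability (θ = 0) has a likelihood of a correct response greater than 0.5.
- If b > 0, a test taker of average ability (θ = 0) has a likelihood of a correct response less than 0.5, when guessing is ignored.
- Probability of Correct Response (c):
- This is the probability that test takers with very low ability will answer correctly by guessing (often called the guessing parameter). It sets the lower bound of the curve.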
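The three characteristics can be combined in a single expression. The notes do not state a formula, so the three-parameter logistic form below is an assumption (it is a commonly used one; some presentations add a scaling constant to the exponent).

```latex
P(\theta) = c + (1 - c)\,\frac{1}{1 + e^{-a(\theta - b)}}

% At \theta = b the exponent is zero, so
P(b) = c + \frac{1 - c}{2} = \frac{1 + c}{2}

% With no guessing (c = 0) this gives P(b) = 0.5. Because P(\theta) increases
% with \theta, a test taker of average ability (\theta = 0) then has
% P(0) > 0.5 when b < 0 and P(0) < 0.5 when b > 0.
```

Under this form, a controls how quickly the probability rises around b, b shifts the curve left or right, and c is the floor the curve approaches at very low ability.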
Integration of Characteristics for Test Creation
- IRT in Practice:
- IRT is primarily used in the assessment of test items using the following steps:
- Items are created by the test developer.
- Items are included in a test administration.
- IRT evaluates the usefulness of each item individually, assessing how well each item functions.
- Test items are refined, rewritten, and redrafted based on the analysis.
- Items are revised as necessary according to updated values of a, b, and c until they fit the predetermined difficulty and discrimination levels.
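The revision step above can be sketched as a screening pass over the estimated values. The target levels, item names, and estimates here are all hypothetical; in practice the developer sets the targets and the estimates come from the IRT analysis.

```python
# Hypothetical target levels chosen by the test developer
MIN_A = 0.8            # minimum acceptable discrimination
B_RANGE = (-2.0, 2.0)  # acceptable range of difficulty
MAX_C = 0.25           # maximum acceptable probability of a correct guess

def revision_reasons(a, b, c):
    """Return the reasons an item misses the targets (empty list = keep as is)."""
    reasons = []
    if a < MIN_A:
        reasons.append("discrimination too low")
    if not B_RANGE[0] <= b <= B_RANGE[1]:
        reasons.append("difficulty outside target range")
    if c > MAX_C:
        reasons.append("guessing probability too high")
    return reasons

# Hypothetical (a, b, c) estimates for three draft items
estimates = {
    "item1": (1.4, -0.3, 0.20),
    "item2": (0.4, 0.5, 0.20),
    "item3": (1.1, 2.6, 0.35),
}

for name, (a, b, c) in estimates.items():
    reasons = revision_reasons(a, b, c)
    print(name, "keep" if not reasons else "revise: " + ", ".join(reasons))
```

Flagged items would be rewritten, administered again, and re-estimated until no reasons remain.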
IRT and Computerized Adaptive Testing (CAT)
- Functionality:
- In a CAT, the computer possesses information about the difficulty and discrimination levels of test items.
- If an item proves too difficult for the test taker (for example, it is answered incorrectly), the computer can adjust the next item presented, typically selecting an easier one.
- The system dynamically adjusts based on the individual's performance to enhance assessment accuracy of their underlying or latent ability.
- This individualization means that each test taker effectively takes a unique assessment.
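The adaptive logic can be illustrated with a deliberately simplified sketch. It is not how any particular CAT system works: it assumes a hypothetical item bank described only by difficulty, and it uses a simple step-up/step-down rule in place of the statistical ability estimation an operational CAT would use.

```python
def run_cat(bank, answers_correctly, n_items=5, step=1.0):
    """Administer n_items adaptively; return (ability estimate, items given).

    bank: dict mapping item name -> difficulty (b)
    answers_correctly: function taking a difficulty and returning True/False
    """
    estimate = 0.0            # start at average ability
    remaining = dict(bank)    # copy so the caller's bank is not modified
    administered = []
    for _ in range(min(n_items, len(remaining))):
        # Choose the unused item whose difficulty is closest to the current estimate
        name = min(remaining, key=lambda k: abs(remaining[k] - estimate))
        b = remaining.pop(name)
        correct = answers_correctly(b)
        # Correct answer: move the estimate up (harder item next); incorrect: move down
        estimate += step if correct else -step
        step /= 2             # take smaller steps as evidence accumulates
        administered.append((name, correct))
    return estimate, administered

# Hypothetical bank: nine items with difficulties from -2.0 to 2.0
bank = {f"item{i + 1}": -2.0 + 0.5 * i for i in range(9)}

# Hypothetical test taker who answers correctly whenever difficulty is at most 1.2
estimate, administered = run_cat(bank, lambda b: b <= 1.2)
print(administered)
print("Final ability estimate:", estimate)
```

A test taker with a different ability would be routed through a different sequence of items, which is why each person effectively takes a unique test.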
Analyzing Test Data Using IRTPRO
- Overview:
- Various computer programs are available for conducting IRT analyses, including IRTPRO.
- Example Scenario:
- Consider a 20-item test taken by 500 participants, where each item score is recorded as either correct (1) or incorrect (0).
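The shape of the data in this scenario can be sketched as a 500 x 20 matrix of 0s and 1s. The responses below are simulated, not real, and the response model, the ability distribution, and the spread of difficulties are assumptions made only to produce an example. The sketch does not reproduce IRTPRO's input format or its estimation method.

```python
import math
import random

random.seed(0)  # fixed seed so the simulated data are reproducible
N_PEOPLE, N_ITEMS = 500, 20

# Hypothetical item difficulties spread evenly from -2 (easy) to +2 (hard)
difficulties = [-2.0 + 4.0 * j / (N_ITEMS - 1) for j in range(N_ITEMS)]
# Hypothetical abilities centered on 0 (average ability)
abilities = [random.gauss(0.0, 1.0) for _ in range(N_PEOPLE)]

def p_correct(theta, b):
    """Assumed logistic response model using difficulty only."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# One row per participant, one column per item: 1 = correct, 0 = incorrect
responses = [
    [1 if random.random() < p_correct(theta, b) else 0 for b in difficulties]
    for theta in abilities
]

# Proportion of participants answering each item correctly
prop_correct = [sum(row[j] for row in responses) / N_PEOPLE for j in range(N_ITEMS)]
print("Easiest item, proportion correct:", prop_correct[0])
print("Hardest item, proportion correct:", prop_correct[-1])
```

An IRT program takes a matrix like `responses` as its starting point and estimates the item characteristics and each test taker's theta from it.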
Probability Outputs from IRT Analysis
- Sample Output Description:
- The IRT output presents probabilities associated with items and test takers’ abilities, denoted theta (θ).
- Graphical Representation:
- A plot of real data showing how the probability of a correct response relates to level of ability (θ).
- Visualize where a “Good” item lies based on these characteristics.
Conclusion
- Summary of Key Learnings:
- Understanding IRT and the ICC is crucial for effective test design and evaluation.
- Mastery of the nuances of difficulty, discrimination, and probability is imperative for modern psychometricians.