A collection of flashcards focusing on key terms and concepts related to reliability in measurement and testing.
Reliability Coefficient
Index of reliability; the proportion of total score variance that is attributable to true score variance.
Classical Test Theory
Assumes a score reflects both true ability and error.
Observed Score (X)
The score that a test-taker actually receives, represented by the formula X = T + E.
True Score (T)
The score that reflects a test-taker's actual ability without error.
Error (E)
The component of the observed score that does not reflect the test-taker's true ability.
Measurement Error
Factors associated with measuring a variable other than the variable itself.
Random Error
Unpredictable fluctuations causing inconsistencies in measurement.
Systematic Error
Consistent, predictable error affecting measurements that can be identified and corrected.
Test Construction
Variation among test items within and between tests (item or content sampling); a source of error variance that can affect reliability.
Test Administration
Influences during testing, such as the environment and test-taker variables, that may introduce error.
Measurement Variance
Variance can be broken into true variance and error variance.
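The variance partition behind the reliability coefficient can be illustrated with a small numeric sketch (all scores hypothetical). When error is uncorrelated with true scores, true and error variance add up to the observed variance, and reliability is the true-to-total ratio:

```python
def pvar(xs):
    """Population variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# hypothetical true scores and errors for five test-takers
# (errors chosen to be uncorrelated with the true scores)
true_scores = [100, 110, 90, 105, 95]
errors = [4, -1, -1, -1, -1]
observed = [t + e for t, e in zip(true_scores, errors)]

var_t, var_e, var_x = pvar(true_scores), pvar(errors), pvar(observed)
reliability = var_t / var_x   # true variance / total variance, ~0.93
```

Here var_x (54) equals var_t (50) plus var_e (4), so the reliability coefficient is 50/54.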
Test-Retest Reliability
Correlating scores from the same test administered at two different times.
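As a sketch with hypothetical scores, a test-retest estimate is simply the Pearson correlation between the two administrations:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two score lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# hypothetical scores for five test-takers at two administrations
time1 = [85, 90, 78, 92, 88]
time2 = [83, 91, 80, 94, 86]
r_test_retest = pearson_r(time1, time2)   # ~0.93
```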
Carryover Effect
Influence of the first test on the results of the second test.
Parallel-Forms Reliability
Degree of relationship between various forms of a test.
Split-Half Reliability
Correlating pairs of scores from equivalent halves of a single test.
Spearman-Brown Formula
Adjusts split-half reliability estimates to account for the test's length.
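The Spearman-Brown adjustment has a one-line form; a sketch with a hypothetical split-half correlation of .70:

```python
def spearman_brown(r, n=2.0):
    """Estimated reliability of a test lengthened by factor n, given reliability r."""
    return n * r / (1 + (n - 1) * r)

# a split-half correlation of 0.70 describes half-length reliability;
# adjusting to full length (n = 2) raises the estimate:
full_test_r = spearman_brown(0.70)   # ~0.82
```

The same formula with n < 1 estimates the reliability of a shortened test.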
Internal Consistency
Degree of correlation among items within a test.
Cronbach’s Coefficient Alpha
Estimates internal consistency reliability, especially for nondichotomous items.
Kuder-Richardson Formula 20 (KR-20)
Used for determining the internal consistency of dichotomous items.
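KR-20 is the dichotomous special case of coefficient alpha: each item's variance reduces to p(1-p), the product of the proportions passing and failing it. A sketch with hypothetical right/wrong data:

```python
def pvar(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def kr20(rows):
    """rows: one list of 0/1 item scores per test-taker."""
    k = len(rows[0])
    items = list(zip(*rows))
    pq_sum = sum((p := sum(it) / len(it)) * (1 - p) for it in items)
    totals = [sum(r) for r in rows]
    return k / (k - 1) * (1 - pq_sum / pvar(totals))

# hypothetical right/wrong scores for five test-takers on three items
scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0], [1, 1, 1]]
reliability = kr20(scores)   # ~0.79
```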
Standard Error of Measurement (SEM)
Provides a measure of how much error is inherent in an observed score.
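SEM follows from the test's standard deviation and its reliability; a sketch with hypothetical IQ-style numbers:

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: sd * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

# e.g. a scale with sd = 15 and reliability = 0.91
error_band = sem(15, 0.91)   # ~4.5
```

The higher the reliability, the smaller the SEM and the tighter the band of likely error around an observed score.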
Coefficient of Inter-Scorer Reliability
Degree of agreement between different scorers for the same measure.
Kappa Statistic
Measures agreement between two or more raters, adjusted for chance.
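For two raters, Cohen's kappa compares observed agreement with the agreement expected by chance from the raters' marginal totals. A sketch on a hypothetical 2x2 agreement table:

```python
def cohens_kappa(table):
    """table[i][j]: count of cases rater 1 put in category i and rater 2 in j."""
    n = sum(sum(row) for row in table)
    p_obs = sum(table[i][i] for i in range(len(table))) / n
    row_m = [sum(row) / n for row in table]
    col_m = [sum(col) / n for col in zip(*table)]
    p_exp = sum(r * c for r, c in zip(row_m, col_m))
    return (p_obs - p_exp) / (1 - p_exp)

# hypothetical agreement table for 50 cases:
# raters agree on 20 + 15 cases, disagree on 5 + 10
kappa = cohens_kappa([[20, 5], [10, 15]])   # ~0.40
```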
Generalizability Theory
Examines the extent to which test scores generalize across different testing conditions (facets such as raters, items, and occasions).
Item Response Theory (IRT)
Models the probability that a test-taker with a given level of the underlying ability will respond to an item in a particular way (e.g., answer it correctly).
Polytomous Test Items
Items that can be scored with three or more responses.
Dichotomous Test Items
Items that can be answered with one of two alternative responses.
Discrimination
Degree to which an item differentiates among individuals with different traits.
Validity
The extent to which a test measures what it claims to measure.
Response Bias
The tendency of test-takers to respond in a certain way regardless of the content.
Test Validity
Refers to how accurately a test measures what it is intended to measure.
Criterion Validity
The extent to which a measure is related to an outcome.
Construct Validity
How well a test measures a theoretical concept or construct.
Content Validity
The degree to which test items adequately represent the construct being measured.
Reliability Estimate
Calculated to determine the consistency of a test score.
Confidence Interval
A range of values that is likely to contain the true score.
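One common construction (a simplification; some texts center the interval on an estimated true score instead) builds the interval from the observed score and the SEM. A sketch with hypothetical numbers:

```python
import math

def true_score_ci(observed, sd, reliability, z=1.96):
    """Approximate 95% confidence interval around an observed score."""
    sem = sd * math.sqrt(1 - reliability)
    return observed - z * sem, observed + z * sem

# observed score 100 on a scale with sd = 15, reliability = 0.91
low, high = true_score_ci(100, 15, 0.91)   # ~(91.2, 108.8)
```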
Sample Size Impact on Reliability
Larger sample sizes typically yield more accurate reliability estimates.
Item Analysis
Examines the effectiveness of each test item in measuring the construct.
Factor Analysis
Statistical method used to identify the underlying relationships between variables.
Discrimination Index
Measures how well an item differentiates between high and low scorers.
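In its simplest form the discrimination index D is the difference between the proportions of upper- and lower-group test-takers answering the item correctly. A sketch with hypothetical counts:

```python
def discrimination_index(upper_correct, lower_correct, group_size):
    """D = proportion correct in upper group minus proportion in lower group."""
    return upper_correct / group_size - lower_correct / group_size

# hypothetical item: 8 of the top 10 scorers answered correctly, 3 of the bottom 10
d = discrimination_index(8, 3, 10)   # ~0.5
```

D near +1 means the item strongly favors high scorers; D near 0 (or negative) flags an item that fails to discriminate.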
Attenuation Correction
Adjusting correlations for the effects of measurement error.
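The classical correction for attenuation divides the observed correlation by the square root of the product of the two measures' reliabilities. A sketch with hypothetical values:

```python
import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimated correlation between x and y if both were measured without error."""
    return r_xy / math.sqrt(r_xx * r_yy)

# observed r = .50 between two measures with reliabilities .80 and .90
r_corrected = correct_for_attenuation(0.50, 0.80, 0.90)   # ~0.59
```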
Observed Score Variation
Variance in test scores as affected by both true scores and error.
External Validity
The extent to which findings can be generalized to settings and populations outside the study.
Stable Traits
Traits that are relatively unchanging over time.
Dynamic Traits
Traits or abilities that can change over time.
Sampling Error
Error caused by observing a sample instead of the whole population.
True Score Model
Theory positing that individuals have a true score that represents their actual ability.
Domain Sampling Theory
Views a test as a sample of items drawn from a larger domain of all possible items; reliability reflects how precisely the sample represents that domain.
Construct Reliability
Consistency of a measure across different circumstances.
Standardized Tests
Tests administered and scored under uniform, prescribed conditions, typically normed on a reference population to support reliability and validity.
Test Authoring
Process of creating tests to ensure appropriate measurement of constructs.
Error Variance
Variance in test scores attributed to measurement errors.
Practice Effects
Improvements in test performance due to repeated exposure to test items.
Assessment Methods
Variety of techniques used to measure constructs like personality or ability.
Item Difficulty Level
Level of challenge posed by test items to the test-takers.
Test-taker Variables
Personal factors impacting a test-taker's performance.
Administering Conditions
Conditions under which a test is administered that can affect outcomes.
Quantitative Assessment
Measurement methods that rely heavily on numerical data.
Qualitative Assessment
Measurement approaches focusing on non-numeric data, like interviews.
Final Score Calculation
Process of deriving a test score from observed performances.
Testing Paradigms
Frameworks guiding the design and interpretation of tests.
Statistical Power
The likelihood that a test will correctly reject a false null hypothesis.
P-value
Probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true.
Standard Error of Difference
Provides a measure to assess the significance of differences between two scores.
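When both scores are on the same scale, the standard error of the difference can be written in terms of the scale's standard deviation and the two reliabilities (equivalently, the root of the summed squared SEMs). A sketch with hypothetical subtest values:

```python
import math

def se_difference(sd, r1, r2):
    """Standard error of the difference between two scores on the same scale."""
    return sd * math.sqrt(2 - r1 - r2)

# two subtests with sd = 15 and reliabilities .90 and .85
sed = se_difference(15, 0.90, 0.85)   # ~7.5
```

A score difference has to exceed roughly z times this value (e.g. 1.96 x 7.5 at the .05 level) before it is treated as significant.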
Consistency Across Measures
The extent to which test results are stable across different conditions.
Random Sample
A sample that fairly represents a population due to random selection.
Longitudinal Studies
Research studies that follow the same subjects over a period of time.
Cross-Sectional Studies
Studies that analyze data from a population at a specific point in time.
Behavioral Observations
Assessments based on observing individuals' behavior in various contexts.
Research Ethics
Moral principles guiding researchers in conducting their work.
Test Administration Procedures
Standardized methods for giving tests to ensure fairness and consistency.
Behavioral Checklists
Tools used for rating behaviors based on specific criteria.
Nonverbal Assessment
Evaluation methods that do not rely on verbal responses.
Chronic Conditions
Ongoing health issues that may affect test performance.
Emotional State Impact
Influence of a test-taker's mood on test performance.
Test User Training
Education for test administrators to ensure proper use of tests.
Effect of Instructions on Responses
How guidance given to test-takers can shape their answers.
Reliability in Educational Settings
Importance of consistency in assessments used in academic environments.