CLI 2
achievement test
A standardized ax designed to evaluate a person’s proficiency in a specific area of knowledge/skill. Focuses on what has been learned.
age-based norms
compare an individual’s performance to peers within specific age range
age-equivalent scores
a measure of an individual’s development level on a specific skill/ability, expressed in terms of the chronological age at which an avg person demonstrates that same level of performance
alternative or alternate forms
different versions of the same test designed to measure the same construct…to minimize test-retest memory effects
aptitude test
standardized ax designed to measure an individual’s natural ability or potential to learn and perform in specific domains
basal (see also, double basal)
the point at which an examinee is assumed to have mastered all easier items in a test; starting point for scoring
bias
systematic errors in measurement that can affect validity of test results
ceiling effects
phenomenon where a test reaches its maximum possible score, resulting in a clustering of data at upper limit…no longer able to accurately differentiate between individuals who have reached the max score
chronological-age referencing
measuring an individual’s age in yy/mm/dd since birth
classical test theory
psychometric framework that models observed test scores as the sum of true score and random error, aiming to assess the reliability and validity of psychological and educational tests
cognitive (-age) referencing (see also, mental age referencing)
method of ax-ing an individual’s cognitive abilities and development compared to typical progression in their society
composite score
derived from the combo of 2+ individual scores, typically from multiple tests or variables, to create a single, reliable measure of a latent construct
concurrent validity
assesses how well a new test correlates with an est., valid measure of the same construct when both are administered at the same time
confidence interval
provides a range of values within which the true population parameter likely lies…to give an idea of the reliability of researchers’ estimates and to quantify the uncertainty associated with findings
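A minimal Python sketch of a 95% CI for a sample mean, using hypothetical scores and the normal critical value 1.96 (a t critical value would widen the interval slightly for a sample this small):

```python
import statistics as stats

# Hypothetical sample of test scores (illustrative only)
scores = [88, 92, 85, 91, 87, 90, 86, 93, 89, 84]

n = len(scores)
mean = stats.mean(scores)
se = stats.stdev(scores) / n ** 0.5   # standard error of the mean

# 95% CI: mean +/- 1.96 standard errors
lower, upper = mean - 1.96 * se, mean + 1.96 * se
```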
construct validity
the extent to which a test accurately assesses the theoretical construct it is intended to measure
content validity
the extent to which a test accurately reflects the concept (covers the entire domain) it aims to measure
convergent validity (see also, divergent validity)
the extent to which test scores correlate with scores on other tests that measure the same or similar constructs
correlation
statistical relationship between 2+ variables
criterion-referenced assessment
measures a student’s performance against a set of predetermined criteria or standards, rather than against other students’ performance
criterion-related validity
The degree to which scores from a construct ax correlate with a manifestation of that construct in the real world. Evaluates how accurately a test measures the outcome it was designed to measure, by comparing results against an established, trusted standard (criterion)
cut-score
the minimum score on a test or ax that a student must achieve to be considered proficient or to pass
diagnosis
the identification of the nature of an illness or other problem by examination of the symptoms
differential diagnosis
the process of identifying the specific condition or disorder a pt is experiencing by systematically considering and ruling out all possible causes or explanations for the observed symptoms
discrepancy formula
used to analyze differences between a person’s cognitive abilities and their academic performance
discriminant validity
the extent to which a measure does NOT correlate strongly with measures of different, unrelated constructs, ensuring that a test measures what it is intended to
distractors
an incorrect option in a MC test item that receives no credit; designed to be plausible yet wrong
distribution
the way values of a variable are spread out over a range of possible outcomes
divergent validity
The extent to which a measure does NOT correlate strongly with measures of different, unrelated constructs. Indicates that the results obtained by a measurement do NOT correlate too strongly with measurements of a similar but distinct trait.
domain-specific measure
a measurement tailored to a particular industry, field, or application, or domain of language (e.g., semantics, morphology)
double basal
an additional string of consecutive correct items that meets a second basal; all items below it are then counted as correct, even if the examinee responded incorrectly to some of them
dynamic assessment
evaluates cognitive abilities by focusing on learning potential rather than static performance. test-teach-retest format (based on Vygotsky’s ZPD)
ecological validity
the extent to which research findings can be generalized to real-world settings
face validity
the extent to which a test appears to measure what it is intended to measure, based on superficial inspection and subjective judgment
false negative
a situation where a test or dx procedure incorrectly reports the absence of a condition when the condition is, in fact, present…a failure of DETECTION (reflects poor sensitivity)
false positive
test result incorrectly indicates the presence of a condition, such as a disease, when it is not present (reflects poor specificity)
floor effects
a measurement instrument fails to differentiate between individuals or groups at lower end of measurement scale, e.g. minimum possible score is set too high
formative assessment
range of in/formal ax procedures conducted by teachers during learning process to monitor student learning and provide ongoing feedback…to ID strengths and weaknesses, adjust teaching strategies, etc.
frequency distribution
method for organizing data and determining how often each value occurs within a dataset…to ID patterns, trends, and comparisons within data
grade-based norms
standardized measures that indicate typical/avg performance of students at a particular grade level
inter-examiner reliability
the degree of agreement among different examiners when evaluating same conditions or outcomes
intra-examiner reliability
consistency of evaluations conducted by SAME examiner over multiple instances
IQ test
standardized ax designed to measure a range of cognitive abilities and provide a score that reflects an individual’s intellectual capabilities and potential. Ax logic, reasoning, problem-solving, etc.
item analysis
evaluate the effectiveness and quality of individual test items (questions) within an ax
item response theory
psychometric framework for the design, analysis, and scoring of tests and questionnaires…focuses on relationship between latent traits (unobservable characteristics) and their manifestations in observed responses or performance…to improve accuracy of ax’s by modeling the probability of a specific response based on individual’s latent traits
likelihood ratio
statistical measure used to evaluate the effectiveness of a dx test…compares the likelihood of a given test result in pts with a specific condition, to the likelihood of that same result in pts without the condition. Higher LR = greater probability of disease being present
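Using hypothetical sensitivity and specificity values, the positive and negative likelihood ratios can be sketched as:

```python
# Hypothetical test characteristics (illustrative values)
sensitivity = 0.90   # P(positive result | condition present)
specificity = 0.80   # P(negative result | condition absent)

# LR+ : how much a positive result raises the odds of the condition
lr_positive = sensitivity / (1 - specificity)   # 0.90 / 0.20 = 4.5

# LR- : how much a negative result lowers those odds
lr_negative = (1 - sensitivity) / specificity   # 0.10 / 0.80 = 0.125
```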
Likert scale
survey tool used to measure attitudes, opinions, behaviors by indicating level of dis/agreement with a statement, often on a 5- or 7-point scale
mean
the number you get by dividing the sum of a set of values by the number of values in the set
median
the value that is sequentially in the middle in a set of numbers
mental age referencing
measures an individual’s level of mental development relative to others
mode
which value appears the most in a set of numbers
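The three measures of central tendency above, computed on a hypothetical score set with Python’s standard library:

```python
import statistics as stats

# Hypothetical quiz scores (illustrative only)
scores = [3, 7, 7, 2, 9, 7, 4, 5, 6]

mean = stats.mean(scores)      # sum of values / number of values -> 50 / 9
median = stats.median(scores)  # middle value of the sorted list -> 6
mode = stats.mode(scores)      # most frequent value -> 7
```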
normal distribution
symmetric around the mean. mean, median, and mode are all equal and located at the center of the distribution. bell-shape
normal curve
a symmetrical probability distribution in statistics. half the data falls to the left of the mean, half to the right: evenly distributed
normative sample
carefully selected group of individuals whose test performances est. the benchmark against which all other test-takers’ performances are compared
norm-referenced assessment/test
compares a student’s performance to that of their peers, ranking on a bell curve to determine relative standing
norms
the score distribution in a representative sample, providing the standard frame to compare individual scores
omnibus measure
a statistical test that ax’s the overall significance of multiple conditions or variables. used to determine if there are any significant differences among groups or variables without specifying which specific differences exist.
operational definition
specifies how a concept or variable is measured or observed in research…translates abstract constructs like “intelligence” into observable and quantifiable variables
percentile (percentile score, percentile rank)
score: a statistical measure that indicates the position of a score within a distribution. the value below which a certain percentage of observations fall
rank: the percentage of individuals in a reference group who scored lower than a particular individual on a test. ex. a score in 90th percentile = 90% of scores are below that
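A small sketch of the percentile-rank idea (hypothetical reference group; this version counts only scores strictly below, though some conventions count ties or use a midpoint):

```python
def percentile_rank(score, reference_scores):
    """Percentage of the reference group scoring strictly below `score`."""
    below = sum(1 for s in reference_scores if s < score)
    return 100 * below / len(reference_scores)

# Hypothetical reference group of 10 scores
reference = [55, 60, 62, 70, 71, 75, 78, 80, 85, 90]

rank = percentile_rank(80, reference)   # 7 of 10 scores fall below 80
```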
population
the entire group of individuals or entities that share specific characteristics
predictive validity
a measure of how well a test can predict future outcomes based on its scores
predictor
an independent variable in an experimental/statistical model that is used to approximate, estimate, or forecast future performance/outcome
progress monitoring assessment
systematic approach in education to evaluate student performance and measure academic growth over time, allowing educators to make data-driven decisions to enhance instruction
psychometrician
an expert in or practitioner of psychometrics (the science of psychological measurement)
range
the difference between the highest and lowest values in a dataset. simple measure of variability
raw score
the direct outcome of a test taker’s responses before any transformation. typically a simple sum or count of item scores. forms the numerical starting point for psychometric interpretation.
regression to the mean
statistical phenomenon where extreme measurements are likely to be followed by values closer to the average
regression equation
a mathematical model used to predict the outcome of a dependent variable based on one or more independent variables. typically a line of best fit.
reliability co-efficient
a numerical index, usually between 0 and 1, that summarizes how consistently a test measures a construct across items, occasions, forms, or raters. higher values = a larger proportion of observed score variation reflects stable differences between individuals rather than random measurement error.
nominal scale
a measurement scale used to categorize variables without implying any quantitative value or order (ex. eye colors)
ordinal scale
a type of measurement scale that categorizes and ranks data in a specific order, without indicating precise differences between the ranks (ex. very satisfied, satisfied, dissatisfied)
scaled score
a standardized score derived from a raw score, allowing for fair comparisons across different test forms and populations
screening
a brief, standardized process that identifies immediate and current needs, determines whether further evaluation is warranted, and is quick to administer and score
sensitivity
a test’s ability to correctly identify individuals who have a specific condition
specificity
a test’s ability to correctly identify individuals who do NOT have a particular condition or trait (identify true negatives, minimize false positives)
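Sensitivity and specificity both fall out of the four confusion-matrix counts; a sketch with hypothetical counts from an imagined screening study:

```python
# Hypothetical counts (illustrative only)
true_positive  = 45   # condition present, test positive
false_negative = 5    # condition present, test negative (missed detections)
true_negative  = 90   # condition absent, test negative
false_positive = 10   # condition absent, test positive (false alarms)

# Sensitivity: of everyone WITH the condition, how many did the test catch?
sensitivity = true_positive / (true_positive + false_negative)   # 45/50 = 0.9

# Specificity: of everyone WITHOUT it, how many did the test clear?
specificity = true_negative / (true_negative + false_positive)   # 90/100 = 0.9
```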
split-half reliability
a measure of internal consistency to assess the reliability of a test/survey…divide test into 2 halves and compare results to determine if items measure the same underlying construct.
standard deviation
quantifies the amount of variation/dispersion of a set of data values. low SD: data points tend to be close to the mean. high SD: data points are spread out over wider range of values.
standard error of measurement (SEM)
quantifies the amount of error in individual test scores, reflecting the precision of psychological ax’s and helping to interpret the reliability of test results
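One common formula is SEM = SD × √(1 − reliability); a sketch with hypothetical values for a scale with mean 100 and SD 15:

```python
# Hypothetical scale: SD 15, reliability coefficient 0.91
sd = 15
reliability = 0.91

sem = sd * (1 - reliability) ** 0.5   # 15 * sqrt(0.09) = 4.5

# 95% confidence band around a hypothetical observed score of 104
observed = 104
lower, upper = observed - 1.96 * sem, observed + 1.96 * sem
```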
standard scores
derived from an individual’s raw scores…describe the difference of the raw score from a sample mean, expressed in SDs. preserve the absolute differences between scores; used to illustrate individual strengths and weaknesses on a measure.
standardization
the process of developing and using uniform procedures for administering, scoring, and interpreting psychological tests, ensuring consistency and comparability of results across different individuals and contexts
standardized
the process of establishing consistent procedures and scoring methods for psychological tests. administered to a representative sample, allowing for objective comparison of individual performance across different contexts
stanine score
a standardized scoring system to convert raw test scores into a more interpretable number. range from 1 to 9, mean of 5, SD of 2.
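A rough sketch of the raw-to-stanine conversion via z-scores (hypothetical mean/SD; this linear mapping is an approximation, since published stanines are assigned from fixed percentile bands):

```python
def stanine(raw, mean, sd):
    """Approximate stanine: rescale the z-score to mean 5, SD 2, clamp to 1-9."""
    z = (raw - mean) / sd
    s = round(5 + 2 * z)          # nearest half-SD-wide band
    return max(1, min(9, s))      # stanines never leave the 1-9 range

# Hypothetical distribution: mean 100, SD 15
s_high = stanine(115, 100, 15)   # z = +1.0 -> stanine 7
s_low = stanine(70, 100, 15)     # z = -2.0 -> clamped to stanine 1
```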
subtest score
a score derived from a specific section of a standardized test designed to measure a particular skill, knowledge domain, or cognitive process. provide focused data re: an individual’s performance in a narrowly defined area
summative assessment
the evaluation of student learning at the conclusion of an instructional period
systematic bias
a systematic error that can occur at any stage of research process, affecting reliability and validity of the findings
test-retest reliability
measures the consistency of a measurement tool over time. ensures that a test produces stable and repeatable results under the same conditions
true negative
the outcome when a model correctly predicts the absence of a condition or class
true positive
a test correctly identifies a positive case (detects the condition when it is truly present)
true score
the actual, error-free measurement of a participant’s ability or trait, which cannot be directly observed due to various sources of measurement error
X = T + E (observed score = true score + error)
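A toy simulation of X = T + E (hypothetical true score and error SD): repeated administrations scatter around the unobservable true score, and because error averages toward zero, the mean observed score approaches T.

```python
import random

random.seed(0)
true_score = 80   # T: latent, error-free ability (never directly observed)

# 1000 simulated administrations; each adds random error E ~ N(0, 4)
observed = [true_score + random.gauss(0, 4) for _ in range(1000)]

mean_observed = sum(observed) / len(observed)   # converges toward T
```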
t-scores
used to evaluate an individual’s performance relative to a defined population average. transform raw scores into a common scale; mean = 50, SD = 10…allows for meaningful comparisons across different tests and populations
validity
the extent to which an ax accurately measures what it claims to measure
variance
avg of the squared deviations from the mean…captures how much scores fluctuate around a central value. provides basis for many familiar concepts such as SD, reliability coefficients, effect sizes, model fit measures
z-score
indicates how many SDs a raw score is from the mean of its distribution. positive z-score: score = score is above the mean. negative z-score = score is below the mean. used to compare scores across different tests and populations
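The standard-score conversions above chain together; a sketch with a hypothetical raw score on a mean-100, SD-15 scale:

```python
# Hypothetical raw score and distribution parameters
raw, mean, sd = 112, 100, 15

z = (raw - mean) / sd   # 12 / 15 = 0.8 -> 0.8 SDs above the mean
t = 50 + 10 * z         # same position on the T-score scale (mean 50, SD 10)
```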