CLI 2
achievement test
A standardized ax designed to evaluate a person’s proficiency in a specific area of knowledge/skill. Focuses on what has been learned.
age-based norms
compare an individual’s performance to peers within specific age range
age-equivalent scores
a measure of an individual’s development level on a specific skill/ability, expressed in terms of the chronological age at which an avg person demonstrates that same level of performance
alternative or alternate forms
different versions of the same test designed to measure the same construct…to minimize test-retest memory effects
aptitude test
standardized ax designed to measure an individual’s natural ability or potential to learn and perform in specific domains
basal (see also, double basal)
the point at which an examinee is assumed to have mastered all easier items in a test; starting point for scoring
bias
systematic errors in measurement that can affect validity of test results
ceiling effects
phenomenon where a test reaches its maximum possible score, resulting in a clustering of data at upper limit…no longer able to accurately differentiate between individuals who have reached the max score
chronological-age referencing
measuring an individual’s age in yy/mm/dd since birth
classical test theory
psychometric framework that models observed test scores as the sum of true score and random error, aiming to assess the reliability and validity of psychological and educational tests
cognitive (-age) referencing (see also, mental age referencing)
method of ax-ing an individual’s cognitive abilities and development compared to typical progression in their society
composite score
derived from the combo of 2+ individual scores, typically from multiple tests or variables, to create a single, reliable measure of a latent construct
concurrent validity
assesses how well a new test correlates with an est., valid measure of the same construct when both are administered at the same time
confidence interval
provides a range of values within which the true population parameter likely lies…to give an idea of the reliability of researchers’ estimates and to quantify the uncertainty associated with findings
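A minimal Python sketch of a 95% CI for a sample mean, using hypothetical scores and the normal critical value 1.96 (a t critical value would widen the interval slightly for a sample this small):

```python
import statistics as stats

# Hypothetical sample of test scores (illustrative only)
scores = [88, 92, 85, 91, 87, 90, 86, 93, 89, 84]

n = len(scores)
mean = stats.mean(scores)
se = stats.stdev(scores) / n ** 0.5   # standard error of the mean

# 95% CI: mean +/- 1.96 standard errors
lower, upper = mean - 1.96 * se, mean + 1.96 * se
```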
construct validity
the extent to which a test accurately assesses the theoretical construct it is intended to measure
content validity
the extent to which a test accurately reflects the concept (covers the entire domain) it aims to measure
convergent validity (see also, divergent validity)
the extent to which test scores correlate with scores on other tests that measure the same or similar constructs
correlation
statistical relationship between 2+ variables
criterion-referenced assessment
measures a student’s performance against a set of predetermined criteria or standards, rather than against other students’ performance
criterion-related validity
The degree to which scores from a construct ax correlate with a manifestation of that construct in the real world. Evaluates how accurately a test measures the outcome it was designed to measure, by comparing results against an established, trusted standard (criterion)
cut-score
the minimum score on a test or ax that a student must achieve to be considered proficient or to pass
diagnosis
the identification of the nature of an illness or other problem by examination of the symptoms
differential diagnosis
the process of identifying the specific condition or disorder a pt is experiencing by systematically considering and ruling out all possible causes or explanations for the observed symptoms
discrepancy formula
used to analyze differences between a person’s cognitive abilities and their academic performance
discriminant validity
the extent to which a measure does NOT correlate strongly with measures of different, unrelated constructs, ensuring that a test measures what it is intended to
distractors
an incorrect option in a MC test item that receives no credit; designed to be plausible yet wrong
distribution
the way values of a variable are spread out over a range of possible outcomes
divergent validity
The extent to which a measure does NOT correlate strongly with measures of different, unrelated constructs. Indicates that the results obtained by a measurement do NOT correlate too strongly with measurements of a similar but distinct trait.
domain-specific measure
a measurement tailored to a particular industry, field, or application, or domain of language (e.g., semantics, morphology)
double basal
an additional string of consecutive correct items that meets a second basal; all items below it are then counted as correct, even if the examinee responded incorrectly to some of them
dynamic assessment
evaluates cognitive abilities by focusing on learning potential rather than static performance. test-teach-retest format (based on Vygotsky’s ZPD)
ecological validity
the extent to which research findings can be generalized to real-world settings
face validity
the extent to which a test appears to measure what it is intended to measure, based on superficial inspection and subjective judgment
false negative
a situation where a test or dx procedure incorrectly reports the absence of a condition when the condition is, in fact, present…a failure of DETECTION (reflects poor sensitivity)
false positive
test result incorrectly indicates the presence of a condition, such as a disease, when it is not present (reflects poor specificity)
floor effects
a measurement instrument fails to differentiate between individuals or groups at lower end of measurement scale, e.g. minimum possible score is set too high
formative assessment
range of in/formal ax procedures conducted by teachers during learning process to monitor student learning and provide ongoing feedback…to ID strengths and weaknesses, adjust teaching strategies, etc.
frequency distribution
method for organizing data and determining how often each value occurs within a dataset…to ID patterns, trends, and comparisons within data
grade-based norms
standardized measures that indicate typical/avg performance of students at a particular grade level
inter-examiner reliability
the degree of agreement among different examiners when evaluating same conditions or outcomes
intra-examiner reliability
consistency of evaluations conducted by SAME examiner over multiple instances
IQ test
standardized ax designed to measure a range of cognitive abilities and provide a score that reflects an individual’s intellectual capabilities and potential. Ax logic, reasoning, problem-solving, etc.
item analysis
evaluate the effectiveness and quality of individual test items (questions) within an ax
item response theory
psychometric framework for the design, analysis, and scoring of tests and questionnaires…focuses on relationship between latent traits (unobservable characteristics) and their manifestations in observed responses or performance…to improve accuracy of ax’s by modeling the probability of a specific response based on individual’s latent traits
likelihood ratio
statistical measure used to evaluate the effectiveness of a dx test…compares the likelihood of a given test result in pts with a specific condition, to the likelihood of that same result in pts without the condition. Higher LR = greater probability of disease being present
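Using hypothetical sensitivity and specificity values, the positive and negative likelihood ratios can be sketched as:

```python
# Hypothetical test characteristics (illustrative values)
sensitivity = 0.90   # P(positive result | condition present)
specificity = 0.80   # P(negative result | condition absent)

# LR+ : how much a positive result raises the odds of the condition
lr_positive = sensitivity / (1 - specificity)   # 0.90 / 0.20 = 4.5

# LR- : how much a negative result lowers those odds
lr_negative = (1 - sensitivity) / specificity   # 0.10 / 0.80 = 0.125
```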
Likert scale
survey tool used to measure attitudes, opinions, behaviors by indicating level of dis/agreement with a statement, often on a 5- or 7-point scale
mean
the number you get by dividing the sum of a set of values by the number of values in the set
median
the value that is sequentially in the middle in a set of numbers
mental age referencing
measures an individual’s level of mental development relative to others
mode
which value appears the most in a set of numbers
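The three measures of central tendency above, computed on a hypothetical score set with Python’s standard library:

```python
import statistics as stats

# Hypothetical quiz scores (illustrative only)
scores = [3, 7, 7, 2, 9, 7, 4, 5, 6]

mean = stats.mean(scores)      # sum of values / number of values -> 50 / 9
median = stats.median(scores)  # middle value of the sorted list -> 6
mode = stats.mode(scores)      # most frequent value -> 7
```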
normal distribution
symmetric around the mean. mean, median, and mode are all equal and located at the center of the distribution. bell-shape
normal curve
a symmetrical probability distribution in statistics. half the data falls to the left of the mean, half to the right: evenly distributed
normative sample
carefully selected group of individuals whose test performances est. the benchmark against which all other test-takers’ performances are compared
norm-referenced assessment/test
compares a student’s performance to that of their peers, ranking on a bell curve to determine relative standing
norms
the score distribution in a representative sample, providing the standard frame to compare individual scores
omnibus measure
a statistical test that ax’s the overall significance of multiple conditions or variables. used to determine if there are any significant differences among groups or variables without specifying which specific differences exist.
operational definition
specifies how a concept or variable is measured or observed in research…translates abstract constructs like “intelligence” into observable and quantifiable variables
percentile (percentile score, percentile rank)
score: a statistical measure that indicates the position of a score within a distribution. the value below which a certain percentage of observations fall
rank: the percentage of individuals in a reference group who scored lower than a particular individual on a test. ex. a score in 90th percentile = 90% of scores are below that
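A small sketch of the percentile-rank idea (hypothetical reference group; this version counts only scores strictly below, though some conventions count ties or use a midpoint):

```python
def percentile_rank(score, reference_scores):
    """Percentage of the reference group scoring strictly below `score`."""
    below = sum(1 for s in reference_scores if s < score)
    return 100 * below / len(reference_scores)

# Hypothetical reference group of 10 scores
reference = [55, 60, 62, 70, 71, 75, 78, 80, 85, 90]

rank = percentile_rank(80, reference)   # 7 of 10 scores fall below 80
```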
population
the entire group of individuals or entities that share specific characteristics
predictive validity
a measure of how well a test can predict future outcomes based on its scores
predictor
an independent variable in an experimental/statistical model that is used to approximate, estimate, or forecast future performance/outcome
progress monitoring assessment
systematic approach in education to evaluate student performance and measure academic growth over time, allowing educators to make data-driven decisions to enhance instruction
psychometrician
an expert in or practitioner of psychometrics (the science of psychological measurement)
range
the difference between the highest and lowest values in a dataset. simple measure of variability
raw score
the direct outcome of a test taker’s responses before any transformation. typically a simple sum or count of item scores. forms the numerical starting point for psychometric interpretation.
regression to the mean
statistical phenomenon where extreme measurements are likely to be followed by values closer to the average
regression equation
a mathematical model used to predict the outcome of a dependent variable based on one or more independent variables. typically a line of best fit.
reliability co-efficient
a numerical index, usually between 0 and 1, that summarizes how consistently a test measures a construct across items, occasions, forms, or raters. higher values = a larger proportion of observed score variation reflects stable differences between individuals rather than random measurement error.
nominal scale
a measurement scale used to categorize variables without implying any quantitative value or order (ex. eye colors)
ordinal scale
a type of measurement scale that categorizes and ranks data in a specific order, without indicating precise differences between the ranks (ex. very satisfied, satisfied, dissatisfied)
scaled score
a standardized score derived from a raw score, allowing for fair comparisons across different test forms and populations
screening
a brief, standardized process that identifies immediate and current needs, determines whether further evaluation is warranted, and is quick to administer and score
sensitivity
a test’s ability to correctly identify individuals who have a specific condition
specificity
a test’s ability to correctly identify individuals who do NOT have a particular condition or trait (identify true negatives, minimize false positives)
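Sensitivity and specificity both fall out of the four confusion-matrix counts; a sketch with hypothetical counts from an imagined screening study:

```python
# Hypothetical counts (illustrative only)
true_positive  = 45   # condition present, test positive
false_negative = 5    # condition present, test negative (missed detections)
true_negative  = 90   # condition absent, test negative
false_positive = 10   # condition absent, test positive (false alarms)

# Sensitivity: of everyone WITH the condition, how many did the test catch?
sensitivity = true_positive / (true_positive + false_negative)   # 45/50 = 0.9

# Specificity: of everyone WITHOUT it, how many did the test clear?
specificity = true_negative / (true_negative + false_positive)   # 90/100 = 0.9
```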
split-half reliability
a measure of internal consistency to assess the reliability of a test/survey…divide test into 2 halves and compare results to determine if items measure the same underlying construct.
standard deviation
quantifies the amount of variation/dispersion of a set of data values. low SD: data points tend to be close to the mean. high SD: data points are spread out over wider range of values.
standard error of measurement (SEM)
quantifies the amount of error in individual test scores, reflecting the precision of psychological ax’s and helping to interpret the reliability of test results
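One common formula is SEM = SD × √(1 − reliability); a sketch with hypothetical values for a scale with mean 100 and SD 15:

```python
# Hypothetical scale: SD 15, reliability coefficient 0.91
sd = 15
reliability = 0.91

sem = sd * (1 - reliability) ** 0.5   # 15 * sqrt(0.09) = 4.5

# 95% confidence band around a hypothetical observed score of 104
observed = 104
lower, upper = observed - 1.96 * sem, observed + 1.96 * sem
```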
standard scores
derived from an individual’s raw scores…describe the difference of the raw score from a sample mean, expressed in SDs. preserve the absolute differences between scores; used to illustrate individual strengths and weaknesses on a measure.
standardization
the process of developing and using uniform procedures for administering, scoring, and interpreting psychological tests, ensuring consistency and comparability of results across different individuals and contexts
standardized
the process of establishing consistent procedures and scoring methods for psychological tests. administered to a representative sample, allowing for objective comparison of individual performance across different contexts
stanine score
a standardized scoring system to convert raw test scores into a more interpretable number. range from 1 to 9, mean of 5, SD of 2.
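A rough sketch of the raw-to-stanine conversion via z-scores (hypothetical mean/SD; this linear mapping is an approximation, since published stanines are assigned from fixed percentile bands):

```python
def stanine(raw, mean, sd):
    """Approximate stanine: rescale the z-score to mean 5, SD 2, clamp to 1-9."""
    z = (raw - mean) / sd
    s = round(5 + 2 * z)          # nearest half-SD-wide band
    return max(1, min(9, s))      # stanines never leave the 1-9 range

# Hypothetical distribution: mean 100, SD 15
s_high = stanine(115, 100, 15)   # z = +1.0 -> stanine 7
s_low = stanine(70, 100, 15)     # z = -2.0 -> clamped to stanine 1
```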
subtest score
a score derived from a specific section of a standardized test designed to measure a particular skill, knowledge domain, or cognitive process. provide focused data re: an individual’s performance in a narrowly defined area
summative assessment
the evaluation of student learning at the conclusion of an instructional period
systematic bias
a systematic error that can occur at any stage of research process, affecting reliability and validity of the findings
test-retest reliability
measures the consistency of a measurement tool over time. ensures that a test produces stable and repeatable results under the same conditions
true negative
the outcome when a model correctly predicts the absence of a condition or class
true positive
a test correctly identifies a positive case (detects the condition when it is truly present)
true score
the actual, error-free measurement of a participant’s ability or trait, which cannot be directly observed due to various sources of measurement error
X = T + E (observed score = true score + error)
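A toy simulation of X = T + E (hypothetical true score and error SD): repeated administrations scatter around the unobservable true score, and because error averages toward zero, the mean observed score approaches T.

```python
import random

random.seed(0)
true_score = 80   # T: latent, error-free ability (never directly observed)

# 1000 simulated administrations; each adds random error E ~ N(0, 4)
observed = [true_score + random.gauss(0, 4) for _ in range(1000)]

mean_observed = sum(observed) / len(observed)   # converges toward T
```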
t-scores
used to evaluate an individual’s performance relative to a defined population average. transform raw scores into a common scale; mean = 50, SD = 10…allows for meaningful comparisons across different tests and populations
validity
the extent to which an ax accurately measures what it claims to measure
variance
avg of the squared deviations from the mean…captures how much scores fluctuate around a central value. provides basis for many familiar concepts such as SD, reliability coefficients, effect sizes, model fit measures
z-score
indicates how many SDs a raw score is from the mean of its distribution. positive z-score: score = score is above the mean. negative z-score = score is below the mean. used to compare scores across different tests and populations
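The standard-score conversions above chain together; a sketch with a hypothetical raw score on a mean-100, SD-15 scale:

```python
# Hypothetical raw score and distribution parameters
raw, mean, sd = 112, 100, 15

z = (raw - mean) / sd   # 12 / 15 = 0.8 -> 0.8 SDs above the mean
t = 50 + 10 * z         # same position on the T-score scale (mean 50, SD 10)
```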