Definitions, facts, general knowledge
Rater drift
The phenomenon in which a rater's assessments become more lenient or more strict over time, reducing the consistency of ratings across evaluation sessions.
Base rate
The prevalence of a trait or characteristic within a specific population, often used in the context of diagnosis or testing; that is, the estimated proportion of the population that has the trait.
Coefficient omega
A reliability coefficient that estimates the internal consistency of a test when the statistical assumptions for Cronbach’s alpha are not met. It conceptualizes internal consistency as a ratio of “signal” to “signal plus noise”, and is estimated from the loadings of a factor-analytic model.
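The signal-to-signal-plus-noise ratio can be sketched as follows. This is a minimal illustration that assumes a single-factor model with standardized loadings, so each item's unique (noise) variance is taken as 1 minus its squared loading. The function name and the example loadings are hypothetical.

```python
def coefficient_omega(loadings):
    """Omega for a one-factor model with standardized loadings (assumed)."""
    signal = sum(loadings) ** 2                     # variance due to the common factor
    noise = sum(1 - l ** 2 for l in loadings)       # summed unique (error) variances
    return signal / (signal + noise)

example_loadings = [0.7, 0.6, 0.8, 0.5]  # hypothetical values
omega = coefficient_omega(example_loadings)
```

Higher loadings raise the signal term and shrink the noise term, so omega moves toward 1.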
Speed test
A test in which all items are of the same difficulty and the number of items is deliberately large to prevent all of them from being completed in a fixed allotted time frame.
Classical Test Theory
Focuses on the reliability of a measure. An obtained score is defined as one’s “true score” plus or minus some amount of error.
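The true-score-plus-error idea is usually written as below. The symbols are assumptions, since the card gives none: X is the obtained score, T the true score, and E the error, with T and E assumed uncorrelated.

```latex
X = T + E, \qquad
\text{reliability} = \frac{\sigma_T^2}{\sigma_X^2} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}
```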
Validity index
A theoretical Pearson r correlation between the construct a test is intended to measure and what the test actually measures.
It is typically reported squared (r²), as this describes the proportion of variance in obtained scores attributable to variance in the construct itself.
Standardization
Measures taken to ensure comparability of scores, maximize reliability, and minimize error.
Responsiveness
The ability of a measure to detect change over time or in different conditions, reflecting its sensitivity to variations in the construct.
Known-groups validity
The extent to which a test can differentiate between groups that are expected to differ on the construct being measured, indicating that the test is valid for that purpose.
Intraclass correlation coefficient
Proportion of variance in obtained ratings (from different raters) attributable to “true score” variance.
With respect to this coefficient, an individual’s “true score” is considered to be the mean of all ratings between different raters.
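There are several forms of this coefficient. The sketch below assumes the one-way random-effects, single-rater form computed from ANOVA mean squares. The function name and the example ratings are hypothetical.

```python
def icc_one_way(ratings):
    """One-way random-effects ICC; rows are rated targets, columns are raters."""
    n = len(ratings)        # number of targets
    k = len(ratings[0])     # raters per target
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]  # each target's mean rating
    # Between-target and within-target mean squares from a one-way ANOVA
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

example_ratings = [[4, 5], [2, 3], [5, 5], [1, 2]]  # hypothetical: 4 targets, 2 raters
icc = icc_one_way(example_ratings)
```

When every rater gives a target the same rating, the within-target mean square is zero and the coefficient equals 1.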
Orthogonal
Referring to variables that are statistically independent of each other, allowing for separate analysis without confounding effects.
θ = 0
Mean ability level in an item characteristic curve.
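As an illustration, the sketch below assumes a two-parameter logistic item characteristic curve with ability scaled so that the mean is 0. The parameter names a (discrimination) and b (difficulty) follow common usage but do not come from the card.

```python
import math

def item_curve(theta, a=1.0, b=0.0):
    """Probability of a correct response at ability theta (2PL model, assumed)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# A person of mean ability answering an item of average difficulty (b = 0)
p_at_mean = item_curve(0.0)
```

Under this model, a person whose ability equals the item's difficulty has a 0.5 probability of answering correctly.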
RMSEA
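Root Mean Square Error of Approximation, a measure of how closely a proposed model approximates a theoretically “perfect” model.
It typically ranges from 0.00 to 1.00, where lower is better. Values should be below 0.10, and values below about 0.06 to 0.08 are conventionally treated as acceptable fit.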
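A minimal sketch of one common way to compute it from a model's chi-square statistic, degrees of freedom, and sample size. Some software divides by N rather than N - 1, so treat the exact form as an assumption. The input values are hypothetical.

```python
import math

def rmsea(chi_square, df, n):
    """RMSEA from model chi-square, degrees of freedom, and sample size."""
    # Misfit beyond what is expected by chance, floored at zero
    excess = max(chi_square - df, 0.0)
    return math.sqrt(excess / (df * (n - 1)))

fit = rmsea(chi_square=85.0, df=40, n=300)  # hypothetical model results
```

A chi-square at or below its degrees of freedom yields 0, the best possible value.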
Eigenvalue
A measure of the amount of variance, across all of the variables, that is accounted for by a given factor. It is computed as the (vertical) sum of squared factor loadings, across all variables, for a given factor. It is represented by λ.
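A small sketch of the computation, using the usual convention of summing squared loadings down one factor's column. This holds for an unrotated solution. The loading matrix is hypothetical.

```python
# Rows are variables, columns are factors (hypothetical loadings).
loadings = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
]

def eigenvalue(loadings, factor):
    """Sum of squared loadings down one factor's column."""
    return sum(row[factor] ** 2 for row in loadings)

lambda_1 = eigenvalue(loadings, 0)
lambda_2 = eigenvalue(loadings, 1)
```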
Sensitivity
The ability of a test to correctly identify true positives, or the proportion of actual positives that are correctly identified.
Specificity
The ability of a test to correctly identify true negatives, or the proportion of actual negatives that are correctly identified.
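Both proportions come straight from the counts in a 2 x 2 classification table. The function names and the counts below are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of actual positives the test correctly identifies."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Proportion of actual negatives the test correctly identifies."""
    return true_neg / (true_neg + false_pos)

# Hypothetical screening results: 50 actual positives, 100 actual negatives
sens = sensitivity(true_pos=40, false_neg=10)
spec = specificity(true_neg=90, false_pos=10)
```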
Guttman item
A type of test item that is structured so that endorsing one item implies endorsement of all preceding items, reflecting a cumulative scale.
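The cumulative property can be checked mechanically. This sketch assumes responses are coded 1 (endorsed) or 0 (not endorsed) and ordered from the easiest item to endorse to the hardest. The function name is hypothetical.

```python
def is_guttman_consistent(responses):
    """True if no item is endorsed after an earlier item was not endorsed."""
    seen_zero = False
    for r in responses:
        if r == 0:
            seen_zero = True
        elif seen_zero:
            # An endorsement follows a non-endorsement: the pattern is not cumulative
            return False
    return True

consistent = is_guttman_consistent([1, 1, 1, 0, 0])
inconsistent = is_guttman_consistent([1, 0, 1, 0])
```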
Metric invariance
Also called “weak” invariance, suggests:
number of factors is the same across groups
simple structure standard is the same across groups
factor loadings are the same across groups
correlational tests can now be run on data between groups
Confirms that groups have been measured the same way.
Scaling
The process of transforming raw scores into a standardized form, allowing for comparison across different tests or measurements. Most often, this involves operationalizing a variable to a numerical scale.
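Two common standardized forms are z-scores (mean 0, SD 1) and T-scores (mean 50, SD 10). The sketch below assumes the population standard deviation is used. The function names and raw scores are hypothetical.

```python
import statistics

def to_z_scores(raw):
    """Rescale raw scores to mean 0 and standard deviation 1."""
    mean = statistics.mean(raw)
    sd = statistics.pstdev(raw)  # population SD (an assumption of this sketch)
    return [(x - mean) / sd for x in raw]

def to_t_scores(raw):
    """Rescale raw scores to mean 50 and standard deviation 10."""
    return [50 + 10 * z for z in to_z_scores(raw)]

raw_scores = [10, 20, 30]  # hypothetical raw scores
z_scores = to_z_scores(raw_scores)
t_scores = to_t_scores(raw_scores)
```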
Factor score
A quantitative score that represents an individual's position on a latent variable as derived from a factor analysis.
Computed as the product of a factor loading and an obtained score, summed across the variables that define the factor; each factor loading acts as a standardized regression coefficient (beta weight).
Configural invariance
The weakest form of invariance, suggests:
number of factors is the same across groups
simple structure standard is the same across groups
Confirms similarities in factor structure, but not enough to permit statistical comparisons between groups.
Criterion-referenced test
A test used to measure an individual's performance against a predefined standard or criterion, rather than comparing it to the performance of others.
Behavioral equivalents
Operationally defined behaviors that represent a specific latent construct or trait.
Communality
The proportion of variance in a given variable that is accounted for by all factors together, that is, the variance it shares with the other variables in the overall factor solution.
Computed as the (horizontal) sum of squared factor loadings for any given variable.
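The horizontal sum can be sketched on a small loading matrix. The values and the function name are hypothetical.

```python
# Rows are variables, columns are factors (hypothetical loadings).
loadings = [
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.2, 0.6],
]

def communality(loadings, variable):
    """Sum of squared loadings across one variable's row."""
    return sum(l ** 2 for l in loadings[variable])

h2_first = communality(loadings, 0)
h2_third = communality(loadings, 2)
```

With standardized variables, 1 minus the communality is the variable's unique variance.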
rxy
The correlation coefficient between exactly one x (predictor) and one y (outcome) variable.
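A minimal sketch of the Pearson correlation for one x and one y variable, written out from sums of deviations. The function name and data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between paired lists x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

r_xy = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])  # hypothetical paired scores
```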
Sliding scale format
A method of measurement where responses are obtained along a continuum, allowing for graded or varying degrees of answers instead of binary, ordinal, or otherwise discrete interval choices.
Reliability
The consistency of a measure, indicating the extent of reproducibility and the extent to which changes in the construct are represented by the measure.
Simple structure standard
A criterion in factor analysis under which each variable loads strongly onto only one factor, promoting clarity and interpretability of the factors.
Differential item functioning
Occurs when individuals from different groups who are at the same ability level respond differently to an item, resulting in different item characteristic curves for each group. Indicates possible item bias.
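To illustrate, the sketch below assumes a two-parameter logistic item characteristic curve and gives the same item a different difficulty in each group. The group labels and parameter values are hypothetical.

```python
import math

def item_curve(theta, a, b):
    """Probability of a correct response at ability theta (2PL model, assumed)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

theta = 0.0  # both test-takers have the same ability level

# The same item behaves as if it were harder for the focal group
p_reference = item_curve(theta, a=1.0, b=0.0)
p_focal = item_curve(theta, a=1.0, b=0.5)
dif_gap = p_reference - p_focal
```

A nonzero gap at equal ability is what distinguishes this from an ordinary difference in group means.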
Unidimensional test
A test that measures a single construct or trait, ensuring that all items are focused on one underlying dimension.
If developed through factor analysis, a test in which all items load onto the same singular factor.