Psychological Trait
intelligence, specific intellectual abilities, cognitive style, adjustment, interests, attitudes, sexual orientation and preferences, psychopathology, etc.
Trait
any distinguishable, relatively enduring way in which one individual varies from another. Permit people to predict the present from the past. Characteristic patterns of thinking, feeling, and behaving that generalize across similar situations, differ systematically between individuals, and remain rather stable across time.
States
distinguish one person from another but are relatively less enduring. Characteristic pattern of thinking, feeling, and behaving in a concrete situation at a specific moment in time. Identify those behaviors that can be controlled by manipulating the situation.
Construct
an informed, scientific concept developed or constructed to explain a behavior, inferred from overt behavior.
Overt Behavior
an observable action or the product of an observable action.
Assumption 1
Psychological Traits and States Exist
Assumption 2
Psychological Traits and States can be Quantified and Measured
Once the trait, state, or other construct to be measured has been defined, a test developer considers the types of item content that would provide insight into it and gauge the strength of that trait.
Measuring traits and states by means of a test entails developing not only appropriate test items but also appropriate ways to score the test and interpret the results.
Cumulative Scoring
assumption that the more the testtaker responds in a particular direction keyed by the test manual as correct or consistent with a particular trait, the higher that testtaker is presumed to be on the targeted ability or trait.
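A minimal sketch of cumulative scoring, assuming a hypothetical five-item true/false scale; the scoring key and the responses are made up for illustration. Each response that matches the keyed direction adds one point.

```python
def cumulative_score(responses, key):
    """Count the responses that match the keyed direction for each item."""
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same length")
    return sum(1 for given, keyed in zip(responses, key) if given == keyed)


key = ["T", "F", "T", "T", "F"]        # hypothetical scoring key
responses = ["T", "F", "F", "T", "F"]  # one testtaker's answers (made up)
score = cumulative_score(responses, key)
print(score)
```

A higher count is read as a higher standing on the targeted trait or ability.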
Assumption 3
Test-Related Behavior Predicts Non-Test-Related Behavior
The tasks in some tests mimic the actual behaviors that the test user is attempting to understand.
Such tests only yield a sample of the behavior that can be expected to be emitted under nontest conditions.
Assumption 4
Tests and Other Measurement Techniques Have Strengths and Weaknesses
Competent test users understand and appreciate the limitations of the test they use as well as how those limitations might be compensated for by data from other sources.
Assumption 5
Various Sources of Error are part of the Assessment Process
Error refers to something that is more than expected; it is a component of the measurement process.
Error Variance
the component of a test score attributable to sources other than the trait or ability measured.
Classical Test Theory
each testtaker has a true score on a test that would be obtained but for the action of measurement error.
Assumption 6
Testing and Assessment can be conducted in a Fair and Unbiased Manner
Despite the best efforts of many professionals, fairness-related questions and problems do occasionally arise.
In all questions about tests with regard to fairness, it is important to keep in mind that tests are tools; they can be used properly or improperly.
Assumption 7
Testing and Assessment Benefit Society
Considering the many critical decisions that are based on testing and assessment procedures, we can readily appreciate the need for tests.
Reliability
dependability or consistency of the instrument or scores obtained by the same person when re-examined with the same test on different occasions, or with different sets of equivalent items.
Greater number of items
higher reliability
Reliability Coefficient
index of reliability, a proportion that indicates the ratio between the true score variance on a test and the total variance.
Classical Test Theory (True Score Theory)
score on an ability test is presumed to reflect not only the testtaker's true score on the ability being measured but also the error.
Error
refers to the component of the observed test score that does not have to do with the testtaker's ability.
Goals of Reliability
Estimate errors, and devise techniques to improve testing and reduce errors.
Variance
useful in describing sources of test score variability.
True Variance
variance from true differences
Error Variance
variance from irrelevant random sources.
Reliability
refers to the proportion of total variance attributed to true variance. The greater the proportion of the total variance attributed to true variance, the more reliable the test.
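The proportion described above can be sketched as a one-line computation. This assumes the classical decomposition in which total variance is the sum of true variance and error variance; the numbers are invented for illustration.

```python
def reliability(true_variance, error_variance):
    """Proportion of total variance that is true variance."""
    total_variance = true_variance + error_variance
    if total_variance <= 0:
        raise ValueError("total variance must be positive")
    return true_variance / total_variance


# Hypothetical test: 80 units of true variance, 20 units of error variance.
r = reliability(80.0, 20.0)
print(round(r, 2))
```

With less error variance and the same true variance, the ratio moves closer to 1.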
Measurement Error
all of the factors associated with the process of measuring some variable, other than the variable being measured. The difference between the observed score and the true score.
Positive Error
can increase one's score
Negative Error
decrease one's score
Random Error
source of error in measuring a targeted variable caused by unpredictable fluctuations and inconsistencies of other variables in the measurement process (e.g., noise, temperature, weather).
Systematic Error
source of error in measuring a variable that is typically constant or proportionate to what is presumed to be the true value of the variable being measured. Has a consistent effect on observed scores. When the error is constant, the standard deviation does not change, but the mean does.
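A small illustration of a constant systematic error, using made-up true scores and an assumed error of +3 points for every testtaker. The mean shifts by the size of the error while the spread stays the same.

```python
from statistics import mean, pstdev

true_scores = [10, 12, 15, 18, 20]  # hypothetical true scores
constant_error = 3                  # same error added to everyone (assumed)
observed = [t + constant_error for t in true_scores]

mean_shift = mean(observed) - mean(true_scores)
sd_change = pstdev(observed) - pstdev(true_scores)
print(mean_shift, sd_change)
```

A proportionate error (for example, every score multiplied by a constant) would change the standard deviation as well; that case is not shown here.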
Item Sampling/Content Sampling
refer to variation among items within a test as well as to variation among items between tests. The extent to which a testtaker's score is affected by the content sampled on a test and by the way the content is sampled is a source of error variance.
Test Administration
testtaker's motivation or attention, environment, etc.
Test Scoring and Interpretation
may employ objective-type items amenable to computer scoring of well-documented reliability.
Test-Retest Reliability
An estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the test.
Appropriate when evaluating the reliability of a test that purports to measure an enduring and stable attribute such as a personality trait.
Established by comparing the scores obtained from two successive measurements of the same individuals and calculating a correlation between the two sets of scores.
The longer the time that passes between administrations, the greater the likelihood that the reliability coefficient will be lower.
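As a rough sketch, the test-retest coefficient is a Pearson correlation between the two administrations. The scores below are invented, and `pearson_r` is a helper written for this example rather than a library function.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


time1 = [12, 15, 9, 20, 17]   # first administration (hypothetical)
time2 = [13, 14, 10, 19, 18]  # same people, second administration
r_test_retest = pearson_r(time1, time2)
print(round(r_test_retest, 3))
```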
Carryover Effects
occur when the test-retest interval is short and the second administration is influenced by the first because testtakers remember or have practiced the earlier test = inflated correlation / overestimation of reliability.
Practice Effect
scores on the second session are higher due to their experience of the first session of testing.
Test Sophistication
items are remembered by the testtakers, especially the difficult ones or those that caused the most confusion.
Test Wiseness
might inflate the apparent abilities of testtakers.
Mortality
problems caused by testtakers being absent from the second session (just remove the first-session tests of those who were absent).
Parallel Forms
for each form of the test, the means and the error variances are EQUAL; same items, different positionings/numberings. True scores must be the same on the two forms.
Alternate Forms
simply different versions of a test that have been constructed so as to be parallel. Test should contain the same number of items and the items should be expressed in the same form and should cover the same type of content; range and difficulty must also be equal.
Counterbalancing
technique to avoid carryover effects for parallel forms, by using different sequences for groups (e.g., G1 - listen to song before counseling, G2 - counseling first, before listening to the song). Can be administered on the same day or different times.
Internal Consistency (Inter-Item Reliability)
Used when tests are administered once. Consistency among items within the test. Measures the internal consistency of the test which is the degree to which each item measures the same construct. Useful measurement for unstable traits.
Homogeneity
a test is homogeneous if it contains items that measure a single trait (unifactorial). More homogeneous = higher inter-item consistency.
Heterogeneity
degree to which a test measures different factors (more than one factor/trait).
Split-Half Reliability
Obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered ONCE. Useful when it is impractical or undesirable to assess reliability with two tests or to administer a test twice.
Cannot just divide the items in the middle because it might spuriously raise or lower the reliability coefficient, so just randomly assign items or assign odd-numbered items to one half and even-numbered items to the other half.
Spearman-Brown Formula
allows a test developer or user to estimate internal consistency reliability from the correlation between two halves of a test, as if each half had been the length of the whole test. Assumes the halves have equal variances.
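A hedged sketch of an odd-even split followed by the Spearman-Brown correction for two halves, r_full = 2r / (1 + r). The 0/1 item matrix is invented, and `pearson_r` and `spearman_brown` are helper names chosen for this example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)


# Rows are testtakers, columns are items 1..6 scored 0/1 (hypothetical data).
scores = [
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
odd_half = [sum(row[0::2]) for row in scores]   # items 1, 3, 5
even_half = [sum(row[1::2]) for row in scores]  # items 2, 4, 6

r_half = pearson_r(odd_half, even_half)
r_full = spearman_brown(r_half)
print(round(r_half, 3), round(r_full, 3))
```

The corrected value is higher than the half-test correlation because each half is only half as long as the full test.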
Spearman-Brown Prophecy Formula
estimates how many more items are needed in order to achieve the target reliability. Multiply the resulting factor by the original number of items to get the required test length. If the reliability of the original test is relatively low, then the developer could create new items, clarify test instructions, or simplify the scoring rules. Equal variances, dichotomously scored.
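A sketch of the prophecy calculation, using the general Spearman-Brown relation r_n = n·r / (1 + (n − 1)·r) solved for the lengthening factor n. The current reliability, target reliability, and item count are assumed values for illustration.

```python
from math import ceil


def lengthening_factor(r_current, r_target):
    """Factor by which the test must be lengthened to reach r_target."""
    return (r_target * (1 - r_current)) / (r_current * (1 - r_target))


def prophesied_reliability(r_current, n):
    """Reliability expected when the test is made n times as long."""
    return n * r_current / (1 + (n - 1) * r_current)


original_items = 20  # hypothetical test length
factor = lengthening_factor(r_current=0.60, r_target=0.80)
items_needed = ceil(factor * original_items)
print(round(factor, 2), items_needed)
```

The added items are assumed to be equivalent in content and difficulty to the existing ones; otherwise the estimate may not hold.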
Rulon's Formula
counterpart of the Spearman-Brown formula, based on the ratio of the variance of the difference between the odd and even splits to the variance of the total, combined odd-even, score; reliability is 1 minus this ratio.
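A minimal sketch of Rulon's approach, computed as 1 minus the ratio of the difference-score variance to the total-score variance. The odd-half and even-half scores are invented for illustration.

```python
from statistics import pvariance


def rulon(odd_half, even_half):
    """Rulon split-half reliability from two lists of half-test scores."""
    differences = [o - e for o, e in zip(odd_half, even_half)]
    totals = [o + e for o, e in zip(odd_half, even_half)]
    return 1 - pvariance(differences) / pvariance(totals)


odd_half = [3, 2, 1, 3, 1]   # hypothetical odd-item scores
even_half = [2, 2, 0, 3, 1]  # hypothetical even-item scores
r_rulon = rulon(odd_half, even_half)
print(round(r_rulon, 3))
```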
KR-20 (Kuder-Richardson Formula 20)
used for inter-item consistency of dichotomous items (intelligence tests, personality tests with yes or no options, multiple choice), unequal variances, dichotomously scored.
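A sketch of KR-20 for 0/1 items: (k / (k − 1)) · (1 − Σpq / σ²), where p is the proportion passing each item, q = 1 − p, and σ² is the variance of total scores. The response matrix is made up, and population variance is used throughout so that it matches the p·q item variances.

```python
from statistics import pvariance


def kr20(item_matrix):
    """KR-20 for dichotomous items; rows are testtakers, columns are items."""
    n_people = len(item_matrix)
    n_items = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_matrix) / n_people  # proportion scoring 1
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / pvariance(totals))


scores = [  # hypothetical 0/1 responses
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
kr20_value = kr20(scores)
print(round(kr20_value, 3))
```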
KR-21
if all the items have the same degree of difficulty (speed tests), equal variances, dichotomously scored.
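A sketch of KR-21, which needs only the number of items, the mean, and the variance of total scores: (k / (k − 1)) · (1 − M(k − M) / (k·σ²)). The total scores below are hypothetical.

```python
from statistics import mean, pvariance


def kr21(totals, n_items):
    """KR-21 from total scores, assuming items of equal difficulty."""
    m = mean(totals)
    variance = pvariance(totals)
    return (n_items / (n_items - 1)) * (1 - m * (n_items - m) / (n_items * variance))


totals = [5, 4, 1, 6, 2]  # hypothetical total scores on a 6-item test
kr21_value = kr21(totals, n_items=6)
print(round(kr21_value, 3))
```

When item difficulties actually differ, KR-21 tends to come out lower than KR-20 on the same data.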
Cronbach's Coefficient Alpha
used when the two halves of the test have unequal variances and on tests containing non-dichotomous items.
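A sketch of coefficient alpha for non-dichotomous items: (k / (k − 1)) · (1 − Σ item variances / total-score variance). The 1-to-5 ratings are invented for illustration.

```python
from statistics import pvariance


def cronbach_alpha(item_matrix):
    """Coefficient alpha; rows are testtakers, columns are items."""
    n_items = len(item_matrix[0])
    item_variances = [
        pvariance([row[j] for row in item_matrix]) for j in range(n_items)
    ]
    total_variance = pvariance([sum(row) for row in item_matrix])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


ratings = [  # hypothetical responses to three 5-point items
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```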
Average Proportional Distance
measure used to evaluate internal consistencies of a test that focuses on the degree of differences that exist between item scores.
Inter-Scorer Reliability
The degree of agreement or consistency between two or more scorers with regard to a particular measure. Evaluated by calculating the percentage of times that two individuals assign the same scores to the performance of the examinees.
A variation is to have two different examiners test the same client using the same test and then to determine how close their scores or ratings of the person are. Often used when coding nonverbal behavior; the source of error is observer differences.
Fleiss Kappa
determines the level of agreement between TWO or MORE raters when the method of assessment is measured on a CATEGORICAL SCALE.
Cohen's Kappa
two raters only.
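A sketch of Cohen's kappa for two raters, kappa = (p_o − p_e) / (1 − p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from each rater's category proportions. The ratings are made up.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels."""
    n = len(rater_a)
    if n != len(rater_b) or n == 0:
        raise ValueError("both raters must rate the same non-empty set of cases")
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    return (p_observed - p_expected) / (1 - p_expected)


rater_a = ["y", "y", "n", "n", "y", "n", "y", "n", "y", "y"]  # hypothetical
rater_b = ["y", "n", "n", "n", "y", "n", "y", "y", "y", "y"]  # hypothetical
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

Here the raters agree on 8 of 10 cases, but kappa is lower than 0.80 because some of that agreement would be expected by chance.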
Krippendorff's Alpha
two or more raters, based on observed disagreement corrected for disagreement expected by chance.
Dynamic Attribute
trait, state, or ability presumed to be ever-changing as a function of situational and cognitive experience.
Static Attribute
barely changing or relatively unchanging.
Restriction of Range / Restriction of Variance
if the variance of either variable in a correlational analysis is restricted by the sampling procedure used, then the resulting correlation coefficient tends to be lower.
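A small illustration of restriction of range using invented paired scores: the correlation is computed once on the full sample and once on a subsample limited to high scorers on the first variable. `pearson_r` is a helper written for this example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # hypothetical predictor scores
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]   # hypothetical criterion scores

r_full = pearson_r(x, y)

# Keep only cases with x >= 7, as if only high scorers had been sampled.
kept = [(a, b) for a, b in zip(x, y) if a >= 7]
r_restricted = pearson_r([a for a, _ in kept], [b for _, b in kept])
print(round(r_full, 3), round(r_restricted, 3))
```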