Inter-Item Consistency
The degree of correlation among all the items on a scale
An index of inter-item consistency is useful in assessing the homogeneity of the test
Test Homogeneity
Tests are said to be homogeneous if they contain items that measure a single trait
In contrast to test homogeneity, heterogeneity describes the degree to which a test measures different factors, that is, more than one trait
Because a homogeneous test samples a relatively narrow content area, it can be expected to show more inter-item consistency than a heterogeneous test
Test homogeneity is desirable because it allows relatively straightforward test-score interpretation
Testtakers with the same score on a homogeneous test probably have similar abilities in the area tested
Testtakers with the same score on a more heterogeneous test may have quite different abilities
Kuder-Richardson Formula KR-20
Where test items are highly homogeneous, KR-20 and split-half reliability estimates will be similar
KR-20 is the statistic of choice for determining the internal consistency of dichotomous items, primarily those items that can be scored right or wrong (such as multiple-choice items)
KR-20 has many modifications; the most popular is coefficient alpha
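As a rough illustration of how KR-20 is computed for right/wrong items, here is a minimal sketch. It assumes the usual textbook form, KR-20 = (k / (k - 1)) * (1 - sum(p*q) / variance of total scores), with population variance used throughout; the function name `kr20` and the sample data are made up for this example.

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 for dichotomous items.

    responses: one row per testtaker, each row a list of 0/1 item scores.
    Assumes population variance for the total scores (illustrative choice).
    """
    n_people = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    total_var = pvariance(totals)
    # p = proportion passing each item, q = 1 - p
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)

# Hypothetical data: 5 testtakers, 4 right/wrong items
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(scores), 3))
```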
Coefficient Alpha
Coefficient alpha is appropriate for use on tests containing nondichotomous items
It is the preferred statistic for obtaining an estimate of internal consistency reliability
Coefficient alpha typically ranges in value from 0 to 1 and helps answer the question of how similar sets of data are
Similarity is gauged, in essence, on a scale from 0 (absolutely no similarity) to 1 (perfectly identical)
Values of alpha above .90 may be “too high” and indicate redundancy in the items
Coefficient alpha provides a measure that is loosely equivalent to the average of all possible split-half reliability coefficients
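A minimal sketch of coefficient alpha for items scored on a rating scale, assuming the standard form alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores). The function name `coefficient_alpha` and the ratings are hypothetical. With 0/1 items this computation gives the same value as KR-20.

```python
from statistics import pvariance

def coefficient_alpha(responses):
    """Coefficient alpha.

    responses: one row per testtaker, each row a list of numeric item scores.
    The same variance type is used for items and totals, so the ratio is
    unaffected by the population-vs-sample variance choice.
    """
    n_items = len(responses[0])
    item_vars = [pvariance([row[j] for row in responses])
                 for j in range(n_items)]
    total_var = pvariance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 5 testtakers, 3 items rated 1-5
ratings = [
    [4, 5, 4],
    [3, 4, 3],
    [2, 2, 3],
    [5, 5, 4],
    [1, 2, 2],
]
print(round(coefficient_alpha(ratings), 3))
```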
Internal Consistency and Testtakers Characteristics
All indexes of reliability, coefficient alpha among them, provide an index that is a characteristic of a particular group of test scores, not the test itself
If a new group of testtakers is sufficiently different from the group of testtakers on whom the reliability studies were done, the reliability coefficient may not be the same as the previously reported one
Inter-Scorer Reliability
the degree of agreement or consistency between two or more scorers
If the reliability coefficient is very high, the prospective test user knows that test scores can be derived in a systematic, consistent way by different scorers
Coefficient of inter-scorer reliability = the degree of consistency among scorers in the scoring of a test
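One common way to get a coefficient of inter-scorer reliability is to correlate the scores that two scorers assign to the same set of responses. The sketch below assumes a plain Pearson correlation; the helper `pearson_r` and both score lists are invented for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores given by two scorers to the same six essays
scorer_a = [7, 5, 9, 4, 8, 6]
scorer_b = [8, 5, 9, 3, 7, 6]

inter_scorer_r = pearson_r(scorer_a, scorer_b)
print(round(inter_scorer_r, 2))
```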
Nature of the Test – Homogeneity VS Heterogeneity
A test is said to be homogeneous in items if it is functionally uniform throughout
Tests designed to measure one factor, such as one ability or one trait, are expected to be homogeneous in items. For such tests it is reasonable to expect a high degree of internal consistency
By contrast, if the test is heterogeneous in items, an estimate of internal consistency might be low relative to a more appropriate estimate of test-retest reliability
Nature of the Test – Dynamic VS Static
A dynamic characteristic is a trait, state, or ability presumed to be ever changing as a function of situational and cognitive experiences (e.g. the dynamic characteristic of anxiety)
In the case of dynamic characteristics, the best estimate of reliability could be obtained from an internal consistency measure, because the characteristic itself may change between two administrations
A static characteristic is a trait, state, or ability presumed to be relatively unchanging (e.g. intelligence). In this instance, either the test-retest or alternate forms method would be appropriate
Nature of the Test – Restriction VS Inflation of Range
If the variance of either variable in a correlational analysis is restricted by the sampling procedure used, then the resulting correlation coefficient tends to be lower
If the variance of either variable in a correlational analysis is inflated by the sampling procedure, then the resulting correlation coefficient tends to be higher
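The effect of restricting range can be seen with a small made-up data set. In this sketch (all values hypothetical), `y` tracks `x` plus a fixed back-and-forth error of 10 points. Across the full range of `x` the correlation is high, but when only the middle slice of `x` is sampled, the same error is large relative to the remaining spread and the correlation drops.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Full sample: x runs from 0 to 99; y is x with an alternating +/-10 error
x_full = list(range(100))
y_full = [x + (10 if x % 2 == 0 else -10) for x in x_full]

# Restricted sample: only cases with x between 40 and 59 are kept
restricted = [(x, y) for x, y in zip(x_full, y_full) if 40 <= x <= 59]
x_restricted = [x for x, _ in restricted]
y_restricted = [y for _, y in restricted]

r_full = pearson_r(x_full, y_full)
r_restricted = pearson_r(x_restricted, y_restricted)
print(round(r_full, 2), round(r_restricted, 2))
```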
Nature of the Test – Speed VS Power Tests
When a time limit is long enough to allow testtakers to attempt all items, and if some items are so difficult that no testtaker is able to obtain a perfect score, then the test is a power test
By contrast, a speed test generally contains items of uniform level of difficulty so that when given generous time limits, all testtakers should be able to complete all the test items correctly
The time limit on a speed test is established so that few if any of the testtakers will be able to complete the entire test
Score differences on a speed test are therefore based on performance speed because items attempted tend to be correct
Reliability of Speed Tests
A reliability estimate of a speed test should be based on performance from two independent testing periods using one of the following: (1) test-retest reliability, (2) alternate-forms reliability, or (3) split-half reliability from two separately timed half tests
Because the reliability of a speed test should reflect the consistency of response speed, it should not be computed from a single administration of the test with a single time limit
If a speed test is administered once and some measure of internal consistency is computed, like the Kuder-Richardson or a split-half correlation, the result will be a spuriously high reliability coefficient
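A small sketch of why a single-administration estimate is spuriously high on a speed test. It assumes an idealized pure speed test: every item a testtaker reaches is answered correctly and every item not reached is scored 0. The number of items reached per person is invented data. Under that assumption the odd and even halves are almost copies of each other, so the odd-even split-half correlation comes out near 1 regardless of how consistent response speed really is.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

N_ITEMS = 30
# Hypothetical number of items each testtaker reached before time ran out
items_reached = [10, 14, 18, 22, 25, 30, 12, 27]

def item_scores(k):
    """1 for each of the first k items (reached and correct), 0 afterwards."""
    return [1 if pos < k else 0 for pos in range(N_ITEMS)]

# Odd-even split from a single timed administration
odd_half = [sum(item_scores(k)[0::2]) for k in items_reached]
even_half = [sum(item_scores(k)[1::2]) for k in items_reached]

split_half_r = pearson_r(odd_half, even_half)
print(round(split_half_r, 3))
```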
Nature of the Test – Criterion Referenced Tests
Designed to provide an indication of where a testtaker stands with respect to some criterion, such as an educational or a vocational objective
Scores on criterion-referenced tests tend to be interpreted in pass/fail or “master/failed-to-master” terms
How different the scores are from one another is seldom a focus of interest. The critical issue for the user of a mastery test is whether or not a certain criterion score has been achieved. Therefore, traditional procedures for estimating reliability are usually not appropriate