Reliability II

Last updated 8:41 AM on 6/18/26

12 Terms

1

Inter-Item Consistency

Refers to the degree of correlation among all the items on a scale

An index of inter-item consistency is useful in assessing the homogeneity of the test

2

Test Homogeneity

  • Tests are said to be homogeneous if they contain items that measure a single trait

  • In contrast to test homogeneity, heterogeneity describes the degree to which a test measures different factors, that is, more than one trait

  • Because a homogeneous test samples a relatively narrow content area, it can be expected to show more inter-item consistency than a heterogeneous test

  • Test homogeneity is desirable because it allows relatively straightforward test-score interpretation

  • Testtakers with the same score on a homogeneous test probably have similar abilities in the area tested

  • Testtakers with the same score on a more heterogeneous test may have quite different abilities

3

Kuder-Richardson Formula KR-20

Where test items are highly homogeneous, KR-20 and split-half reliability estimates will be similar

KR-20 is the statistic of choice for determining the internal consistency of dichotomous items, primarily those items that can be scored right or wrong (such as multiple-choice items)

Many modifications of KR-20 have been proposed. The most popular is coefficient alpha
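As a rough illustration of how KR-20 is computed for right/wrong items, here is a minimal sketch. It assumes the usual form of the formula, KR-20 = (k / (k - 1)) * (1 - sum(p*q) / variance of total scores), with population variances. The function name `kr20` and the small response matrix are made up for this example.

```python
def kr20(responses):
    """KR-20 for rows of 0/1 item scores (one row per testtaker)."""
    n = len(responses)       # number of testtakers
    k = len(responses[0])    # number of items

    # Variance of the total scores (population form, divide by n).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n

    # Sum of p*q over items: p = proportion passing, q = 1 - p.
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        pq_sum += p * (1 - p)

    return (k / (k - 1)) * (1 - pq_sum / var_total)


# Hypothetical data: 5 testtakers, 4 dichotomous items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 2))  # 0.8
```

The function will divide by zero if every testtaker has the same total score, so real data should be checked for that case first.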

4

Coefficient Alpha

  • Coefficient alpha is appropriate for use on tests containing nondichotomous items

  • It is the preferred statistic for obtaining an estimate of internal consistency reliability

  • Coefficient alpha typically ranges in value from 0 to 1, helping to answer the question of how similar sets of data are

  • Similarity is gauged, in essence, on a scale from 0 (absolutely no similarity) to 1 (perfectly identical)

  • Values of alpha above .90 may be “too high” and indicate redundancy in the items

  • the coefficient alpha provides a measure that is loosely equivalent to the average of all possible split-half reliability coefficients
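The points above can be sketched in code. This is a minimal example, assuming the standard form alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores) with population variances. The names `variance` and `coefficient_alpha` and the rating data are hypothetical.

```python
def variance(values):
    """Population variance (divide by n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def coefficient_alpha(scores):
    """Coefficient alpha for rows of item scores (one row per testtaker)."""
    k = len(scores[0])  # number of items
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 testtakers rating 3 items on a 1-5 scale.
ratings = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]
alpha = coefficient_alpha(ratings)
print(round(alpha, 2))
```

For these made-up ratings alpha comes out at roughly .98, which by the rule of thumb above would suggest redundant items. When every item is scored 0 or 1, each item variance equals p*q, so this calculation gives the same value as KR-20.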

5

Internal Consistency and Testtaker Characteristics

All indexes of reliability, coefficient alpha among them, provide an index that is a characteristic of a particular group of test scores, not the test itself

If a new group of testtakers is sufficiently different from the group of testtakers on whom the reliability studies were done, the reliability coefficient may not be the same as the previously reported one

6

Inter-Scorer Reliability

  • the degree of agreement or consistency between two or more scorers

  • If the reliability coefficient is very high, the prospective test user knows that test scores can be derived in a systematic, consistent way

  • Coefficient of inter-scorer reliability = degree of consistency among scorers in the scoring
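One simple way to put a number on agreement between two scorers is to correlate the scores they assign to the same set of responses. The sketch below assumes a Pearson correlation is an acceptable index for this purpose. The helper `pearson_r` and both score lists are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores given by two scorers to the same six protocols.
scorer_a = [12, 15, 9, 20, 17, 11]
scorer_b = [13, 14, 10, 19, 18, 10]

inter_scorer_r = pearson_r(scorer_a, scorer_b)
print(round(inter_scorer_r, 2))
```

A value close to 1 here means the two scorers rank and space the protocols in nearly the same way. It does not by itself show that they assign identical scores.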

7

Nature of the Test – Homogeneity VS Heterogeneity

  • A test is said to be homogeneous in items if it is functionally uniform throughout

  • Tests designed to measure one factor, such as one ability or one trait, are expected to be homogeneous in items. For such tests it is reasonable to expect a high degree of internal consistency

  • By contrast, if the test is heterogeneous in items, an estimate of internal consistency might be low relative to a more appropriate estimate of test-retest reliability

8

Nature of the Test – Dynamic VS Static

  • A dynamic characteristic is a trait, state, or ability presumed to be ever changing as a function of situational and cognitive experiences (e.g. the dynamic characteristic of anxiety)

  • In the case of dynamic characteristics the best estimate of reliability could be obtained from an internal consistency measure

  • A static characteristic is a trait, state, or ability presumed to be relatively unchanging (e.g. intelligence). In this instance, either the test-retest or alternate forms method would be appropriate

9

Nature of the Test – Restriction VS Inflation of Range

If the variance of either variable in a correlational analysis is restricted by the sampling procedure used, then the resulting correlation coefficient tends to be lower

If the variance of either variable in a correlational analysis is inflated by the sampling procedure, then the resulting correlation coefficient tends to be higher
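Both effects can be seen in a small simulation. The sketch below is only an illustration under assumed conditions: two simulated variables with a population correlation of about .60, a restricted sample that keeps only high scorers on the first variable, and an inflated sample that keeps only the extremes. The cutoffs, seed, and names are arbitrary choices.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


random.seed(0)  # fixed seed so the run is repeatable

# Simulate 5000 pairs with a population correlation of about .60.
pairs = []
for _ in range(5000):
    x = random.gauss(0, 1)
    y = 0.6 * x + 0.8 * random.gauss(0, 1)
    pairs.append((x, y))

# Restricted range: only cases scoring above 1 on x.
restricted = [(x, y) for x, y in pairs if x > 1]
# Inflated range: only cases from the two extremes of x.
inflated = [(x, y) for x, y in pairs if abs(x) > 1]

r_full = pearson_r(*zip(*pairs))
r_restricted = pearson_r(*zip(*restricted))
r_inflated = pearson_r(*zip(*inflated))

print(round(r_full, 2), round(r_restricted, 2), round(r_inflated, 2))
```

In this setup the restricted sample should show a noticeably lower correlation than the full sample, and the extreme-groups sample a higher one, even though the underlying relationship between the variables is the same in all three.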

10

Nature of the Test – Speed VS Power Tests

When a time limit is long enough to allow testtakers to attempt all items, and if some items are so difficult that no testtaker is able to obtain a perfect score, then the test is a power test

By contrast, a speed test generally contains items of uniform level of difficulty so that when given generous time limits, all testtakers should be able to complete all the test items correctly

The time limit on a speed test is established so that few if any of the testtakers will be able to complete the entire test

Score differences on a speed test are therefore based on performance speed because items attempted tend to be correct

11

Reliability of Speed Tests

A reliability estimate of a speed test should be based on performance from two independent testing periods using one of the following:

  • test-retest reliability

  • alternate-forms reliability

  • split-half reliability from two separately timed half tests

Because the reliability of a speed test should reflect the consistency of response speed, it should not be computed from a single administration of the test with a single time limit

If a speed test is administered once and some measure of internal consistency is computed, like the Kuder-Richardson or a split-half correlation, the result will be a spuriously high reliability coefficient
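A small made-up example shows why the single-administration coefficient comes out spuriously high. It assumes a 30-item speed test on which every attempted item is answered correctly and testtakers differ only in how many items they reach. The counts in `items_reached` are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


N_ITEMS = 30
# Hypothetical number of items each testtaker reached before time ran out.
items_reached = [10, 14, 18, 22, 25, 30]

# Each attempted item is correct (1); unreached items are scored 0.
responses = [[1] * n + [0] * (N_ITEMS - n) for n in items_reached]

# Odd-even split from the single administration.
odd_scores = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
even_scores = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...

odd_even_r = pearson_r(odd_scores, even_scores)
print(round(odd_even_r, 3))
```

The two half scores can differ by at most one point for any testtaker, so the correlation is close to 1 no matter how consistent response speed really is. That is why the halves need to be separately timed.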

12

Nature of the Test – Criterion Referenced Tests

Designed to provide an indication of where a testtaker stands with respect to some criterion such as an educational or a vocational objective

Scores on criterion-referenced tests tend to be interpreted in pass/fail or “master/failed-to-master” terms

How different the scores are from one another is seldom a focus of interest. The critical issue for the user of a mastery test is whether or not a certain criterion score has been achieved

Therefore, traditional procedures for estimating reliability are usually not appropriate