Reliability and Validity

40 Terms

1
New cards

Reliability

  • dependability

  • consistency of measurement

  • Reliability coefficient

  • High ______ is a prerequisite of validity

  • ______ increases with test length

2
New cards

Reliability coefficient

  • an index of reliability, a proportion that indicates the ratio between the true score variance on a test and the total variance
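
A minimal numeric sketch of this ratio, under the usual classical-test-theory assumption that observed variance is true-score variance plus error variance (the numbers below are made up for illustration):

    # Reliability coefficient = true-score variance / total observed variance.
    true_score_variance = 80.0   # hypothetical variance due to real differences among test takers
    error_variance = 20.0        # hypothetical variance due to measurement error
    total_variance = true_score_variance + error_variance

    reliability_coefficient = true_score_variance / total_variance
    print(reliability_coefficient)  # 0.8 -> 80% of score variance reflects true scores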

3
New cards

Reliability estimates

  • test-retest reliability

  • Parallel forms reliability

  • Internal consistency reliability

  • Inter-rater reliability

4
New cards

Test-retest reliability

  • an estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test

  • Coefficient of stability

  • The longer the time that passes between administrations, the greater the likelihood that the reliability coefficient will be lower

  • If the interval between test and retest is too short, there is a tendency for a carryover effect/practice effect

  • Problem: It is not applicable for states (less enduring characteristics of a person)

  • Applicable to trait tests (traits are long-enduring characteristics)

  • Pearson r or Spearman rho
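
A hypothetical sketch of the computation: correlate the same examinees' scores from two administrations. The scores, and the use of SciPy here, are illustrative assumptions rather than part of the card.

    import numpy as np
    from scipy.stats import pearsonr, spearmanr

    time1 = np.array([12, 15, 9, 20, 17, 11, 14, 18])   # made-up scores, administration 1
    time2 = np.array([13, 14, 10, 19, 18, 10, 15, 17])  # same people, administration 2

    r, _ = pearsonr(time1, time2)      # coefficient of stability (Pearson r)
    rho, _ = spearmanr(time1, time2)   # Spearman rho, if the data are ordinal
    print(round(r, 3), round(rho, 3))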

5
New cards

Coefficient of stability

how stable is the construct or measure

6
New cards

Parallel Forms and Alternate Forms reliability

  • Item sampling

  • The consistency of test results between two different but equivalent forms of a test

  • coefficient of equivalence

  • the advantage of having another form is it eliminates carryover effect

  • Pearson r or Spearman rho

7
New cards

Parallel forms

  • ________ for each form of the test, the means and the variances of observed test scores are equal

  • Same items, different order/numbering

8
New cards

Alternate forms

  • are simply different versions of a test that have been constructed so as to be parallel

9
New cards

Internal consistency reliability

  • defines measurement error strictly in terms of consistency or inconsistency in the content of the test

  • Split half reliability estimate

  • Spearman Brown Formula

  • Cronbach Coefficient Alpha

  • Kuder-Richardson Formula

10
New cards

Split-half reliability estimate

  • obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once

  • odd-even reliability

11
New cards

Three steps

  1. Divide the test into equivalent halves

  2. Calculate a Pearson r between scores on the two halves of the test

  3. Adjust the half-test reliability using the Spearman-Brown formula

12
New cards

Spearman Brown Formula

  • allows a test developer or user to estimate internal consistency reliability from a correlation of two halves of a test

  • Estimates the effect of shortening or lengthening a test on its reliability

  • used to determine the number of items needed to attain a desired level of reliability (a worked split-half sketch follows below)
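
A sketch of the split-half procedure from cards 10 to 12, with made-up item scores; the step comments mirror the three steps listed above, and the Spearman-Brown correction for two equal halves is r_full = 2r / (1 + r).

    import numpy as np
    from scipy.stats import pearsonr

    scores = np.array([            # rows = examinees, columns = items (made-up data)
        [1, 1, 0, 1, 1, 0, 1, 1],
        [0, 1, 0, 0, 1, 0, 0, 1],
        [1, 1, 1, 1, 1, 1, 1, 1],
        [0, 0, 0, 1, 0, 0, 1, 0],
        [1, 0, 1, 1, 1, 0, 1, 1],
    ])

    odd_half = scores[:, 0::2].sum(axis=1)     # step 1: total on odd-numbered items
    even_half = scores[:, 1::2].sum(axis=1)    #         total on even-numbered items

    r_half, _ = pearsonr(odd_half, even_half)  # step 2: correlate the two halves

    r_full = (2 * r_half) / (1 + r_half)       # step 3: Spearman-Brown correction
    print(round(r_half, 3), round(r_full, 3))  # half-test r vs. full-length estimate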

13
New cards

Cronbach’s Coefficient alpha

  • used with ratio or interval data

  • nondichotomous items

  • Mean of all possible split half correlations

  • Preferred statistic for obtaining an estimate of internal consistency reliability

  • Typically ranges in value from 0 to 1
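
A minimal sketch of the usual computing formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the Likert-type ratings below are made up.

    import numpy as np

    ratings = np.array([    # rows = respondents, columns = nondichotomous (multipoint) items
        [4, 5, 4, 4],
        [2, 3, 3, 2],
        [5, 5, 4, 5],
        [3, 3, 2, 3],
        [4, 4, 5, 4],
    ])

    k = ratings.shape[1]
    item_variances = ratings.var(axis=0, ddof=1)       # variance of each item
    total_variance = ratings.sum(axis=1).var(ddof=1)   # variance of summed scale scores

    alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
    print(round(alpha, 3))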

14
New cards

Kuder-Richardson Formula

  • used for tests with dichotomous items, primarily those items that can be scored right or wrong (such as multiple-choice items)

15
New cards

KR 20

  • useful in terms of evaluating the internal consistency of highly homogeneous items

  • used for inter-item consistency of dichotomous items
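
A sketch of KR-20 for right/wrong items, where p is the proportion passing each item and q = 1 - p, so KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores); the answer matrix is made up.

    import numpy as np

    answers = np.array([    # rows = examinees, columns = items scored 1 (right) or 0 (wrong)
        [1, 1, 0, 1, 1],
        [1, 0, 0, 1, 0],
        [1, 1, 1, 1, 1],
        [0, 0, 0, 1, 0],
        [1, 1, 0, 1, 1],
    ])

    k = answers.shape[1]
    p = answers.mean(axis=0)                          # proportion correct per item
    q = 1 - p
    total_variance = answers.sum(axis=1).var(ddof=1)  # variance of total scores

    kr20 = (k / (k - 1)) * (1 - (p * q).sum() / total_variance)
    print(round(kr20, 3))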

16
New cards

Inter-item consistency

refers to the degree of correlation among all the items on a scale

Test homogeneity - measures a single trait

Test heterogeneity - measures different factors

17
New cards

multipoint item

Pearson r between equivalent test halves with Spearman-Brown correction or Kuder-Richardson for dichotomous items, or coefficient alpha for ________

18
New cards

Inter-scorer reliability

  • the degree of agreement or consistency between two or more scorers (or judges or raters) with regard to a particular measure

  • scorer reliability, judge reliability, observer reliability, and inter-rater reliability

  • coefficient of inter-scorer reliability

  • Pearson r or Spearman rho

19
New cards

higher

the ______ the reliability of a selection test, the better

20
New cards

0.7

the minimum satisfactory figure for test reliability is ____

21
New cards

.80

a reliability coefficient of _____ indicates that 20% of the variability in test scores is due to measurement error (since reliability is the proportion of total variance due to true scores, 1 - .80 = .20 is the error proportion)

22
New cards

Validity

  • the agreement between a test score or measure and the quality it is believed to measure

  • As applied to a test, is a judgment or estimate of how well a test measures what it purports to measure in a particular context.

    • More specifically, it is a judgment based on evidence about the appropriateness of inferences drawn from test scores.

  • Validation: the process of gathering and evaluating evidence about validity

    • Validation studies (i.e., local validation studies)

23
New cards

Logic of validity analysis

a valid test is one that

  • predicts future performance on appropriate variables

  • measures an appropriate domain

  • measures appropriate characteristics of test takers

Validity is determined by the relationship between test scores and some other variable referred to as the validation measure

24
New cards

Validity: Trinitarian model

  • content validity

  • criterion-related validity

  • construct validity

25
New cards

3 approaches to assessing validity

  • Scrutinizing the test’s content

  • Relating scores obtained on the test to other test scores or other measures

  • Executing a comprehensive analysis of

    • How scores on the test relate to other test scores and measures

    • How scores on the test can be understood within some theoretical framework for understanding the construct that the test was designed to measure

26
New cards

Face validity

  • not a true measure of validity

  • no evidence

  • relates more to what a test appears to measure to the person being tested than to what the test actually measures

27
New cards

Content validity

describes a judgment of how adequately a test samples behavior representative of the universe of behavior that the test was designed to sample

Two concepts

  • Construct under-representation

  • Construct-irrelevant variance

28
New cards

Construct under-representation

  • Failure to capture important components of the construct

29
New cards

Construct-irrelevant variance

  • scores are influenced by factors irrelevant to the construct

30
New cards

Criterion validity

a test score can be used to infer an individual's most probable standing on some criterion measure

How well a test corresponds to a particular criterion

  • Predictive

    • Predictor and criterion

  • Concurrent

31
New cards

Predictive Validity

  • Measures of the relationship between test scores and a criterion measure obtained at a future time

  • Researchers must take into consideration the base rate of the occurrence of the variable, both as that variable exists in the general population and as it exists in the sample being studied

32
New cards

Concurrent validity

  • If the test scores are obtained at about the same time as the criterion measures are obtained

  • Extent to which test scores may be used to estimate an individual’s present standing on a criterion

  • Economically efficient

33
New cards

Validity coefficient

  • Relationship between a test and a criterion

  • Tells the extent to which the test is valid for making statements about the criterion
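
A hypothetical sketch: the validity coefficient is the correlation between predictor test scores and a criterion measure (collected later for predictive validity, at about the same time for concurrent validity); the scores below are invented.

    import numpy as np
    from scipy.stats import pearsonr

    test_scores = np.array([55, 62, 47, 70, 66, 58, 73, 50])            # made-up selection test
    job_ratings = np.array([3.1, 3.8, 2.9, 4.5, 4.0, 3.3, 4.6, 2.7])    # criterion measure

    validity_coefficient, p_value = pearsonr(test_scores, job_ratings)
    print(round(validity_coefficient, 3))  # how far the test supports statements about the criterion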

34
New cards

Construct validity

  • a construct is something built by mental synthesis

  • Involves assembling evidence about what a test means

    • Show relationship between test and other measures

A judgment about the appropriateness of inferences drawn from test scores regarding an individual's standing on a variable (the construct)

  • Convergent Evidence

  • Discriminant Evidence

35
New cards

Convergent evidence

  • Correlation between two sets of scores believed to measure the same construct

36
New cards

Discriminant evidence

  • divergent validation

  • the test measures something unique

  • low correlations with unrelated constructs

37
New cards

Evaluating validity coefficient

  • Look for changes in the cause of the relationship

  • What does the criterion mean?

    • The criterion should be valid and reliable

  • Review the subject population in the study

    • Is the sample size adequate?

  • Do not confuse the criterion with the predictor

  • Is there variability in the criterion and the predictor?

  • Is there evidence for validity generalization?

  • Consider differential prediction

38
New cards

Relationship between validity & reliability

  • Reliability: ability to produce consistent scores that measure stable characteristics

  • Validity: which stable characteristics the test scores measure

  • It is theoretically possible to develop a reliable test that is not valid. If a test is not reliable, its potential validity is limited.

39
New cards

Convergent result

  • significant

  • positive direction

40
New cards

Divergent result

  • not significant

  • negative direction