Reliability
dependency
consistency of measurement
Reliability coefficient
High _____ is a prerequisite of validity
_______increases with test length
Reliability coefficient
an index of reliability, a proportion that indicates the ratio between the true score variance on a test and the total variance
Reliability estimates
test-retest reliability
Parallel forms reliability
Internal consistency reliability
Inter-rater reliability
Test-retest reliability
an estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test
Coefficient of stability
The longer the time that passes, the greater the likelihood that the reliability coefficient will be lower
If the test-retest interval is too short, there is a tendency for a carryover effect/practice effect
Problem: it is not applicable to states (less enduring characteristics of a person)
Applicable to trait tests (long-lasting, enduring characteristics)
Pearson r or Spearman rho
Coefficient of stability
how stable is the construct or measure
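As an illustration (all scores below are invented), the coefficient of stability is simply a Pearson r between the two administrations of the same test:

```python
# Hypothetical test-retest example: Pearson r between two
# administrations of the same test to the same six people.
from statistics import mean

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

time1 = [12, 15, 9, 20, 17, 11]   # first administration
time2 = [13, 14, 10, 19, 18, 12]  # retest, same people
stability = pearson_r(time1, time2)  # coefficient of stability
```

With these made-up scores the coefficient comes out close to 1, consistent with a stable trait measure.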
Parallel Forms and Alternate Forms reliability
Item Sampling
The consistency of test results between two different but equivalent forms of a test
coefficient of equivalence
the advantage of having another form is it eliminates carryover effect
Pearson r or Spearman rho
Parallel forms
________for each form of the test, the mean and the variances of observed test scores are equal
Same items, different positions/numbering
Alternate forms
are simply different versions of a test that have been constructed so as to be parallel
Internal consistency reliability
defines measurement error strictly in terms of consistency or inconsistency in the content of the test
Split half reliability estimate
Spearman-Brown Formula
Cronbach's Coefficient Alpha
Kuder-Richardson Formula
Split-half reliability estimate
obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once
odd-even reliability
Three steps
Divide the test into equivalent halves
Calculate a Pearson r between scores on the two halves of the test
Adjust the half-test reliability using the Spearman-Brown formula
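The three steps above can be sketched as follows (the item responses are invented for illustration):

```python
# Split-half (odd-even) reliability with the Spearman-Brown correction.
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Each row: one person's scores on a 6-item test (hypothetical data).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 1, 0, 0, 1],
    [0, 0, 0, 1, 0, 0],
]

# Step 1: divide the test into equivalent halves (odd vs even items).
odd  = [sum(r[0::2]) for r in responses]   # items 1, 3, 5
even = [sum(r[1::2]) for r in responses]   # items 2, 4, 6

# Step 2: Pearson r between the two half-test scores.
r_half = pearson_r(odd, even)

# Step 3: Spearman-Brown adjustment back to full test length (doubling).
r_full = (2 * r_half) / (1 + r_half)
```

The adjustment is needed because each half is only half as long as the actual test, and shorter tests tend to be less reliable.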
Spearman-Brown Formula
allows a test developer or user to estimate internal consistency reliability from a correlation of two halves of a test
Estimates the effect of lengthening or shortening on the test's reliability
used to determine the number of items needed to attain a desired level of reliability
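A minimal sketch of the general Spearman-Brown prophecy formula, plus its rearrangement for the length factor needed to reach a target reliability (the .60/.80 figures are just an example):

```python
# Spearman-Brown prophecy formula and its inverse.

def spearman_brown(r_orig, n):
    """Predicted reliability when the test is made n times as long."""
    return (n * r_orig) / (1 + (n - 1) * r_orig)

def length_factor(r_orig, r_target):
    """How many times longer the test must be to reach r_target."""
    return (r_target * (1 - r_orig)) / (r_orig * (1 - r_target))

# Doubling a test with r = .60 predicts r = .75:
r_doubled = spearman_brown(0.60, 2)

# To raise r = .60 to r = .80, the test must be about 2.67x as long,
# e.g. a 10-item test would need roughly 27 items.
n_needed = length_factor(0.60, 0.80)
```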
Cronbach’s Coefficient alpha
used with ratio or interval data
nondichotomous items
Mean of all possible split half correlations
Preferred statistic for obtaining an estimate of internal consistency reliability
Typically ranges in value from 0 to 1
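A small sketch of coefficient alpha for multipoint (non-dichotomous) items; the rating data are hypothetical:

```python
# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance).
from statistics import pvariance

def cronbach_alpha(rows):
    """rows: one list of item scores per person."""
    k = len(rows[0])  # number of items
    item_vars = [pvariance([person[i] for person in rows]) for i in range(k)]
    total_var = pvariance([sum(person) for person in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Four people rating three 5-point items (invented data):
ratings = [[4, 5, 4], [3, 3, 2], [5, 4, 5], [2, 2, 3]]
alpha = cronbach_alpha(ratings)
```

Population variances are used throughout for consistency; alpha rises when items covary strongly relative to their individual variances.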
Kuder-Richardson Formula
used for tests with dichotomous items, primarily those that can be scored right or wrong (such as multiple-choice items)
KR 20
useful for evaluating the internal consistency of highly homogeneous items
used for inter-item consistency of dichotomous items
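For right/wrong items, KR-20 replaces the item variances in alpha with p·q terms (p = proportion passing each item). A hypothetical sketch:

```python
# KR-20: k/(k-1) * (1 - sum(p*q) / total score variance),
# for dichotomous (0/1) items only.
from statistics import pvariance

def kr20(rows):
    """rows: one list of 0/1 item scores per person."""
    k = len(rows[0])  # number of items
    n = len(rows)     # number of people
    pq = 0.0
    for i in range(k):
        p = sum(person[i] for person in rows) / n  # proportion passing item i
        pq += p * (1 - p)
    total_var = pvariance([sum(person) for person in rows])
    return (k / (k - 1)) * (1 - pq / total_var)

# Five people on a 4-item right/wrong test (invented data):
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
r_kr20 = kr20(answers)
```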
Inter-item consistency
refers to the degree of correlation among all the items on a scale
Test homogeneity - measure single trait
Test heterogeneity - measure different factors
multipoint item
Pearson r between equivalent test halves with Spearman-Brown correction or Kuder-Richardson for dichotomous items, or coefficient alpha for________
Inter-scorer reliability
the degree of agreement or consistency between two or more scorers (or judges or raters) with regard to a particular measure
scorer reliability, judge reliability, observer reliability, and inter-rater reliability
coefficient of inter-scorer reliability
Pearson r or Spearman rho
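When two scorers rank the same set of products, Spearman rho is a natural index of inter-scorer agreement. A hypothetical sketch (scores invented, no tied ranks assumed):

```python
# Spearman rho between two raters via the rank-difference formula:
# rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)).

def spearman_rho(x, y):
    """Spearman rank-order correlation (assumes no tied scores)."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r

    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - (6 * d2) / (n * (n ** 2 - 1))

# Two raters scoring the same six essays (hypothetical data):
rater_a = [88, 72, 95, 60, 81, 77]
rater_b = [85, 74, 92, 65, 70, 80]
rho_ab = spearman_rho(rater_a, rater_b)  # coefficient of inter-scorer reliability
```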
higher
the______the reliability of a selection test the better
0.7
the minimum satisfactory figure for test reliability is ____
.80
a reliability coefficient of _____ indicates that 20% of the variability in test scores is due to measurement error
Validity
the agreement between a test score or measure and the quality it is believed to measure
As applied to a test, is a judgment or estimate of how well a test measures what it purports to measure in a particular context.
More specifically, it is a judgment based on evidence about the appropriateness of inferences drawn from test scores.
The process of gathering and evaluating evidence about validity.
Validation studies (i.e., local validation studies)
Logic of validity analysis
a valid test is one that
predicts future performance on appropriate variables
measures an appropriate domain
measures appropriate characteristics of test takers
Validity is determined by the relationship between test scores and some other variable referred to as the validation measure
Validity: Trinitarian model
content validity
criterion-related validity
construct validity
3 approaches to assessing validity
Scrutinizing the test’s content
Relating scores obtained on the test to other test scores or other measures
Executing a comprehensive analysis of
How scores on the test relate to other test scores and measures
How scores on the test can be understood within some theoretical framework for understanding the construct that the test was designed to measure
Face validity
not a true measure of validity
no evidence
relates more to what a test appears to measure to the person being tested than to what the test actually measures
Content validity
describes a judgment of how adequately a test samples behavior representative of the universe of behavior that the test was designed to sample
Two concepts
Construct under-representation
Construct-irrelevant variance
Construct under-representation
Failure to capture important components of the construct
Construct-irrelevant variance
scores are influenced by factors irrelevant to the construct
Criterion validity
a test score can be used to infer an individual's most probable standing on a criterion measure
How a test corresponds to a particular criterion
Predictive
Predictor and criterion
Concurrent
Predictive Validity
Measures of the relationship between test scores and a criterion measure obtained at a future time
Researchers must take into consideration the base rate of the occurrence of the variable, both as that variable exists in the general population and as it exists in the sample being studied
Concurrent validity
If the test scores are obtained at about the same time as the criterion measures are obtained
Extent to which test scores may be used to estimate an individual’s present standing on a criterion
Economically efficient
Validity coefficient
Relationship between a test and a criterion
Tells the extent to which the test is valid for making statements about the criterion
Construct validity
something built by mental synthesis
Involves assembling evidence about what a test means
Show relationship between test and other measures
judgment about the appropriateness of inferences drawn from test scores regarding an individual's standing on the variable of interest
Convergent Evidence
Discriminant Evidence
Convergent evidence
Correlation between two sets of scores believed to measure the same construct
Discriminant evidence
divergent validation
the test measures something unique
low correlations with unrelated constructs
Evaluating validity coefficient
Look for changes in the cause of the relationship
What does the criterion mean?
The criterion should be valid and reliable
Review the subject population in the study
Is the sample size adequate?
Do not confuse the criterion with the predictor
Is there variability in the criterion and the predictor?
Is there evidence for validity generalization?
Consider differential prediction
Relationship between validity & reliability
Reliability: ability to produce consistent scores that measure stable characteristics
Validity: which stable characteristics the test scores measures
It is theoretically possible to develop a reliable test that is not valid. If a test is not reliable, its potential validity is limited.
Convergent result
significant
positive direction
Divergent result
not significant
negative direction