RM1 Step 3b: Operationalise your variable (reliability)

33 Terms

Card 1

3 types of reliability:

  1. Across time

  2. Across raters

  3. Across items within the test

Card 2

Random error

May not get identical scores when repeating a measure on the same person due to:

  • Participant-driven error: mood, hunger, fatigue

  • Environment-driven error: temperature, noise, time of the day

Card 3

2 procedures to establish Reliability Across Time:

  1. Test-retest reliability

  2. Parallel-forms reliability

Card 4

Test-retest reliability

The extent to which scores on identical measures correlate with each other when administered at two different times

Card 5

General procedure

  1. Administer the test

  2. Get the results

  3. Wait out the interval/time gap

  4. Repeat steps 1 & 2

  5. Correlate both results

Card 6

Significance of r-value

The more reliable the measure (the same scores are observed again and again), the higher the r-value (ranging from 0.00 to 1.00)

Card 7

Limitation of TRR

The completion of the 1st test may influence one’s knowledge when completing the 2nd test

E.g. it is easier to complete an IQ test the 2nd time when we already know the questions

Card 8

To fix that limitation, we may use:

Parallel-forms reliability

Card 9

Parallel-forms reliability

The extent to which scores on similar, but not identical, measures correlate with each other when administered at two different times

Card 10

General procedure

  1. Administer the test (form A)

  2. Get the results

  3. Wait out the interval/time gap

  4. Administer the other test (form B)

  5. Get the results

  6. Correlate both results

Card 11

Limitations of PFR

  1. Expensive to create double the number of tests

  2. Difficult to ensure the two tests are equivalent

Card 12

Reliability estimates are only meaningful when:

the construct does not change over time

Card 13

Example: what is the significance of low reliability in measuring children’s IQ at ages 5 and 10?

Low reliability (r = 0.5) may not be meaningful, as the difference in IQ scores could be due to genuine changes in intelligence between ages 5 and 10

Card 14

Why are changes in intelligence not considered random error?

Because intelligence is the construct we are measuring, and there is a genuine change in the construct between age 5 and 10, that is not due to random error

Card 15

Example #2: low reliability with the BFLM scale administered to a couple in a 1-year relationship and a couple in a 10-year relationship

Low reliability (r= 0.3) could be due to the fact that love can change after 10 years

Card 16

What time interval to choose then?

Choose a time interval that makes sense, depending on CONTEXT and what we are measuring

Card 17

Example: 3 day time interval for BFLM scale and found low TRR

Since love cannot change so quickly, low TRR is more likely due to random error → consider revising the measure

Card 18

When is it appropriate to use short/long intervals?

Long: when the construct is more resistant to change, e.g. personality

Short: when the construct is more susceptible to change, e.g. moods

Card 19

Possible limitation of short intervals

Susceptible to low TRR

E.g. with a 5 min interval between eating two plates of Hokkien mee, the low consistency in taste ratings could be due to fullness

Card 20

Two ways to improve TRR & PFR:

  1. Revise your measurement to increase resistance to random error

  • remove subjective questions, or make them more specific to avoid multiple interpretations

  2. Administer the measure more times (across the day) and aggregate the scores together

  • over a series of measurements, the inconsistencies in scores caused by random error should average out to 0
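The second improvement above can be simulated. A minimal sketch assuming a made-up true score of 50 and zero-mean random error: the aggregate of many administrations lands much closer to the true score than any single one.

```python
import random

random.seed(0)  # make the simulation repeatable

true_score = 50.0  # hypothetical participant's true score

# Each administration adds zero-mean random error
# (mood, hunger, noise, time of day, ...).
def one_measurement():
    return true_score + random.gauss(0, 5)

single = one_measurement()
aggregated = sum(one_measurement() for _ in range(1000)) / 1000

# Over many measurements the random errors average out,
# so the aggregate sits close to the true score.
print(abs(single - true_score), abs(aggregated - true_score))
```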

Card 21

Reliability across raters; inter-rater reliability

The extent to which the ratings of one or more judges correlate with each other

Card 22

Why do we need inter-rater reliability (IRR)?

Observer error may arise because raters may differ in moods, attention, motivation, interests, etc.

Card 23

3 ways to improve low IRR:

  1. Train your raters and provide clearer guidelines for ratings

  2. Revise your scale (similar to TRR & PFR)

  3. Use a greater number of raters before aggregating the scores

  • the overestimates and underestimates caused by observer error should average out to 0

Card 24

Reliability across items within the test; internal consistency

The extent to which the scores of the different items on a scale correlate with each other, thus measuring the true score

Card 25

Why do we need internal consistency?

Most measures consist of >1 item to fulfil content validity, and each item is assumed to measure a part of the total construct

Card 26

2 methods to calculate internal consistency:

  1. Split-half reliability 

  2. Cronbach’s alpha 

Card 27

Split-half reliability

The extent to which scores between two halves of a scale are correlated

Card 28

Procedure example: Odd-even order

  • One half consists of items 1 & 3, the other half consists of items 2 & 4

  • If the scores are similar for both halves, then the scale has good SHR

Card 29

Limitation of using SHR

The 2 halves/versions may not really be equivalent 

Card 30

To fix that limitation, we may use:

Cronbach’s alpha

Card 31

Cronbach’s alpha

An estimation of the average correlation among all the items on the scale, equivalent to the average of all possible SHR outcomes
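Cronbach’s alpha can be computed from the item and total-score variances: α = k/(k−1) × (1 − Σ item variances / total variance). A minimal sketch with made-up responses on a 4-item scale:

```python
from statistics import pvariance

# Hypothetical responses: rows = participants, columns = k items.
responses = [
    [4, 5, 4, 4],
    [2, 1, 2, 2],
    [5, 5, 4, 5],
    [3, 2, 3, 3],
    [1, 2, 1, 1],
]
k = len(responses[0])

# Variance of each item across participants, and of the total scores.
item_vars = [pvariance([row[i] for row in responses]) for i in range(k)]
total_var = pvariance([sum(row) for row in responses])

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance)
alpha = k / (k - 1) * (1 - sum(item_vars) / total_var)
print(round(alpha, 3))
```

For these responses alpha comes out well above the 0.7 cut-off mentioned below, since every item tracks the same underlying pattern across participants.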

Card 32

What is a suitable Cronbach’s alpha score for an acceptable scale?

>0.7

Card 33

Relationship between reliability and validity

Reliability is a prerequisite for validity: a measure must first be reliable before it can be valid, but a measure can be reliable without being valid