Research Methods I - Reliability

Reliability

  • Variability will be addressed again.

  • Exercises.

Acquiring Basic Psychometric Information from a New Test

  • A newly developed test measuring the level of negative affect (high score = bad) was administered to 500 randomly selected individuals (simulated dataset).

  • Analyze the dataset to determine if the scores follow a normal distribution.

    • In SPSS: Analyze -> Explore -> Put the variable in the “dependent list” -> Click “plots” and select “Histogram”.

  • Identify and exclude outliers before proceeding.

    • Detect outliers using SPSS: Analyze -> Explore -> Click “statistics” and select “Outliers”.
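
For readers working outside SPSS, the outlier step can be sketched in Python. This is only an illustration: it assumes Tukey's 1.5 × IQR fences as the outlier rule (SPSS output may flag cases differently), and the scores are hypothetical values, not the course dataset.

```python
import statistics

def tukey_outliers(scores, k=1.5):
    """Return the values outside Tukey's fences (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = statistics.quantiles(scores, n=4)  # three quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in scores if x < lower or x > upper]

# Hypothetical scores (assumption): ten plausible values and one extreme one
scores = [48, 52, 45, 50, 55, 47, 51, 49, 53, 46, 95]

outliers = tukey_outliers(scores)
cleaned = [x for x in scores if x not in outliers]  # data used for later steps

print("outliers:", outliers)
print("n after exclusion:", len(cleaned))
```

Whether a flagged case should actually be excluded remains a judgment call; the rule only points at candidates.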

Outliers

  • An outlier was identified and excluded from the analysis.

Mean and Standard Deviation

  • Mean (M) = 49.92

  • Standard Deviation (SD) = 7.65

Interpreting Scores

  • An already validated and reliable test measuring anxiety level (high score = bad) was administered to 500 randomly selected individuals (simulated dataset).

  • Client X scored 55 on the affect test and 105 on the anxiety test.

  • Assuming both tests are valid, determine which score is more concerning if scores exceeding the 90th percentile indicate complaints.

Score Interpretation

  • Test for Affect

    • M = 49.92, SD = 7.65

    • Obtained score = 55

    • Difference from mean is approximately 5.

    • Approximately 25% of the reference group has a higher score.

  • Test for Anxiety

    • M = 100.14, SD = 3.01

    • Obtained score = 105

    • Difference from mean is approximately 5.

    • Only approximately 5% of the reference group has a higher score.

Z-score and T-score

  • Z score affect = (55 - 49.924) / 7.647 = 0.66

  • T score affect = 10z + 50 = 56.6

  • Z score anxiety = (105 - 100.141) / 3 = 1.62

  • T score anxiety = 10z + 50 = 66.2
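
The same arithmetic can be checked with a short Python sketch. It assumes the scores are approximately normally distributed, so that a z-score can be converted into a proportion of the reference group; the function names are illustrative, and the anxiety SD is rounded to 3 as in the calculation above.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Distance from the mean in standard deviation units."""
    return (x - mean) / sd

def t_score(z):
    """Rescale z to a mean of 50 and an SD of 10."""
    return 10 * z + 50

def proportion_above(z):
    """Share of a normal reference group scoring higher than z."""
    return 1 - NormalDist().cdf(z)

z_affect = z_score(55, 49.924, 7.647)
z_anxiety = z_score(105, 100.141, 3.0)
cutoff_z = NormalDist().inv_cdf(0.90)  # z value at the 90th percentile

for name, z in [("affect", z_affect), ("anxiety", z_anxiety)]:
    print(name, round(z, 2), round(t_score(z), 1),
          round(proportion_above(z), 2), "above cutoff:", z > cutoff_z)
```

Under the normality assumption, only the anxiety score lies above the 90th percentile cutoff, even though both raw scores are about 5 points above their means.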

Reliability

  • Reliability refers to the consistency or precision of a measurement.

  • What impacts reliability?

    • Measurement error

    • Measurement error is any fluctuation in scores that results from factors related to the measurement process that are irrelevant to what is being measured.

Correlations

  • Correlation describes the relationship between two variables.

  • What is correlation?

  • Figure: graphs A, B, and C depict different correlations; determine which correlation is higher.
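
As a rough companion to judging correlations by eye, the Pearson correlation coefficient can be computed directly. The sketch below uses small hypothetical datasets (not the graphs from the slides) to show a stronger, a weaker, and a perfectly negative relationship.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data (assumption), chosen only to contrast the three cases
x = [1, 2, 3, 4, 5]
stronger = [2, 4, 5, 4, 5]
weaker = [3, 5, 2, 5, 4]
negative = [10, 8, 6, 4, 2]

r_stronger = pearson_r(x, stronger)
r_weaker = pearson_r(x, weaker)
r_negative = pearson_r(x, negative)
print(round(r_stronger, 2), round(r_weaker, 2), round(r_negative, 2))
```

The size of a correlation is judged by its absolute value, so the perfectly negative case is the strongest relationship of the three.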

True Score

  • A true score is a hypothetical score entirely free of error.

  • An obtained score reflects:

    • $X_0 = X_{true} + X_{error}$

  • Variability:

    • $S^2_0 = S^2_{true} + S^2_{error}$ (variances add when error is unrelated to the true score)
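
A small simulation can make the variability decomposition concrete. This is a sketch under stated assumptions: true scores and errors are drawn independently from normal distributions with made-up parameters, and the additivity holds for variances, only approximately in any finite sample.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical parameters (assumption): true SD = 7, error SD = 3
true_scores = [random.gauss(50, 7) for _ in range(n)]
errors = [random.gauss(0, 3) for _ in range(n)]
observed = [t + e for t, e in zip(true_scores, errors)]

var_true = statistics.pvariance(true_scores)
var_error = statistics.pvariance(errors)
var_obs = statistics.pvariance(observed)

print("true + error variance:", round(var_true + var_error, 1))
print("observed variance:    ", round(var_obs, 1))
```

The two printed values differ slightly because true scores and errors are never exactly uncorrelated in a sample.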

Common Sources of Error

  • Context of testing

  • Test taker

  • Test

Sources of Measurement Error and Reliability Coefficients

| Source of Error | Type of Tests Prone to Each Error Source | Appropriate Measures Used to Estimate Error |
| --- | --- | --- |
| Interscorer differences | Tests scored with a degree of subjectivity | Scorer reliability |
| Time sampling error | Tests of relatively stable traits or behaviors | Test-retest reliability (r), a.k.a. stability coefficient |
| Content sampling error | Tests for which consistency of results, as a whole, is desired | Alternate-form reliability (r) or split-half reliability |
| Interitem inconsistency | Tests that require inter-item consistency | Split-half reliability or Kuder-Richardson 20 (K-R 20) |
| Interitem inconsistency and content heterogeneity combined | Tests that require inter-item consistency and homogeneity | Internal consistency measures |
| Time and content sampling error combined | Tests that require stability and consistency of results, as a whole | Delayed alternate-form reliability |

Split-Half Reliability

  • Why?

    • To assess inter-item consistency.

  • How?

    • Correlate scores of participants on half of the test with the other half.

  • But…

    • Then we only have the correlation for half of the test.

    • We cannot simply extrapolate this.

    • So…
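
One standard way to handle the half-length problem is the Spearman-Brown correction, which these notes do not name explicitly: $r_{full} = 2r_{hh} / (1 + r_{hh})$. The sketch below assumes an odd/even split of the items and uses hypothetical scores for five participants on four items.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def split_half(item_scores):
    """item_scores holds one list of item scores per participant."""
    odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, ...
    even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, ...
    r_hh = pearson_r(odd_half, even_half)
    r_full = 2 * r_hh / (1 + r_hh)  # Spearman-Brown estimate for the full test
    return r_hh, r_full

# Hypothetical data (assumption): 5 participants x 4 items
data = [
    [2, 3, 3, 4],
    [5, 5, 6, 6],
    [7, 8, 7, 9],
    [4, 4, 5, 3],
    [9, 8, 9, 10],
]

r_hh, r_full = split_half(data)
print(round(r_hh, 2), round(r_full, 2))
```

The result still depends on which split is chosen, which is one motivation for looking at all possible splits.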

Cronbach’s Alpha

  • Why?

    • To assess inter-item / internal consistency.

  • How?

    • Perform all the possible split-half analyses on the dataset, and average the $r_{hh}$'s -> Cronbach’s alpha…

    • Or use the formula and calculate by hand: https://www.youtube.com/watch?v=JkOiLUZkutc&t=273s

    • Or simply use SPSS/R
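
For reference, a by-hand calculation can be sketched in Python using the usual variance-based formula, $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum s^2_i}{s^2_{total}}\right)$, where $k$ is the number of items. The data are hypothetical, and the function name is illustrative rather than taken from any package.

```python
import statistics

def cronbach_alpha(item_scores):
    """item_scores holds one list of item scores per participant."""
    k = len(item_scores[0])                      # number of items
    columns = list(zip(*item_scores))            # scores grouped per item
    item_vars = [statistics.variance(col) for col in columns]
    total_var = statistics.variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical data (assumption): 5 participants x 4 items
data = [
    [2, 3, 3, 4],
    [5, 5, 6, 6],
    [7, 8, 7, 9],
    [4, 4, 5, 3],
    [9, 8, 9, 10],
]

alpha = cronbach_alpha(data)
print(round(alpha, 3))
```

With only five participants the estimate is purely illustrative; a real analysis would use the full sample.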

Item Reliability

  • Good item?

    • We want to throw away items that do not correlate well (go together) with the total score on a test.
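
A possible way to screen items is sketched below. As an assumption, it uses the corrected item-total correlation (each item is correlated with the sum of the remaining items, so the item does not inflate its own correlation). The data are hypothetical, with the fourth item deliberately made to behave differently from the rest.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def item_rest_correlations(item_scores):
    """Correlate each item with the total of all other items."""
    k = len(item_scores[0])
    correlations = []
    for i in range(k):
        item = [row[i] for row in item_scores]
        rest = [sum(row) - row[i] for row in item_scores]  # total without item i
        correlations.append(pearson_r(item, rest))
    return correlations

# Hypothetical data (assumption): item 4 does not go together with the others
data = [
    [2, 3, 3, 9],
    [5, 5, 6, 2],
    [7, 8, 7, 8],
    [4, 4, 5, 1],
    [9, 8, 9, 4],
]

item_rs = item_rest_correlations(data)
weakest = item_rs.index(min(item_rs)) + 1  # 1-based item number
print([round(r, 2) for r in item_rs], "weakest item:", weakest)
```

An item with a low or negative value is a candidate for removal, although the final decision should also consider its content.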

Exercise

  • 200 students filled out a questionnaire assessing their level of Nescafe addiction.

  • The questionnaire is still under development, and this was a first pilot.

  • Items can be scored from 0 to 10.

  • Please investigate the dataset, and perform a psychometric analysis.

  • Specifically, perform the following analyses:

    • 1) Are there any outliers? Can you exclude them from further analysis? If so, please do exclude them.

    • 2) Report the mean score of the scale (what did participants score on average?) and the standard deviation.

    • 3) Do you think it is a reliable test? Why?

    • 4) Would you keep all the items, or would you exclude items from the test?

    • 5) Assume that people scoring above the 90th percentile usually present with Nescafe addiction. Should we worry about participant “40”?