Research Methods I - Reliability
Variability will be addressed again.
Exercises.
Acquiring Basic Psychometric Information from a New Test
A newly developed test measuring the level of negative affect (high score = bad) has been administered to 500 randomly selected individuals; the resulting dataset is simulated.
Analyze the dataset to determine if the scores follow a normal distribution.
In SPSS: Analyze -> Explore -> Put the variable in the “dependent list” -> Click “plots” and select “Histogram”.
Identify and exclude outliers before proceeding.
Detect outliers using SPSS: Analyze -> Explore -> Click “statistics”.
Outliers
An outlier was identified and excluded from the analysis.
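Outside SPSS, the outlier screening step can be sketched in a few lines of Python. This is only an illustration: the source does not say which criterion was applied, so the 1.5 × IQR boxplot rule used here is an assumption, and the scores are hypothetical rather than the course dataset.

```python
from statistics import quantiles

def iqr_outliers(scores, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    # quantiles(..., n=4) returns the three quartile cut points
    q1, _, q3 = quantiles(scores, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in scores if x < lower or x > upper]

# Hypothetical scores (assumption, not the course data)
scores = [48, 52, 50, 47, 55, 49, 51, 53, 46, 95]
outliers = iqr_outliers(scores)
cleaned = [x for x in scores if x not in outliers]
print("outliers:", outliers)
print("remaining cases:", len(cleaned))
```

Whether a flagged value should actually be excluded remains a judgment call; the rule only points to candidates.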
Mean and Standard Deviation
Mean (M) = 49.92
Standard Deviation (SD) = 7.65
Interpreting Scores
An already validated and reliable test measuring anxiety level (high score = bad) has been administered to 500 randomly selected individuals; the resulting dataset is simulated.
Client X scored 55 on the affect test and 105 on the anxiety test.
Assuming both tests are valid, determine which score is more concerning if scores exceeding the 90th percentile indicate complaints.
Score Interpretation
Test for Affect
M = 49.92, SD = 7.65
Obtained score = 55
Difference from mean is approximately 5.
Approximately 25% of the reference group has a higher score.
Test for Anxiety
M = 100.14, SD = 3.01
Obtained score = 105
Difference from mean is approximately 5.
Only approximately 5% of the reference group has a higher score.
Z-score and T-score
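Z score affect: z = (55 - 49.92) / 7.65 ≈ 0.66
T score affect: T = 50 + 10z ≈ 56.6
Z score anxiety: z = (105 - 100.14) / 3.01 ≈ 1.61
T score anxiety: T = 50 + 10z ≈ 66.1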
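The standard scores for Client X can be checked with a short Python sketch. It uses the means and standard deviations reported above, takes T = 50 + 10z as the T-score convention, and assumes both score distributions are normal when converting z to a percentage of the reference group; the function names are illustrative.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Distance of a score from the mean, in standard deviation units."""
    return (x - mean) / sd

def t_score(z):
    """T-score convention: mean 50, standard deviation 10."""
    return 50 + 10 * z

def percent_above(z):
    """Percentage of a normal reference group scoring higher than z."""
    return 100 * (1 - NormalDist().cdf(z))

# (obtained score, mean, SD) as reported in the slides
tests = {"affect": (55, 49.92, 7.65), "anxiety": (105, 100.14, 3.01)}
cutoff_z = NormalDist().inv_cdf(0.90)  # z value at the 90th percentile

for name, (x, m, sd) in tests.items():
    z = z_score(x, m, sd)
    print(f"{name}: z = {z:.2f}, T = {t_score(z):.1f}, "
          f"{percent_above(z):.1f}% score higher, "
          f"above 90th percentile: {z > cutoff_z}")
```

Under these assumptions, only the anxiety score lies above the 90th percentile, even though both scores are about 5 points above their means.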
Reliability
Reliability refers to the consistency or precision of a measurement.
What impacts reliability?
Measurement error
Measurement error is any fluctuation in scores that results from factors related to the measurement process that are irrelevant to what is being measured.
Correlations
Correlation describes the relationship between two variables.
What is correlation?
[Figure: scatterplots A, B, and C depicting different correlations; determine which correlation is higher.]
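To make "which correlation is higher" concrete, the following sketch computes Pearson's r for three small hypothetical datasets. The data are invented for illustration and do not correspond to the plots on the slide. Strength is judged by the absolute value of r; the sign only gives the direction.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data (assumption): three variables paired with the same x
x = [1, 2, 3, 4, 5]
strong = [2, 4, 5, 4, 6]      # mostly rises with x
weak = [3, 1, 4, 2, 3]        # little relation to x
negative = [5, 4, 3, 2, 1]    # falls perfectly as x rises

r_strong = pearson_r(x, strong)
r_weak = pearson_r(x, weak)
r_negative = pearson_r(x, negative)
print(f"strong: {r_strong:.2f}, weak: {r_weak:.2f}, negative: {r_negative:.2f}")
```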
True Score
A true score is a hypothetical score entirely free of error.
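An obtained score reflects: true score + error (X = T + E).
Variability: total variance of obtained scores = true-score variance + error variance.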
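The relation between obtained scores, true scores, and error can be illustrated with a small simulation. The sketch assumes the classical test theory model, in which an obtained score is the sum of a true score and a random error that is independent of it. The distributions and their parameters are arbitrary choices for illustration.

```python
import random
from statistics import variance

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 10_000

# Hypothetical true scores and random errors (assumed normal and independent)
true_scores = [rng.gauss(50, 7) for _ in range(n)]
errors = [rng.gauss(0, 3) for _ in range(n)]
obtained = [t + e for t, e in zip(true_scores, errors)]

var_true = variance(true_scores)
var_error = variance(errors)
var_obtained = variance(obtained)

# In classical test theory, reliability is the share of obtained-score
# variance that is due to true-score variance.
reliability = var_true / var_obtained

print(f"true + error variance: {var_true + var_error:.1f}")
print(f"obtained variance:     {var_obtained:.1f}")
print(f"reliability estimate:  {reliability:.2f}")
```

In real data the true scores are never observed, which is why reliability has to be estimated indirectly, for example from correlations between repeated or parallel measurements.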
Common Sources of Error
Context of testing
Test taker
Test
Sources of Measurement Error and Reliability Coefficients
| Source of Error | Type of Tests Prone to Each Error Source | Appropriate Measures Used to Estimate Error |
|---|---|---|
| Interscorer differences | Tests scored with a degree of subjectivity | Scorer reliability |
| Time sampling error | Tests of relatively stable traits or behaviors | Test-retest reliability (r), a.k.a. stability coefficient |
| Content sampling error | Tests for which consistency of results, as a whole, is desired | Alternate-form reliability (r) or split-half reliability |
| Interitem inconsistency | Tests that require inter-item consistency | Split-half reliability or Kuder-Richardson 20 (K-R 20) |
| Interitem inconsistency and content heterogeneity combined | Tests that require inter-item consistency and homogeneity | Internal consistency measures |
| Time and content sampling error combined | Tests that require stability and consistency of results, as a whole | Delayed alternate-form reliability |
Split-Half Reliability
Why?
To assess inter-item consistency.
How?
Correlate participants' scores on one half of the test with their scores on the other half.
But…
Then we only have the correlation for half of the test.
We cannot simply extrapolate this.
So…
Cronbach’s Alpha
Why?
To assess inter-item / internal consistency.
How?
Perform all the possible split-half analyses on the dataset, and average the r's -> Cronbach’s alpha…
Or use the formula and calculate by hand: https://www.youtube.com/watch?v=JkOiLUZkutc&t=273s
Or simply use SPSS/R
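As a rough illustration of the hand calculation, the sketch below computes Cronbach's alpha with the usual variance formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores), rather than by averaging split-half results. The function name and the tiny dataset are hypothetical; rows are respondents and columns are items.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                       # number of items
    items = list(zip(*rows))               # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical data (assumption): 4 respondents, 3 items
data = [
    [2, 3, 3],
    [4, 4, 5],
    [6, 5, 7],
    [8, 8, 9],
]
alpha = cronbach_alpha(data)
print(f"alpha = {alpha:.3f}")
```

With so few respondents the value is only a demonstration of the arithmetic; a real analysis would use the full dataset in SPSS or R, as noted above.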
Item Reliability
Good item?
We want to throw away items that do not correlate well (go together) with the total score on a test.
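A minimal sketch of this item check correlates each item with the total score and inspects the results. The data are hypothetical, with a fourth item deliberately constructed to run against the others. Note that the total here includes the item itself; a stricter variant would correlate each item with the total of the remaining items.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def item_total_correlations(rows):
    """Correlation of every item with the total test score."""
    totals = [sum(r) for r in rows]
    return [pearson_r(item, totals) for item in zip(*rows)]

# Hypothetical data (assumption): 4 respondents, 4 items
data = [
    [2, 3, 3, 9],
    [4, 4, 5, 7],
    [6, 5, 7, 4],
    [8, 8, 9, 2],
]
correlations = item_total_correlations(data)
for number, r in enumerate(correlations, start=1):
    print(f"item {number}: r = {r:.2f}")
```

In this example item 4 correlates negatively with the total, so it would be a candidate for removal, or for a check on whether it needs reverse scoring.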
Exercise
200 students filled out a questionnaire assessing their level of Nescafe addiction.
The questionnaire is still under development, and this was a first pilot.
Items can be scored from 0 to 10.
Please investigate the dataset, and perform a psychometric analysis.
Specifically, perform the following analyses:
1) Are there any outliers? Can you exclude them from further analysis? If so, please do exclude them.
2) Report the mean score of the scale (what did participants score on average?) and the standard deviation.
3) Do you think it is a reliable test? Why?
4) Would you keep all the items, or would you exclude items from the test?
5) Assume that people scoring above the 90th percentile usually present with Nescafe addiction. Should we worry about participant “40”?
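For the last question, one possible approach is to compare a participant's total score with the empirical 90th percentile of the sample. The sketch below shows only the mechanics on hypothetical scores; the function name is illustrative, and the exercise dataset itself is not reproduced here.

```python
from statistics import quantiles

def above_90th_percentile(scores, score):
    """Return (is_above, cutoff) for a score against the sample's 90th percentile."""
    # quantiles(..., n=10) returns the nine decile cut points; the last is the 90th percentile
    cutoff = quantiles(scores, n=10)[-1]
    return score > cutoff, cutoff

# Hypothetical total scores (assumption, not the questionnaire data)
scores = list(range(1, 101))
flagged, cutoff = above_90th_percentile(scores, 95)
print(f"cutoff = {cutoff}, above 90th percentile: {flagged}")
```

If the scores are approximately normal, the same question can also be answered from the z-score of the participant, using the mean and standard deviation of the scale.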