Data Preparation
Data Preparation Steps
- Learning Outcomes:
- Describe steps for preparing a data file for analysis.
- Generate descriptives to identify data entry errors and correct them by recoding variables.
- Conduct missing value analysis in Jamovi to identify the extent and pattern of missing data (MAR, MCAR, MNAR).
- Explain strengths and weaknesses of different approaches to handling missing data.
- Reverse code items, calculate scale scores, and assess internal consistency in Jamovi.
Introduction to Data Preparation
- Data errors can arise due to:
- Mistyping during manual entry (e.g., entering 55 instead of 5).
- Participants misinterpreting questions (e.g., entering height in meters instead of centimeters).
- Software coding responses on the wrong scale (e.g., recording 0-4 instead of 1-5).
Strategies for Checking Data Accuracy
- Proofread the data:
- Review the data file to identify any values that seem out of place.
- Effective for small datasets but challenging for large ones.
- Check descriptive statistics:
- Generate means, standard deviations, minimum, and maximum values.
- Identify implausible values, such as a minimum or maximum that falls outside the variable's valid range.
- Examine histograms and boxplots:
- Quickly spot scores outside the expected range.
- Boxplots in Jamovi flag outlying cases, which helps locate the participants whose scores need checking.
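The descriptive-statistics check can also be sketched outside Jamovi. The example below is a minimal illustration using only Python's standard library, not a reproduction of Jamovi's output. The scores, the valid range of 1 to 5, and the helper names `describe` and `out_of_range_rows` are all assumptions made for illustration.

```python
from statistics import mean, stdev

# Hypothetical responses to a 1-5 item; the 55 mimics a doubled keystroke.
scores = [3, 4, 5, 2, 55, 1, 4, 3]
VALID_MIN, VALID_MAX = 1, 5  # assumed valid range for this item


def describe(values):
    """Return the descriptives used to screen a variable."""
    return {
        "n": len(values),
        "mean": mean(values),
        "sd": stdev(values),
        "min": min(values),
        "max": max(values),
    }


def out_of_range_rows(values, low, high):
    """Return 1-based row numbers whose value falls outside [low, high]."""
    return [row for row, v in enumerate(values, start=1) if not low <= v <= high]


summary = describe(scores)
flagged = out_of_range_rows(scores, VALID_MIN, VALID_MAX)

print(summary)
print("Rows to check:", flagged)
```

Here the maximum of 55 exceeds the assumed upper limit of 5, so the entry in row 5 is flagged for review. The inflated mean is a second hint that something is wrong with the variable.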
Histograms and Boxplots
- Histograms:
- Show the distribution of scores: tall, clustered bars indicate common values, while outliers appear as isolated bars far from the rest.
- May require manual searching to identify specific participants with outlier scores.
- Boxplots:
- Visually represent the distribution, central tendency, and outliers.
- Jamovi boxplots often indicate the row number of outlier participants directly.
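To show how a boxplot can point to specific rows, the sketch below applies the common 1.5 × IQR rule with Python's standard library. It makes several assumptions: the height data are hypothetical, the function name `boxplot_outliers` is invented for illustration, and the quartile method (`"inclusive"`) may not match the one Jamovi uses, so borderline cases could be flagged differently.

```python
from statistics import quantiles

# Hypothetical heights in cm; row 4 was entered in meters by mistake.
heights_cm = [165, 172, 158, 1.80, 170, 168, 175, 162, 169]


def boxplot_outliers(values, k=1.5):
    """Return (row, value) pairs beyond k * IQR from the quartiles.

    Rows are 1-based to match spreadsheet-style row numbers.
    """
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [
        (row, v)
        for row, v in enumerate(values, start=1)
        if v < low or v > high
    ]


outliers = boxplot_outliers(heights_cm)
print(outliers)
```

With these values the quartiles are 162 and 170, so the fences sit at 150 and 182, and only the entry of 1.80 in row 4 falls outside them. Returning the row number alongside the value mirrors the way a labelled boxplot saves a manual search through the data file.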
Addressing Errors
- Minor Errors
- If the correct score can be estimated with high confidence, enter the corrected value.