Data Preparation

Data Preparation Steps

  • Learning Outcomes:
    • Describe steps for preparing a data file for analysis.
    • Generate descriptives to identify data entry errors and correct them by recoding variables.
    • Conduct missing value analysis in Jamovi to identify the extent and pattern of missing data (MAR, MCAR, MNAR).
    • Explain strengths and weaknesses of different approaches to handling missing data.
    • Reverse code items, calculate scale scores, and assess internal consistency in Jamovi.

Introduction to Data Preparation

  • Data errors can arise due to:
    • Mistyping during manual entry (e.g., entering 55 instead of 5).
    • Participants misinterpreting questions (e.g., entering height in meters instead of cm).
    • Software coding responses incorrectly (e.g., storing a 1-5 scale as 0-4).

Strategies for Checking Data Accuracy

  • Proofread the data:
    • Review the data file to identify any values that seem out of place.
    • Effective for small datasets but challenging for large ones.
  • Check descriptive statistics:
    • Generate means, standard deviations, minimum, and maximum values.
    • Flag values that fall outside the variable's plausible range (e.g., a 55 on a 1-5 scale).
  • Examine histograms and boxplots:
    • Quickly spot scores outside the expected range.
    • Boxplots in Jamovi flag outlying cases, which helps locate the participants whose scores need checking.
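
The descriptive-statistics check above can be sketched outside Jamovi as well. The following is a minimal Python illustration, not Jamovi's own procedure: the scores, the 1-5 valid range, and the helper names `describe` and `out_of_range_rows` are all hypothetical and chosen only to mirror the "55 instead of 5" example.

```python
import statistics

# Hypothetical responses to a 1-5 item; the 55 mimics a typing error.
scores = [4, 5, 3, 55, 2, 4, 1, 5]
VALID_MIN, VALID_MAX = 1, 5  # assumed valid range for this item


def describe(values):
    """Return the descriptives used to screen a variable."""
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def out_of_range_rows(values, low, high):
    """Return 1-based row numbers whose value falls outside [low, high]."""
    return [row for row, v in enumerate(values, start=1) if not low <= v <= high]


summary = describe(scores)
flagged = out_of_range_rows(scores, VALID_MIN, VALID_MAX)

print(summary)
print("Rows to check:", flagged)
```

A maximum far above the scale's upper limit, together with an inflated mean, signals that at least one entry needs checking; the row list then points to the specific case.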

Histograms and Boxplots

  • Histograms:
    • Show the distribution of scores: tall, clustered bars mark common values, while outliers appear as isolated bars far from the rest.
    • May require manual searching to identify specific participants with outlier scores.
  • Boxplots:
    • Visually represent the distribution, central tendency, and outliers.
    • Jamovi boxplots often indicate the row number of outlier participants directly.
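
To make the idea of a boxplot outlier concrete, here is a minimal Python sketch that flags cases by row number. It assumes the common 1.5 x IQR convention for boxplot whiskers, which may not match Jamovi's settings exactly; the height data and the function name `boxplot_outlier_rows` are hypothetical, with one height entered in meters instead of cm.

```python
import statistics

# Hypothetical heights in cm; row 4 was entered in meters by mistake.
heights = [165, 170, 158, 1.72, 180, 175, 162, 168]


def boxplot_outlier_rows(values, k=1.5):
    """Return 1-based row numbers lying more than k * IQR beyond the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [
        row
        for row, v in enumerate(values, start=1)
        if v < lower_fence or v > upper_fence
    ]


outliers = boxplot_outlier_rows(heights)
print("Outlier rows:", outliers)
```

Reporting row numbers rather than values mirrors how a labeled boxplot lets you go straight to the participant's record.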

Addressing Errors

  • Minor Errors
    • If the correct score can be estimated with high confidence, enter the corrected value.
      • Example: Changing