PY2501 Week 6: Data Screening and Normalising Techniques

Introduction

  • Dr. Ed Walford

  • Course: PY2501 Week 6: Data Screening and Normalising Techniques

  • Quote: "Every man takes the limits of his own field of vision for the limits of the world." – Arthur Schopenhauer

Learning Outcomes

  • Understand and perform statistical analyses.

  • Report methods and results of psychology research studies clearly.

Validity of Data

  • Assumptions about data validity:

    • Participants have read and thought about every survey item.

    • Responses to experimental stimuli are appropriate with few errors.

  • Indicators of validity:

    • Completion time of surveys: Measure how long participants took.

    • Accuracy of responses: Evaluate the reliability of responses.

Survey Completion Time

  • Completion time for a survey can indicate validity:

    • Range from 40 seconds to over 4.5 hours.

    • Median time: 154 seconds (just over 2.5 minutes).

  • Example: an observed individual completion time of 1 minute 20 seconds was judged consistent with reasonable attention to the items.

  • Recommendation: Exclude participant data if they completed the survey in less than 1 minute.
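
A minimal Python sketch of this exclusion rule, assuming the survey data sit in a pandas DataFrame; the column names (e.g. completion_seconds) and values are illustrative assumptions, not from the lecture materials:

```python
import pandas as pd

# Hypothetical survey completion times in seconds (values are invented)
df = pd.DataFrame({
    "participant": [1, 2, 3, 4, 5],
    "completion_seconds": [40, 80, 154, 300, 16200],
})

# Keep only participants who spent at least 60 seconds (1 minute) on the survey
valid = df[df["completion_seconds"] >= 60]
print(valid)
```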

Managing Outliers and Data Exclusion

  • Outlier identification by using Interquartile Range (IQR):

    • Multiply the IQR by 1.5, then subtract the result from the first quartile and add it to the third quartile to obtain the lower and upper cutoffs for exclusion (a minimal sketch follows this list).

    • For example, exclude data below 60 seconds and above 295 seconds based on the calculated outlier thresholds.

  • Tools for filtering data in software (e.g. jamovi).

  • A large sample size is important so that justified exclusions still leave enough data for analysis.
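
A sketch of the 1.5 x IQR fences in Python, using invented completion times; the 60-second and 295-second cutoffs quoted above come from the lecture's own data, so the fences printed here will differ:

```python
import pandas as pd

# Hypothetical completion times in seconds
times = pd.Series([40, 75, 110, 140, 154, 170, 200, 260, 400, 16200])

q1, q3 = times.quantile(0.25), times.quantile(0.75)
iqr = q3 - q1

# 1.5 x IQR fences: values outside these bounds are flagged as outliers
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

kept = times[(times >= lower) & (times <= upper)]
print(f"Lower fence: {lower:.0f}s, upper fence: {upper:.0f}s")
print(kept)
```

The same filtering can be done point-and-click in jamovi; the code simply makes the arithmetic behind the cutoffs explicit.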

Response Patterns

  • Significant issue: Identifying non-engagement through response patterns:

    • Pressing the same key for every item ("straight-lining") may indicate lack of effort.

    • Check for unusually short response times or low accuracy scores in experimental tasks.

    • Recommendation: remove such participants' data to protect validity (see the sketch below).
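
One simple way to flag "same key pressed throughout" is to look for zero variability across a participant's responses. The sketch below assumes hypothetical Likert-type item columns; the names and values are illustrative:

```python
import pandas as pd

# Hypothetical Likert responses: rows = participants, columns = survey items
responses = pd.DataFrame({
    "item1": [4, 3, 5, 3],
    "item2": [4, 2, 1, 3],
    "item3": [4, 5, 2, 3],
    "item4": [4, 1, 4, 3],
})

# Zero spread across items suggests the same key was pressed for every item
item_sd = responses.std(axis=1)
straight_liners = responses.index[item_sd == 0]
print("Flagged for possible non-engagement:", list(straight_liners))
```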

Assessing Validity with Timing and Accuracy

  • Validity can be assessed using completion time and response accuracy as criteria.

    • For instance, responses taking excessively short times (under 200ms) may not be valid.

  • Statistical criteria such as 1.5 x IQR can guide exclusions.

  • In experiments, identify participants showing chance-level responses and consider excluding them to enhance data quality.
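
For chance-level responding in a two-alternative task, one possible check is a one-tailed binomial test against chance (assumed here to be 0.5); the numbers below are invented for illustration:

```python
from scipy.stats import binomtest

# Hypothetical participant: 48 correct out of 100 two-alternative trials
result = binomtest(k=48, n=100, p=0.5, alternative="greater")

# If accuracy is not credibly above chance, consider excluding the participant
print(f"One-tailed p = {result.pvalue:.3f}")
print("Above chance" if result.pvalue < 0.05 else "Not above chance - consider excluding")
```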

Reaction Times in Experimental Studies

  • Suggested threshold: reaction times under 200 ms are too fast to reflect genuine stimulus processing and can be treated as invalid.

  • Outliers can then be defined with the 1.5 x IQR rule, applied separately to congruent and incongruent reaction times (see the sketch below).
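
Combining the two rules, a sketch that first drops anticipations below 200 ms and then applies the 1.5 x IQR rule within each congruency condition; column names such as rt_ms are assumptions and the values are invented:

```python
import pandas as pd

# Hypothetical trial-level reaction times (ms) with a congruency condition
trials = pd.DataFrame({
    "condition": ["congruent"] * 5 + ["incongruent"] * 5,
    "rt_ms": [150, 420, 460, 510, 2500, 180, 560, 600, 650, 3000],
})

# Step 1: drop anticipations (responses faster than 200 ms)
trials = trials[trials["rt_ms"] >= 200]

# Step 2: keep values inside the 1.5 x IQR fences, separately per condition
def inside_fences(rt):
    q1, q3 = rt.quantile(0.25), rt.quantile(0.75)
    iqr = q3 - q1
    return rt.between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

keep = trials.groupby("condition")["rt_ms"].transform(inside_fences)
cleaned = trials[keep]
print(cleaned)
```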

Data Distribution and Transformation

  • Normalising transformations may be considered if distribution is non-normal:

    • Apply a mathematical function to every score to change the shape of the distribution.

    • Keep participant ranks intact, preserving the order of scores.

  • Different methods of transformation:

    • For positively skewed data: square root or logarithmic transformations.

    • For negatively skewed data: power transformations (e.g., squaring the scores); an illustration follows this list.
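
An illustration of how these transformations change skewness while preserving the rank order of scores, using scipy's skew statistic on invented data:

```python
import numpy as np
from scipy.stats import skew

# Hypothetical positively skewed scores (e.g., reaction times in ms)
x = np.array([320, 350, 380, 400, 450, 500, 620, 900, 1500, 2400], dtype=float)

sqrt_x = np.sqrt(x)   # square root: reduces mild positive skew
log_x = np.log(x)     # logarithm: reduces stronger positive skew (requires x > 0)

# For negative skew, one common option is a power transformation, e.g. squaring
neg = np.array([2, 5, 7, 8, 8, 9, 9, 10, 10, 10], dtype=float)
squared = neg ** 2

for name, data in [("raw", x), ("sqrt", sqrt_x), ("log", log_x),
                   ("neg raw", neg), ("neg squared", squared)]:
    print(f"{name:12s} skew = {skew(data):+.2f}")
```

Because each of these functions is monotonic on the data, the order of participants' scores is unchanged; only the spacing between scores changes.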

Box-Cox Transformations

  • Often more effective at normalising data than the simpler transformations above.

    • Relies on identifying an optimal lambda (λ) value: each score x is replaced by (x^λ - 1) / λ when λ ≠ 0, and by log(x) when λ = 0.

    • The transformed values may no longer be on a meaningful scale (they no longer look like ages or reaction times), so interpret them with care.

  • Tools available for lambda determination (e.g., online Box-Cox tools).
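
Besides online tools, scipy's stats.boxcox estimates the optimal lambda and returns the transformed scores in one call; the data below are invented reaction times (Box-Cox requires strictly positive values):

```python
import numpy as np
from scipy import stats

# Hypothetical positively skewed reaction times in ms
rt = np.array([320, 350, 380, 400, 450, 500, 620, 900, 1500, 2400], dtype=float)

# boxcox returns the transformed data and the lambda it estimated
transformed, fitted_lambda = stats.boxcox(rt)

print(f"Optimal lambda: {fitted_lambda:.2f}")
print("Transformed values (no longer look like reaction times):",
      np.round(transformed, 2))
```

Reporting the fitted lambda alongside the results helps readers see exactly how the data were rescaled.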

Best Practices in Data Analysis

  • Consistency is key in approaches to screening and transforming your data; ensure clear documentation.

  • It is essential to justify all decisions about exclusions and transformations when writing up study results.

  • Suggested reading for more detailed understanding: Coolican (2019), Research Methods and Statistics in Psychology.