Dr. Ed Walford
Course: PY2501 Week 6: Data Screening and Normalising Techniques
Quote: "Every man takes the limits of his own field of vision for the limits of the world." – Arthur Schopenhauer
Understand and perform statistical analyses.
Report methods and results of psychology research studies clearly.
Assumptions about data validity:
Participants have read and thought about every survey item.
Responses to experimental stimuli are appropriate with few errors.
Indicators of validity:
Completion time of surveys: Measure how long participants took.
Accuracy of responses: Check whether responses are accurate enough to indicate genuine engagement with the task.
Completion time for a survey can indicate validity:
Observed completion times ranged from 40 seconds to over 4.5 hours.
Median time: 154 seconds (just over 2.5 minutes).
Example: One individual paying reasonable attention completed the survey in 1 minute 20 seconds.
Recommendation: Exclude participant data if they completed the survey in less than 1 minute.
Outlier identification using the interquartile range (IQR):
Multiply the IQR by 1.5, then set the cutoff points for exclusion at that distance below the first quartile and above the third quartile.
For example, exclude data below 60 seconds and above 295 seconds based on the calculated outlier thresholds.
Statistical software (e.g. jamovi) provides filters for applying these exclusions.
A large sample size is important so that justified exclusions still leave enough data for analysis.
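The completion-time screening described above (a fixed minimum time combined with 1.5 x IQR cutoffs) could be sketched in Python roughly as follows. This is an illustration only: the completion times are invented, the helper names iqr_cutoffs and screen_times are hypothetical, and the quartiles come from the standard library's "inclusive" method, so other software may report slightly different cutoffs.

```python
import statistics

def iqr_cutoffs(values, k=1.5):
    """Return (lower, upper) cutoffs: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def screen_times(times, floor=60, k=1.5):
    """Keep times inside the IQR cutoffs, never going below a fixed floor.

    floor=60 mirrors the 'under 1 minute' recommendation; it is an
    assumption about how the two rules might be combined.
    """
    lower, upper = iqr_cutoffs(times, k)
    lower = max(lower, floor)
    kept = [t for t in times if lower <= t <= upper]
    return kept, (lower, upper)

# Invented completion times in seconds (not the lecture's dataset).
times = [40, 80, 120, 140, 150, 154, 160, 180, 200, 240, 16200]
kept, bounds = screen_times(times)
print("cutoffs:", bounds)
print("kept", len(kept), "of", len(times), "participants")
```

Reporting both the cutoffs and the number of participants removed makes the exclusion easy to document in a methods section.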
Significant issue: Identifying non-engagement through response patterns:
Consistent key presses across items may indicate lack of effort.
In experimental setups, check for unusually short response times or low accuracy scores.
Recommendation: Remove data from such participants to protect validity.
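One way to look for repeated key presses is to measure the longest run of identical consecutive responses for each participant. The sketch below assumes responses are stored as a list per participant; the function names, the invented data, and the 0.8 threshold are all hypothetical choices for illustration, not values given in the lecture.

```python
def longest_run(responses):
    """Length of the longest run of identical consecutive responses."""
    if not responses:
        return 0
    longest = current = 1
    for prev, cur in zip(responses, responses[1:]):
        current = current + 1 if cur == prev else 1
        longest = max(longest, current)
    return longest

def flag_straightliners(data, max_run_fraction=0.8):
    """Return IDs whose longest identical run covers at least
    max_run_fraction of their items (threshold is an assumption)."""
    flagged = []
    for pid, responses in data.items():
        if responses and longest_run(responses) / len(responses) >= max_run_fraction:
            flagged.append(pid)
    return flagged

# Invented 10-item Likert responses for three participants.
survey = {
    "p01": [3, 4, 2, 5, 3, 4, 1, 2, 4, 3],
    "p02": [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
    "p03": [4, 4, 4, 4, 4, 4, 4, 4, 4, 2],
}
flagged = flag_straightliners(survey)
print("flagged for review:", flagged)
```

A flag of this kind is better treated as a prompt to inspect the participant's data than as an automatic exclusion, since some genuine response patterns are uniform.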
Validity can be assessed using completion time and response accuracy as criteria.
For instance, reaction times under 200 ms are implausibly fast and may not be valid responses.
Use of statistical criteria like 1.5 times IQR can guide exclusions.
In experiments, identify participants showing chance-level responses and consider excluding them to enhance data quality.
Suggested reaction time threshold: responses faster than 200 ms are too quick to reflect genuine processing and should be treated as invalid.
Outliers are then defined with the 1.5 x IQR rule, applied to the congruent and incongruent reaction times.
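The two experimental checks above (dropping implausibly fast trials and spotting chance-level responders) might be sketched as below. The trial format, the function names, and the data are hypothetical. Using a one-sided binomial test to decide what counts as "chance level" is an assumption of this sketch, as are the 0.5 chance rate (a two-choice task) and the 0.05 alpha; the lecture does not specify a method.

```python
from math import comb

def drop_fast_trials(trials, min_rt_ms=200):
    """Remove trials faster than the minimum plausible reaction time."""
    return [t for t in trials if t["rt_ms"] >= min_rt_ms]

def binomial_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def at_chance(trials, chance=0.5, alpha=0.05):
    """True if accuracy is not reliably above chance (one-sided test)."""
    n = len(trials)
    k = sum(t["correct"] for t in trials)
    return binomial_tail(k, n, chance) > alpha

# Invented data: 20 trials each, 18 correct versus 11 correct.
attentive = [{"rt_ms": 450, "correct": i < 18} for i in range(20)]
guesser = [{"rt_ms": 450, "correct": i < 11} for i in range(20)]
with_anticipation = [{"rt_ms": 150, "correct": True}] + attentive

print(len(drop_fast_trials(with_anticipation)), "trials kept after RT screening")
print("attentive at chance?", at_chance(attentive))
print("guesser at chance?", at_chance(guesser))
```

The 1.5 x IQR rule can then be applied to the remaining reaction times in the same way as for survey completion times.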
Normalising transformations may be considered if distribution is non-normal:
Apply the same mathematical operation to every score to change the shape of the data distribution.
Keep participant ranks intact, preserving the order of scores.
Different methods of transformation:
For positively skewed data: square root or logarithmic transformations.
For negatively skewed data: power transformations (e.g. squaring the scores).
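The effect of these transformations can be checked with a small sketch. The data below are invented, the helper names skewness and rank_order are hypothetical, and skewness is computed as the simple moment-based coefficient, which may differ from the adjusted value some packages report. Squaring preserves rank order only when all scores are non-negative.

```python
import math

def skewness(xs):
    """Moment-based skewness: third central moment / variance^1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

def rank_order(xs):
    """Indices of the scores from lowest to highest."""
    return sorted(range(len(xs)), key=lambda i: xs[i])

# Invented positively skewed reaction times (ms).
rts = [210, 250, 260, 300, 320, 400, 650, 1200]
sqrt_rts = [math.sqrt(x) for x in rts]
log_rts = [math.log10(x) for x in rts]

# Invented negatively skewed scores (all non-negative).
scores = [2, 6, 7, 8, 8.5, 9, 9.5, 10]
squared = [x ** 2 for x in scores]

for label, data in [("raw", rts), ("sqrt", sqrt_rts), ("log10", log_rts)]:
    print(f"{label:>6} skewness: {skewness(data):.2f}")
print(f"scores skewness: {skewness(scores):.2f}, squared: {skewness(squared):.2f}")
print("ranks preserved:", rank_order(log_rts) == rank_order(rts))
```

In this example the logarithm pulls in the long right tail more strongly than the square root does, while every participant keeps the same rank.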
Box-Cox transformation: an efficient method for normalising data compared to simpler transformations.
It relies on identifying an optimal lambda value, the power to which the scores are raised.
Transformed values may no longer resemble the original measurements (e.g. ages or reaction times).
Tools available for lambda determination (e.g., online Box-Cox tools).
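To show what such a tool does, the sketch below applies the standard Box-Cox formula, (x^lambda - 1) / lambda with ln(x) when lambda is 0, and picks lambda by a simple grid search over the profile log-likelihood. The function names, the grid from -2 to 2 in steps of 0.1, and the data are assumptions for illustration; dedicated tools typically use a finer numerical optimiser. The transformation requires strictly positive scores.

```python
import math

def boxcox(xs, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if lam == 0:
        return [math.log(x) for x in xs]
    return [(x ** lam - 1) / lam for x in xs]

def log_likelihood(xs, lam):
    """Profile log-likelihood of lambda under a normal model."""
    ys = boxcox(xs, lam)
    n = len(ys)
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n
    return -n / 2 * math.log(var) + (lam - 1) * sum(math.log(x) for x in xs)

def best_lambda(xs, grid=None):
    """Grid-search the lambda with the highest log-likelihood."""
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]  # -2.0 to 2.0 (assumed range)
    return max(grid, key=lambda lam: log_likelihood(xs, lam))

# Invented positively skewed reaction times (ms).
rts = [210, 250, 260, 300, 320, 400, 650, 1200]
lam = best_lambda(rts)
transformed = boxcox(rts, lam)
print("chosen lambda:", lam)
print("transformed values:", [round(y, 4) for y in transformed])
```

The printed values illustrate the point about interpretability: the transformed scores are no longer in milliseconds, so descriptive statistics are usually still reported in the original units.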
Consistency is key in approaches to screening and transforming your data; ensure clear documentation.
It's essential to justify every exclusion and transformation decision when writing up study results.
Suggested reading for more detailed understanding: Coolican (2019) on experimental methods and research statistics.