Tukey Test and SNK Test

A post-hoc test used after ANOVA when the ANOVA result is significant.
Appropriate when there are four or more groups.
Mnemonic: SNK (Student-Newman-Keuls) is for 3 groups; Tukey is for 4 or more groups.
The test identifies which groups are significantly different from each other.

The critical difference (CD) is how far apart two group means must be for the difference to be statistically significant (at the 0.05 level).
It is calculated from:
- A Q value obtained from statistical tables or software, based on:
  - Alpha level (typically 0.05)
  - The number of means being compared (R)
  - Degrees of freedom error
- Mean square error
- Sample size
Formula: $CD = Q \times \sqrt{\frac{MSE}{n}}$
- Where:
  - $Q$ is the value from the table of Tukey's (studentized) range distribution.
  - $MSE$ is the mean square error from the ANOVA.
  - $n$ is the sample size per group.
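
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `critical_difference` is an assumption, and the sample call uses the Q, MSE, and n values from the worked example in these notes.

```python
import math


def critical_difference(q: float, mse: float, n: int) -> float:
    """Return the Tukey critical difference, CD = Q * sqrt(MSE / n).

    q   : tabled Q value for the chosen alpha, number of means, and df error
    mse : mean square error from the ANOVA
    n   : sample size per group (equal group sizes assumed)
    """
    if mse < 0 or n <= 0:
        raise ValueError("mse must be non-negative and n must be positive")
    return q * math.sqrt(mse / n)


# Values taken from the worked example in these notes.
cd = critical_difference(q=4.02, mse=9.67, n=10)
print(round(cd, 2))  # 3.95
```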

Degrees of freedom error = 45 (e.g., 10 participants in each of 5 groups: N - k = 50 - 5 = 45).
Using a table of Tukey's range distribution (e.g., Hinkle et al.), find the Q value for 5 means.
- For alpha = 0.05 and degrees of freedom error = 40, Q = 4.04.
- For alpha = 0.05 and degrees of freedom error = 60, Q = 3.98.
- Interpolating between these two entries for degrees of freedom error = 45, Q ≈ 4.02.
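
The approximation step can be written as a straight-line interpolation between the two table entries. The helper below is a sketch under that assumption (the notes do not say which approximation method was used), and the name `interpolate_q` is hypothetical.

```python
def interpolate_q(df: float, df_low: float, q_low: float,
                  df_high: float, q_high: float) -> float:
    """Linearly interpolate a tabled Q value between two df rows.

    Assumes simple linear interpolation; other schemes exist and give
    slightly different values.
    """
    if not df_low <= df <= df_high:
        raise ValueError("df must lie between df_low and df_high")
    fraction = (df - df_low) / (df_high - df_low)
    return q_low + fraction * (q_high - q_low)


# Table entries quoted in these notes: Q = 4.04 at df = 40, Q = 3.98 at df = 60.
q_45 = interpolate_q(45, 40, 4.04, 60, 3.98)
print(round(q_45, 3))  # 4.025
```

Linear interpolation gives about 4.025, in line with the value of roughly 4.02 used in the notes.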
Given mean square error = 9.67 and n = 10, calculate the critical difference:
$CD = 4.02 \times \sqrt{\frac{9.67}{10}} \approx 4.02 \times 0.983 \approx 3.95$
Interpretation: If the difference between any two group means is greater than 3.95, the difference is significant at the 0.05 level.

If the difference between the means of two groups exceeds the critical difference (e.g., 3.95), then the groups are considered significantly different at the 0.05 level.
Example: If the means of the instructional self-talk (IST) and imagery groups differ by more than 3.95, the two groups are significantly different.
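
Comparing every pair of means against the critical difference can be sketched as follows. The group labels and means here are hypothetical, invented purely for illustration; only the critical difference of 3.95 comes from these notes. The function name `significant_pairs` is likewise an assumption.

```python
from itertools import combinations

CRITICAL_DIFFERENCE = 3.95  # from the worked example in these notes

# HYPOTHETICAL group means, made up for illustration only.
group_means = {"A": 20.0, "B": 25.0, "C": 22.0}


def significant_pairs(means: dict, cd: float) -> dict:
    """Map each pair of groups to True if their means differ by more than cd."""
    results = {}
    for (name_a, mean_a), (name_b, mean_b) in combinations(means.items(), 2):
        results[(name_a, name_b)] = abs(mean_a - mean_b) > cd
    return results


for pair, is_significant in significant_pairs(group_means, CRITICAL_DIFFERENCE).items():
    print(pair, "significant" if is_significant else "not significant")
```

With these made-up means, only the A and B pair (difference of 5.0) exceeds the critical difference.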

When examining figures presenting group means and confidence intervals, check for overlap between the confidence intervals.
As a rule of thumb, if the overlap is more than about 25% of the interval length, the difference is not statistically significant at the 0.05 level.
If there is no overlap, there is a significant difference.
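
The overlap check can be expressed as a small helper. This is only a sketch of the eyeball rule described above, with assumptions: when the two intervals differ in length, the overlap is measured against their average length, and overlaps between 0% and 25% are reported as "borderline" because the notes do not give a verdict for that range. The function names are hypothetical.

```python
def overlap_fraction(ci_a: tuple, ci_b: tuple) -> float:
    """Overlap of two intervals as a fraction of their average length.

    Using the average length is an assumption made for this sketch.
    """
    low_a, high_a = ci_a
    low_b, high_b = ci_b
    overlap = max(0.0, min(high_a, high_b) - max(low_a, low_b))
    mean_length = ((high_a - low_a) + (high_b - low_b)) / 2
    if mean_length <= 0:
        raise ValueError("intervals must have positive length")
    return overlap / mean_length


def eyeball_verdict(ci_a: tuple, ci_b: tuple) -> str:
    """Apply the rule of thumb from the notes to two confidence intervals."""
    fraction = overlap_fraction(ci_a, ci_b)
    if fraction == 0:
        return "significant"
    if fraction > 0.25:
        return "not significant"
    return "borderline"  # the notes give no verdict for small overlaps


# Hypothetical intervals, for illustration only.
print(eyeball_verdict((10, 14), (15, 19)))  # significant
print(eyeball_verdict((10, 14), (12, 16)))  # not significant
```

A visual check like this is only a quick screen; the critical difference or software output remains the actual test.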

Software output indicates which group means differ significantly.
Example: Motivational self-talk and instructional self-talk may not be significantly different from each other, even if one mean is slightly higher than the other. When the difference is smaller than the critical difference, the null hypothesis cannot be rejected: the observed difference could be due to experimental error or within-group error rather than the treatment.
Conclusion: Instructional self-talk may not be as effective as unrelated self-talk, imagery, or modeling.

Report whether the post-hoc test (e.g., Tukey) revealed significant differences at the 0.05 level.
Unlike t-tests, specific p-values (e.g., 0.031) are not typically reported for post-hoc tests; only whether the difference is significant or not at the 0.05 level.
The Tukey test uses a different formula from a series of pairwise t-tests so that running many comparisons does not inflate the Type I error rate.
The Tukey test should only be used if the ANOVA showed a significant overall finding.
Example Write-up: "Post-hoc analysis using the Tukey test revealed a significant difference between groups A and B at the 0.05 level."