Study Notes on Post Hoc Tests

Overview of Post Hoc Tests

  • Post hoc tests are used after an initial analysis (like ANOVA) to determine which specific means are different.
  • When conducting tests, one must balance the power of the test with the risk of type one errors (false positives).

Power of Tests

  • Power: The probability of correctly rejecting the null hypothesis when it is false; i.e., finding a difference if one exists.
  • A more conservative test has less power because it demands stronger evidence (a larger difference between means) before declaring a difference significant.
  • The trade-off: as a test becomes less conservative, it gains power but also carries a higher risk of type one errors.

Fisher's Least Significant Difference (LSD)

  • Fisher's LSD: Also called Fisher's protected t-test, where LSD stands for Least Significant Difference.
  • Works like the multiple t-tests used for a priori comparisons (those planned before data collection), except that it is run after the ANOVA.
  • Useful for comparing specific means, for example peer mentors versus grad students.
  • Involves running multiple t-tests post hoc after an ANOVA to investigate all mean comparisons (e.g., peer vs. AI, peer vs. grad, AI vs. grad).
  • If more than three comparisons are made, this test may not be appropriate because the type one error rate can become too high.
  • Type one error refers to the erroneous rejection of the null hypothesis (concluding a difference when there is none).
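
The pairwise comparisons described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure from the lecture or from SPSS: the scores are invented, the function names (pooled_ms_error, fisher_lsd) are hypothetical, and the critical t of 2.179 is the two-tailed table value for alpha = .05 with 12 error degrees of freedom. The sketch also assumes the overall ANOVA has already come out significant, which is what makes the t-tests "protected".

```python
import math
from itertools import combinations

# Hypothetical scores, invented purely for illustration.
groups = {
    "peer": [4, 5, 6, 5, 5],
    "AI": [7, 8, 6, 7, 7],
    "grad": [5, 6, 6, 5, 8],
}

# Two-tailed critical t for alpha = .05 and df = 12, read from a t-table.
T_CRIT = 2.179


def pooled_ms_error(groups):
    """Within-groups mean square (MS error) from a one-way ANOVA."""
    ss_within = 0.0
    for scores in groups.values():
        mean = sum(scores) / len(scores)
        ss_within += sum((x - mean) ** 2 for x in scores)
    df_within = sum(len(s) for s in groups.values()) - len(groups)
    return ss_within / df_within


def fisher_lsd(groups, t_crit):
    """Return {(a, b): (abs mean difference, LSD, significant?)} per pair."""
    ms_error = pooled_ms_error(groups)
    means = {name: sum(s) / len(s) for name, s in groups.items()}
    results = {}
    for a, b in combinations(groups, 2):
        n_a, n_b = len(groups[a]), len(groups[b])
        # Smallest mean difference that counts as significant for this pair.
        lsd = t_crit * math.sqrt(ms_error * (1 / n_a + 1 / n_b))
        diff = abs(means[a] - means[b])
        results[(a, b)] = (diff, lsd, diff > lsd)
    return results


lsd_results = fisher_lsd(groups, T_CRIT)
for (a, b), (diff, lsd, significant) in lsd_results.items():
    print(f"{a} vs. {b}: diff={diff:.2f}, LSD={lsd:.2f}, significant={significant}")
```

With these made-up scores the LSD is about 1.26, so only peer vs. AI (a difference of 2.0) counts as significant. No correction is applied for running three tests, which is why the method is considered liberal.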

Tukey's Honestly Significant Difference (HSD)

  • Tukey’s HSD: Stands for Honestly Significant Difference. It uses a specific q-value table for calculations.
  • The essence of Tukey's method is that the critical value is looked up in the q-table using the number of groups and the error (within-groups) degrees of freedom.
  • Range (k): Represents the number of groups involved in the study. For three groups, k=3.
  • Tukey’s HSD provides a more conservative approach when compared to Fisher’s LSD, reducing the risk of type one errors while keeping power at a reasonable level.
  • Designed for pairwise comparisons and works best when group sizes are equal, since the standard formula assumes the same n in every group.
  • The observed q value is then compared against the critical value from the q-table; the difference is significant if the observed q exceeds the critical q.
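
As a rough sketch of the comparison described above, the observed q for each pair can be computed as the mean difference divided by sqrt(MS error / n) and checked against the table value. Everything specific here is an assumption for illustration: the scores are invented, the function names are hypothetical, and 3.77 is the q-table value for k = 3 groups, 12 error degrees of freedom, and alpha = .05. The sketch only handles equal group sizes.

```python
import math
from itertools import combinations

# Hypothetical scores, invented purely for illustration (equal n per group).
groups = {
    "peer": [4, 5, 6, 5, 5],
    "AI": [7, 8, 6, 7, 7],
    "grad": [5, 6, 6, 5, 8],
}

# Critical q for k = 3, df = 12, alpha = .05, read from a q-table.
Q_CRIT = 3.77


def pooled_ms_error(groups):
    """Within-groups mean square (MS error) from a one-way ANOVA."""
    ss_within = 0.0
    for scores in groups.values():
        mean = sum(scores) / len(scores)
        ss_within += sum((x - mean) ** 2 for x in scores)
    df_within = sum(len(s) for s in groups.values()) - len(groups)
    return ss_within / df_within


def tukey_q(groups, q_crit):
    """Return {(a, b): (observed q, significant?)} for every pair of groups."""
    sizes = {len(s) for s in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equal group sizes")
    n = sizes.pop()
    # Standard error of a group mean, based on the pooled error term.
    se = math.sqrt(pooled_ms_error(groups) / n)
    means = {name: sum(s) / len(s) for name, s in groups.items()}
    results = {}
    for a, b in combinations(groups, 2):
        q_obs = abs(means[a] - means[b]) / se
        results[(a, b)] = (q_obs, q_obs > q_crit)
    return results


q_results = tukey_q(groups, Q_CRIT)
for (a, b), (q_obs, significant) in q_results.items():
    print(f"{a} vs. {b}: q={q_obs:.2f}, critical q={Q_CRIT}, significant={significant}")
```

With these made-up scores, the observed q is about 4.90 for peer vs. AI and about 2.45 for the other two pairs, so only peer vs. AI exceeds the critical value of 3.77.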

Comparing Power and Risk in Post Hoc Tests

  • If the critical value is large, larger differences between means are required to reach significance, which reduces power.
  • With appropriate planning, a study can be designed with the number of groups (k) and the resulting critical values in mind.
  • Study design needs to balance the number of groups against the potential for type one errors:
    • More groups can dilute the power of a test, making it harder to find significant differences.
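
One way to see why adding groups makes significance harder to reach is to count how many pairwise comparisons a design implies. The helper below (n_pairwise is a hypothetical name) is only a counting sketch; it does not compute critical values.

```python
import math


def n_pairwise(k):
    """Number of distinct pairwise comparisons among k groups: k(k-1)/2."""
    return math.comb(k, 2)


for k in (3, 4, 5, 6):
    print(f"k={k}: {n_pairwise(k)} pairwise comparisons")
```

Three groups give three comparisons, but six groups already give fifteen, so the correction a post hoc test must apply grows quickly with k.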

Other Tests and Considerations

  • Bonferroni's Method: Another post hoc approach that controls the family-wise error rate by dividing the significance level by the number of tests being conducted (e.g., .05 / 3 for three comparisons).
  • Scheffé's Test: A conservative, low-power test, best suited to complex comparisons such as linear contrasts.
  • Dunnett’s Test: Useful when comparing multiple treatment groups against a single control group, to determine whether any treatment differs from (for example, is better than) the control.
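
The Bonferroni adjustment is simple enough to sketch directly. The p-values below are invented for illustration and the function name is hypothetical; the only rule applied is that each p-value is compared against alpha divided by the number of tests.

```python
def bonferroni(p_values, alpha=0.05):
    """Compare each p-value against alpha / (number of tests)."""
    adjusted_alpha = alpha / len(p_values)
    decisions = {label: p <= adjusted_alpha for label, p in p_values.items()}
    return adjusted_alpha, decisions


# Hypothetical p-values from three pairwise comparisons.
p_values = {"peer vs. AI": 0.004, "peer vs. grad": 0.030, "AI vs. grad": 0.200}

adjusted_alpha, decisions = bonferroni(p_values)
print(f"adjusted alpha = {adjusted_alpha:.4f}")
for label, significant in decisions.items():
    print(f"{label}: significant={significant}")
```

Note that the made-up p = .030 would pass an unadjusted .05 threshold but fails the adjusted threshold of about .0167, which is the loss of power that comes with the extra protection.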

Family-Wise Error Rate

  • The probability of making at least one type one error across a set (family) of comparisons.
  • Controlling it matters whenever several comparisons are made; otherwise the chance of a false positive grows with every added test.
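
A quick calculation shows how fast this rate grows. The sketch below uses the formula 1 - (1 - alpha)^m, which assumes the m tests are independent. Pairwise comparisons that share groups are not fully independent, so treat the numbers as an approximation rather than an exact rate. The function name is hypothetical.

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one type one error across m independent tests."""
    return 1 - (1 - alpha) ** m


for m in (1, 3, 6, 10):
    print(f"{m} tests at alpha=.05: FWER = {familywise_error_rate(0.05, m):.3f}")
```

Under this assumption, three uncorrected tests at .05 carry roughly a 14% chance of at least one false positive, and ten tests carry roughly 40%.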

SPSS Practical Application

  • Discussed using SPSS to carry out ANOVA and post hoc tests.
  • When analyzing data, enter dependent variables accurately within the software and apply the comparison tools available.
  • The different tests available show which means differ, reporting a p-value for each comparison to judge against the significance level.

Key Takeaways

  • The choice of post hoc test impacts the validity of findings. It is important to understand the underlying assumptions and conditions pertaining to each test.
  • Both Fisher’s LSD and Tukey’s HSD have strengths and weaknesses related to power and risk of type one error and should be chosen based on specific study designs and hypotheses.
  • SPSS provides valuable tools for calculating and visualizing these tests, and it is crucial to handle the outputs (p-values, mean differences) correctly.