ANOVA: Post Hoc Tests, Contrasts, and Reporting

Context: After a significant one-way ANOVA (i.e., after rejecting the global null that all group means are equal), you typically follow up with post hoc tests or planned contrasts to identify where specific mean differences lie.
Core rule: Perform post hoc tests only after you reject the global null. If you fail to reject the global null, report that the data provide no significant evidence of a difference among the means and stop there.
Five-step formal hypothesis testing framework for ANOVA (p-value approach):
- Step 1: State the hypotheses
- Null: $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
- Alternative: at least one pair of means differs: $H_A: \text{not all } \mu_i \text{ are equal}$
- Step 2: Choose the critical alpha level (commonly $\alpha = 0.05$ )
- Step 3: Compute the F statistic and obtain the p-value from the ANOVA table
- Step 4: Decision rule
- Reject $H_0$ if $p < \alpha$; otherwise fail to reject $H_0$
- Step 5: Summarize results
- If you rejected, determine which means differ (post hoc tests) and report the group means
Notation and setup
- For one-factor ANOVA with k groups, denote group means as $\mu_1, \mu_2, \dots, \mu_k$ and group sizes as $n_1, n_2, \dots, n_k$; total sample size $N = \sum_{i=1}^k n_i$
- Denote the mean square error (within) as $MSE = MS_{Within} = \dfrac{SS_{Within}}{df_{Within}}$ and the mean square between as $MS_{Between} = \dfrac{SS_{Between}}{df_{Between}}$
- F statistic: $F = \dfrac{MS_{Between}}{MS_{Within}}$ with degrees of freedom $df_{Between} = k-1$ and $df_{Within} = N-k$
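The notation above can be turned into a few lines of arithmetic. The following is a minimal sketch in plain Python, assuming small made-up samples (the data and the function name `one_way_anova` are illustrative and not taken from the course); it computes the sums of squares, the mean squares, and the F statistic exactly as defined in the bullets.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within, MSE) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-groups sum of squares: group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    mse = ss_within / df_within
    return ms_between / mse, df_between, df_within, mse


# Hypothetical data (illustrative only, not the course example)
groups = [
    [4, 5, 6],
    [6, 7, 8],
    [8, 9, 10],
]
f_stat, df_b, df_w, mse = one_way_anova(groups)
print(f"F({df_b},{df_w}) = {f_stat:.3f}, MSE = {mse:.3f}")
```

The p-value still has to come from the F distribution with (df_between, df_within) degrees of freedom, which SPSS reports in the ANOVA table.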
What happens after rejection
- Post hoc tests answer: which means differ and how:
- Tukey (honestly significant difference, HSD)
- Scheffé (sometimes misspelled "Chaffé" or "Chaffet" in course materials)
- Bonferroni (and related contrast corrections)
- These are used to control the familywise error rate across multiple comparisons
- Important practical note: the output you see from SPSS can provide p-values that are already corrected for multiple comparisons (as with Tukey). When using other methods, you may need to apply corrections yourself (e.g., Bonferroni) or use specialized procedures (e.g., Scheffé/contrast correction with an Excel calculator)

Tukey post hoc test

What Tukey does
- Compares all possible pairs of group means (pairwise comparisons)
- Uses the Studentized range distribution to adjust for multiple comparisons
- Corrected p-values are reported so that the familywise alpha level is controlled at $\alpha = 0.05$ (assuming no other adjustments)
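To make the pairwise logic concrete, here is a rough sketch of the Tukey statistic for equal group sizes, $q = |\bar{X}_i - \bar{X}_j| / \sqrt{MSE/n}$. The means, MSE, and the helper name `tukey_q_statistics` are hypothetical, and the critical value is an assumption read from a printed studentized range table rather than computed; SPSS does this lookup for you and reports adjusted p-values instead.

```python
from itertools import combinations
from math import sqrt


def tukey_q_statistics(means, n_per_group, mse):
    """Return the studentized range statistic q for every pair of groups.

    Assumes equal group sizes, so every pair shares the same standard error.
    """
    se = sqrt(mse / n_per_group)
    return {
        (a, b): abs(means[a] - means[b]) / se
        for a, b in combinations(means, 2)
    }


# Hypothetical summary statistics (illustrative only, not the course data)
means = {"Med": 12.0, "Diet": 9.0, "Ex": 6.0}
mse = 5.0
n_per_group = 5

# Critical value q(alpha = 0.05, k = 3, df = 12), read from a printed
# studentized range table; check it against your own table or software.
q_crit = 3.77

q_stats = tukey_q_statistics(means, n_per_group, mse)
significant = {pair: q > q_crit for pair, q in q_stats.items()}
for pair, q in q_stats.items():
    label = "significant" if significant[pair] else "not significant"
    print(pair, round(q, 2), label)
```

With these made-up numbers only the largest gap (Med vs Ex) exceeds the critical value, which is the same pattern as the blood pressure example discussed in these notes.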
Practical notes from SPSS output
- You will typically see paired mean differences for each pair (e.g., Med vs Ex, Med vs Diet, Ex vs Diet) and a corresponding p-value
- The output may show the mean differences in both directions (e.g., Med minus Ex and Ex minus Med); interpret based on the sign
- Read the homogeneous subsets table: columns indicate means that are not significantly different from each other; means in the same column are not significantly different, while means in different columns are significantly different
Example interpretation (blood pressure reduction with three interventions: medication, exercise, diet)
- ANOVA result: $F(2,12) = 9.168, p = 0.004$ → reject $H_0$; at least one mean differs
- Tukey results summary (example):
- Medication vs Exercise: p < 0.05 (significant difference; medication higher reduction)
- Diet vs Medication: not significant
- Diet vs Exercise: not significant
- Means order (example narrative): Medication highest reduction, Diet in the middle, Exercise lowest; Diet not significantly different from either medication or exercise
How to report Tukey results
- Report the F statistic from the ANOVA for context, then specify which pairs differ based on Tukey p-values
- Example phrasing: "Medication had a significantly higher mean reduction in blood pressure than Exercise (p < 0.05). Diet did not differ significantly from Medication or Exercise."
- When writing up, include the sample mean ($\bar{X}_i$) for each group and state which means are in the same homogeneous subset (no significant difference) and which pairs are significantly different

Planned contrasts (contrast-based follow-up)

What a contrast is
- A planned contrast is a linear combination of the means with weights that sum to zero: $L = \sum_{i=1}^k w_i \mu_i, \quad \sum_{i=1}^k w_i = 0$
- The idea is to test a specific hypothesis about a particular comparison pattern (e.g., whether a subset of means differ from another subset)
- Contrast weights are applied to group means to form a single test statistic
Key features
- The contrast value is computed on sample means: $\hat{L} = \sum_{i=1}^k w_i \bar{X}_i$
- The standard error uses the mean square error from the ANOVA and the weights: $SE(\hat{L}) = \sqrt{MSE \cdot \sum_{i=1}^k \frac{w_i^2}{n_i}}$
- Test statistic (t-test form): $t = \frac{\hat{L}}{SE(\hat{L})}$ with degrees of freedom $df = N - k$; equivalently, $F = t^2$ with $df = (1, N - k)$
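The three formulas above chain together directly. Below is a minimal sketch, assuming hypothetical group means, sizes, and MSE (none of these numbers come from the course data, and `contrast_test` is an illustrative name); it returns the contrast estimate, its standard error, and the t and F statistics.

```python
from math import sqrt


def contrast_test(weights, means, sizes, mse):
    """Return (L_hat, SE, t, F) for one contrast on the group means."""
    if abs(sum(weights)) > 1e-12:
        raise ValueError("contrast weights must sum to zero")
    # Contrast estimate: weighted sum of the sample means
    l_hat = sum(w * m for w, m in zip(weights, means))
    # Standard error uses the ANOVA MSE and the squared weights
    se = sqrt(mse * sum(w ** 2 / n for w, n in zip(weights, sizes)))
    t = l_hat / se
    return l_hat, se, t, t ** 2


# Hypothetical values, in the order (Diet, Exercise, Medication)
weights = [0.5, 0.5, -1.0]
means = [9.0, 6.0, 12.0]
sizes = [5, 5, 5]
mse = 5.0

l_hat, se, t, f = contrast_test(weights, means, sizes, mse)
df_error = sum(sizes) - len(sizes)  # N - k
print(f"L = {l_hat:.2f}, SE = {se:.3f}, t({df_error}) = {t:.3f}, F(1,{df_error}) = {f:.3f}")
```

The negative sign of L here simply reflects that the group with the negative weight (Medication) has the larger hypothetical mean.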
Example contrasts for our 3-group case (medication, exercise, diet)
- Research question 1: Do natural remedies (exercise + diet) differ from medication?
- Weights: e.g., diet and exercise grouped on one side; medication on the other
- Example weights (normalized to sum to zero): Diet and Exercise on positive side, Medication on negative side
  - Contrast 1: weights [0.5, 0.5, -1] in the order (Diet, Exercise, Medication)
  - Interpret L: if L > 0, the average of the Diet and Exercise means is greater than the Medication mean on the outcome
- Research question 2: Do Diet and Exercise differ from each other?
- Weights: Diet +1 on one side, Exercise -1 on the other, and Medication given zero weight, so the contrast compares Diet and Exercise only
Orthogonality of contrasts
- Orthogonal contrasts are independent: the dot product of their weight vectors is zero, i.e., for two contrast vectors w and v, $\sum_i w_i v_i = 0$ (assuming equal group sizes)
- When contrasts are not orthogonal, there are complications (Scheffé vs. Bonferroni handling)
- In this course, you will be given the contrasts and not worry about orthogonality unless specifically studying non-orthogonal contrasts
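Checking orthogonality is a one-line dot product. The sketch below uses the two example contrasts from this section, in the order (Diet, Exercise, Medication), plus a third hypothetical contrast (Diet vs Medication) added only to show a non-orthogonal pair. The optional `sizes` argument is an assumption that goes beyond the course: it applies the standard unequal-n condition $\sum_i w_i v_i / n_i = 0$.

```python
def are_orthogonal(w, v, sizes=None, tol=1e-12):
    """Return True if two contrast weight vectors are orthogonal.

    With sizes=None the plain dot product is used (equal group sizes).
    """
    if sizes is None:
        sizes = [1] * len(w)
    return abs(sum(a * b / n for a, b, n in zip(w, v, sizes))) < tol


# Order of weights: (Diet, Exercise, Medication)
c1 = [0.5, 0.5, -1.0]  # natural remedies vs medication
c2 = [1.0, -1.0, 0.0]  # diet vs exercise
c3 = [1.0, 0.0, -1.0]  # hypothetical extra contrast: diet vs medication

print(are_orthogonal(c1, c2))
print(are_orthogonal(c1, c3))
```

The first pair has dot product 0.5 - 0.5 + 0 = 0, so it is orthogonal; the second has 0.5 + 0 + 1 = 1.5, so it is not.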
Scheffé and Bonferroni corrections for contrasts
- Scheffé correction
- Used when you want to test potentially all possible contrasts, not just a small preplanned subset
- The test statistic is the contrast F, constructed from the contrast value $\hat{L}$, the contrast weights, and MSE; it is compared against a special critical value based on df = (k-1, N-k), namely $(k-1)\,F_{\alpha;\,k-1,\,N-k}$, which accounts for all possible contrasts
- In practice: SPSS may not directly give this for large numbers of contrasts; you may need to compute using an Excel calculator provided in course materials
- Output typically provides: L (the contrast value), contrast coefficients, MSE, and a computed F (or t) statistic, plus an alpha-corrected p-value
- Bonferroni correction for planned (or limited) contrasts
- Applicable when you have g contrasts and you want to control familywise error by adjusting alpha: $\alpha' = \dfrac{\alpha}{g}$
- You can use the uncorrected p-values from the contrast tests and compare them to the corrected alpha $\alpha'$ , or equivalently adjust the p-values by multiplying by g (capped at 1)
- When there are only a small number of contrasts (e.g., two or three), Bonferroni is common and straightforward
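The Bonferroni arithmetic described above is short enough to script. This is a minimal sketch with made-up uncorrected p-values (the values and the function name `bonferroni` are illustrative); it shows both equivalent routes, comparing raw p-values to $\alpha' = \alpha/g$ and multiplying p-values by g with a cap at 1.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted alpha, adjusted p-values, reject decisions)."""
    g = len(p_values)
    alpha_adj = alpha / g
    # Route 1: compare each raw p-value to alpha / g
    reject = [p < alpha_adj for p in p_values]
    # Route 2: multiply each p-value by g, capped at 1, and compare to alpha
    adjusted = [min(1.0, p * g) for p in p_values]
    return alpha_adj, adjusted, reject


# Hypothetical uncorrected p-values for g = 2 planned contrasts
p_values = [0.010, 0.040]
alpha_adj, adjusted, reject = bonferroni(p_values)
print(alpha_adj, adjusted, reject)
```

Note that the second contrast would pass at the unadjusted 0.05 level but not after correction, which is exactly the situation the correction is meant to catch.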
Choosing among these post hoc approaches
- Tukey: best when you want to explore all pairwise differences with a built-in correction; no a priori hypotheses required
- Planned contrasts (Bonferroni): preferred when you have specific hypotheses about a few contrasts; smaller number of comparisons
- Scheffé: useful when you want protection against Type I error for all possible contrasts; more conservative and typically used when many possible contrasts are of interest or when you want a single omnibus test of many patterns
- Practical note: In many classes, Tukey is the default for four or more groups; planned contrasts with Bonferroni are common when hypotheses are pre-specified; Scheffé is used when many potential contrasts are of interest and more robust error control is desired
How to implement and report (practical guidance from the course videos)
- SPSS steps for Tukey: Analyze -> Compare Means -> One-Way ANOVA -> Post Hoc -> select Tukey (avoid "Tukey's-b")
- For Scheffé: SPSS may not provide a direct option; use an external calculator (Excel) to compute F for each contrast and apply the Scheffé criterion; need L, weights, MSE, and N; report F and p-values, and the L value
- For Bonferroni: you can use SPSS to obtain the t or F values for the requested contrasts, then adjust alpha to $\alpha' = \alpha/g$ or adjust p-values by multiplying by g; report the adjusted alpha and the test statistics (t, F) with their corrected p-values
- Reading and reporting tips
- Always report the statistic (F or t), degrees of freedom, and the p-value; for contrasts report the L value (the contrast statistic) and its interpretation
- When reporting L, you can present as: "L = [value], with F = [value], p = [value], indicating that [interpretation]"; you may also report the related mean differences in natural language
- Write up the conclusions to reflect your research questions (e.g., which treatment differs from which), not just the statistical outputs

Concrete worked examples and interpretation notes

Example data setup for 3 groups (n = 5 per group, total N = 15)
- Groups: Medication (Med), Exercise (Ex), Diet (Diet)
- After running ANOVA, you obtain: $F(2,12) = 9.168, p = 0.004$ → reject $H_0$ ; at least one mean differs
- Tukey post hoc result (example):
- Med vs Ex: p < 0.05 (significant) → Medication yields greater reduction than Exercise
- Med vs Diet: p > 0.05 (not significant) → Diet not significantly different from Medication
- Diet vs Ex: p > 0.05 (not significant) → Diet not significantly different from Exercise
- Means order (descriptive): Med > Diet > Ex (in terms of mean reduction)
- Interpretation nuance: Although Med is significantly higher than Ex, Diet sits in between and is not significantly different from either Med or Ex according to this Tukey adjustment
Contrasts example (planned contrasts)
- Contrast 1 (C1): Medication vs (Diet + Exercise)
- Weights: $w_{Med} = -1$, $w_{Diet} = 0.5$, $w_{Ex} = 0.5$ (sums to zero)
- Contrast value: $L = -1\,\mu_{Med} + 0.5\,\mu_{Diet} + 0.5\,\mu_{Ex}$
- Standard error: $SE(\hat{L}) = \sqrt{MSE \left( \dfrac{w_{Med}^2}{n_{Med}} + \dfrac{w_{Diet}^2}{n_{Diet}} + \dfrac{w_{Ex}^2}{n_{Ex}} \right)}$
- Test: $t = \hat{L} / SE(\hat{L})$; $df = N - k = 15 - 3 = 12$; or $F = t^2$; compare to $t_{12}$ or $F_{1,12}$
- Contrast 2 (C2): Diet vs Exercise
- Weights: $w_{Diet} = 1$, $w_{Ex} = -1$, $w_{Med} = 0$ (zero weight on Med)
- Ensure weights sum to zero and interpret sign of L similarly
- Reporting contrasts (example language):
- For C1 (Bonferroni-corrected or Scheffé): "Medication was significantly different from the combined Diet and Exercise (L = [value], F = [value], p = [value], with L representing the contrast)."
- For C2: "Diet differed from Exercise (L = [value], p = [value], after correction)."
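As a worked step for the two contrasts C1 and C2 above, the weight term in the standard error can be evaluated before any data are seen, because the example has n = 5 in every group. The lines below only plug the stated weights (in the order Med, Diet, Ex) and group sizes into the standard error formula; MSE is left symbolic since no value is given in these notes.

```latex
\begin{align*}
\text{C1:}\quad \sum_i \frac{w_i^2}{n_i}
  &= \frac{(-1)^2 + 0.5^2 + 0.5^2}{5} = \frac{1.5}{5} = 0.3,
  & SE(\hat{L}_1) &= \sqrt{0.3 \cdot MSE} \\
\text{C2:}\quad \sum_i \frac{w_i^2}{n_i}
  &= \frac{0^2 + 1^2 + (-1)^2}{5} = \frac{2}{5} = 0.4,
  & SE(\hat{L}_2) &= \sqrt{0.4 \cdot MSE} \\
\text{Orthogonality:}\quad \sum_i w_i v_i
  &= (-1)(0) + (0.5)(1) + (0.5)(-1) = 0
\end{align*}
```

The last line confirms that C1 and C2 form an orthogonal pair for this equal-n design.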
Bonferroni vs Scheffé vs Tukey in practice
- When you have two planned contrasts (g = 2): Bonferroni alpha' = 0.05/2 = 0.025. Compare uncorrected p-values to 0.025 or equivalently adjust p-values by multiplying by 2
- When you have many contrasts or non-orthogonal contrasts, Scheffé is used; you may compute via an Excel tool that takes L, weights, MSE, k, and N to produce an F value and p-value
- For large numbers of contrasts, SPSS may not support Scheffé directly; you’ll use the Excel-based calculator and then report the F and p-values, plus the L values
- Example reporting snippet for a contrast tested with Scheffé: "Using Scheffé’s method, the contrast Med vs (Diet + Ex) was significant (F = [value], p = [value], L = [value]), indicating Medication yields greater mean reduction than the combination of Diet and Exercise"
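For readers without the course Excel calculator, the Scheffé decision can be approximated by hand. The sketch below is an illustration under stated assumptions, not the calculator's actual method: the means and MSE are hypothetical, `scheffe_contrast` is an invented name, and the critical value $F_{0.05;\,2,\,12}$ is typed in from a printed F table. It applies the usual Scheffé criterion, declaring a contrast significant when its F exceeds $(k-1)\,F_{\alpha;\,k-1,\,N-k}$.

```python
def scheffe_contrast(weights, means, sizes, mse, f_crit):
    """Return (F for the contrast, Scheffé threshold, significant?)."""
    k = len(means)
    l_hat = sum(w * m for w, m in zip(weights, means))
    # Contrast F statistic (equals t squared, 1 numerator df)
    f_contrast = l_hat ** 2 / (mse * sum(w ** 2 / n for w, n in zip(weights, sizes)))
    # Scheffé threshold: (k - 1) times the ordinary F critical value
    threshold = (k - 1) * f_crit
    return f_contrast, threshold, f_contrast > threshold


# Hypothetical values, in the order (Med, Diet, Ex)
weights = [-1.0, 0.5, 0.5]
means = [12.0, 9.0, 6.0]
sizes = [5, 5, 5]
mse = 5.0

# F critical value for alpha = 0.05 with df = (2, 12), read from a printed
# F table; verify against your own table or software.
f_crit = 3.89

f_contrast, threshold, is_significant = scheffe_contrast(weights, means, sizes, mse, f_crit)
print(f"F = {f_contrast:.2f}, Scheffé threshold = {threshold:.2f}, significant: {is_significant}")
```

Because the threshold is (k - 1) times larger than the ordinary critical value, a contrast needs a noticeably bigger F to pass, which is why Scheffé is described as the more conservative option.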
Practical SPSS tips mentioned in the session
- For post hoc outputs, always read the p-values against the chosen alpha (0.05 by default) unless you apply a correction
- When copying ANOVA tables for reports, you can capture them as images to paste into documents
- If you need to show multiple SPSS outputs in a write-up, include the F or t statistic, df, and p-value for each test (and the L value for contrasts), plus a brief interpretation
Summary takeaways
- ANOVA tells you if there is any difference among group means; post hoc tests or contrasts tell you where the differences lie
- Tukey is the default for all pairwise comparisons with built-in multiple comparison correction
- Planned contrasts allow testing specific hypotheses about a set of means; Bonferroni controls the Type I error rate when a small number of contrasts are tested
- Scheffé controls the error rate across all possible contrasts and is more conservative; use an external calculator when needed
- Contrast weights must sum to zero; the contrast value L and its standard error determine the test statistic, and signs of weights determine which group means are higher or lower
- Always report the statistic (F or t), degrees of freedom, p-value, and a clear interpretation that ties back to the research questions

Practical reporting templates (short examples)

Tukey, all pairwise differences
- "One-way ANOVA showed a significant effect of intervention type on blood pressure reduction, F(2,12) = 9.168, p = 0.004. Tukey post hoc comparisons indicated that Medication produced a significantly greater mean reduction than Exercise (p < 0.05); Diet did not differ significantly from Medication or Exercise. Means: Med = [value], Diet = [value], Ex = [value]."
Planned contrasts (Bonferroni example with two contrasts)
- "Planned contrasts tested: (1) Medication vs (Diet + Ex) and (2) Diet vs Ex. For (1), the contrast was significant (t = [value], p = [value], after Bonferroni correction α' = 0.025). For (2), the contrast was not significant (t = [value], p = [value], after correction)."
Scheffé (all contrasts) example
- "Using Scheffé's method, the contrast $L = -1\,\mu_{Med} + 0.5\,\mu_{Diet} + 0.5\,\mu_{Ex}$ was significant (F = [value], p = [value], L = [value]), indicating Medication differed from the average of Diet and Exercise."

Final tips

- Always plan your contrasts before running analyses if you have specific hypotheses; this informs whether you use Tukey, Bonferroni, or Scheffé
- When you read the outputs, focus first on whether the ANOVA rejected the global null; then examine post hoc results in the context of your research questions
- Practice constructing contrast weights and computing L and SE to build intuition for what the tests are actually testing
- Remember the difference between pairwise comparisons (Tukey) and planned contrasts (specific hypotheses) and how the correction method aligns with your study design

Quick cheat sheet (recap)

ANOVA null: $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$; Alternative: at least one pair differs
Post hoc when global null rejected:
- Tukey: all pairwise; corrected p-values; interpret homogeneous subsets
- Bonferroni: divide alpha by number of contrasts; apply to uncorrected p-values
- Scheffé: test all possible contrasts; use L and MSE with appropriate df; often an Excel-based calculator is needed
Contrasts: $L = \sum w_i \mu_i$, with $\sum w_i = 0$; test with $t = \hat{L}/SE(\hat{L})$, $SE(\hat{L}) = \sqrt{MSE \cdot \sum w_i^2/n_i}$
Interpretation: report which means differ, by how much, and in which direction; provide context to research questions