Interpreting and Extending ANOVA

Lecture Objectives

  • Review the logic behind one-way ANOVA.
  • Understand why and when to run post hoc tests.
  • Interpret ANOVA effect sizes (η²).
  • Learn how to interpret a complete ANOVA output.
  • Extend ANOVA concepts to repeated-measures and factorial designs.

Variance Partitioning

  • Total variance: Represents the variation among all scores in a dataset.
  • Between-groups variance: Variation attributed to differences in group means.
  • Within-groups variance: Refers to unexplained or random variation among individual scores within a group.
  • ANOVA relies on sums of squares to compare total variance against between- and within-groups variance.
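This partition can be verified numerically. A minimal Python sketch with made-up scores for three groups (all numbers are illustrative):

```python
# Partition total variance into between- and within-groups sums of squares.
groups = {
    "A": [4, 5, 6, 5],
    "B": [7, 8, 9, 8],
    "C": [6, 6, 7, 5],
}

all_scores = [x for g in groups.values() for x in g]
grand_mean = sum(all_scores) / len(all_scores)

ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
ss_between = sum(
    len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups.values()
)
ss_within = sum(
    (x - sum(g) / len(g)) ** 2 for g in groups.values() for x in g
)

# The decomposition SS_Total = SS_Between + SS_Within holds exactly.
print(ss_total, ss_between + ss_within)
```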

The F-Ratio & Significance

  • The formula for F-ratio is defined as:
    F = \frac{MS_{Between}}{MS_{Within}}
  • A large F suggests that there are significant systematic differences among group means, while a small F indicates that most variability can be attributed to random error.
  • A p-value < 0.05 provides evidence that not all group means are equal, indicating statistical significance.
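A sketch of the computation, using SciPy's F distribution for the p-value (the sums of squares and group counts are illustrative):

```python
from scipy import stats

# Degrees of freedom for k groups and N total scores:
# df_between = k - 1, df_within = N - k.
k, n_total = 3, 12
ss_between, ss_within = 18.0, 45.0        # illustrative sums of squares

ms_between = ss_between / (k - 1)          # 9.0
ms_within = ss_within / (n_total - k)      # 5.0
f_ratio = ms_between / ms_within           # 1.8

# p-value: probability of an F this large or larger under the null hypothesis.
p_value = stats.f.sf(f_ratio, k - 1, n_total - k)
print(f"F({k - 1}, {n_total - k}) = {f_ratio:.2f}, p = {p_value:.3f}")
```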

What ANOVA Doesn’t Tell Us

  • A significant F value reveals that at least one group mean differs from others; however, it does not specify which group means are significantly different.
  • Post hoc tests are necessary to identify specific group differences after finding a significant overall ANOVA.

What Post Hoc Tests Do

  • Post hoc tests perform pairwise comparisons following a significant F test, allowing researchers to understand which specific means are different.
  • These tests adjust for multiple comparisons to keep the familywise Type I error rate (α) from inflating.
  • Common examples of post hoc tests include:
    • Tukey's HSD
    • Bonferroni
    • Scheffé
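Tukey's HSD relies on the studentized range distribution and is usually run with dedicated software; the pairwise logic can be sketched with a Bonferroni-style adjustment using SciPy's independent-samples t-test (the data are made up):

```python
from itertools import combinations
from scipy import stats

groups = {
    "A": [4, 5, 6, 5, 4],
    "B": [7, 8, 9, 8, 7],
    "C": [6, 6, 7, 5, 6],
}

pairs = list(combinations(groups, 2))
alpha = 0.05

for g1, g2 in pairs:
    t_stat, p_raw = stats.ttest_ind(groups[g1], groups[g2])
    # Bonferroni: multiply each raw p by the number of comparisons
    # (equivalently, test each pair at alpha / len(pairs)).
    p_adj = min(p_raw * len(pairs), 1.0)
    verdict = "significant" if p_adj < alpha else "not significant"
    print(f"{g1} vs {g2}: adjusted p = {p_adj:.4f} ({verdict})")
```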

Controlling Error

  • Familywise α: Represents the probability of encountering at least one Type I error among all tests conducted.
  • Post hoc tests implement stricter thresholds for statistical significance to compensate for the inflated chance of error that comes with making multiple comparisons.
  • For instance, Tukey’s HSD maintains an overall α level of 0.05.
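The inflation of familywise α, and the Bonferroni fix, can be computed directly (assuming independent tests, a simplification):

```python
alpha = 0.05

for m in (1, 3, 6, 10):
    familywise = 1 - (1 - alpha) ** m   # P(at least one Type I error)
    bonferroni = alpha / m              # per-test threshold that caps it at alpha
    print(f"{m:2d} tests: familywise α ≈ {familywise:.3f}, "
          f"Bonferroni per-test α = {bonferroni:.4f}")
```

With just 10 tests at α = .05 each, the familywise error rate climbs past .40, which is why the adjustment matters.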

Common Post Hoc Tests

  • Tukey’s HSD: This test is mainly used when group sizes are equal; it's the most common post hoc test.
  • Bonferroni: A straightforward, conservative adjustment that divides α by the number of comparisons; it remains valid with unequal sample sizes but loses power as the number of comparisons grows.
  • Scheffé: Highly flexible and considered the most conservative option. It can be used for complex or unplanned contrasts.

When to Use Each Post Hoc Test

  1. Tukey's HSD: Ideal for situations where groups are of equal size.
  2. Bonferroni: Best suited for planned or limited comparisons.
  3. Scheffé: Recommended for complex analyses or unplanned contrasts that require flexibility.

Reading SPSS Output

Descriptive Statistics

  • The output provides key statistics for each cohort:
    • Sample Size (N)
    • Mean
    • Standard Deviation
    • Standard Error
    • 95% Confidence Interval
  • Example data:
    • Cohort A: N = 26, Mean = 69.6396, Std. Deviation = 8.57670, 95% CI = [66.1754, 73.1038]
    • Cohort B: N = 31, Mean = 77.4394, Std. Deviation = 6.63746, 95% CI = [75.0047, 79.8740]
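These descriptives can be reproduced by hand; a sketch with made-up scores, where the 95% CI is mean ± t_crit × SE:

```python
import math
from scipy import stats

scores = [72, 68, 75, 70, 66, 74, 71, 69]   # illustrative cohort scores
n = len(scores)
mean = sum(scores) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))  # sample SD
se = sd / math.sqrt(n)                                          # standard error

t_crit = stats.t.ppf(0.975, df=n - 1)       # two-tailed 95% critical value
ci = (mean - t_crit * se, mean + t_crit * se)
print(f"N = {n}, Mean = {mean:.2f}, SD = {sd:.2f}, "
      f"95% CI = [{ci[0]:.2f}, {ci[1]:.2f}]")
```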

SPSS Output - Multiple Comparisons (Tukey HSD)

  • This table shows the mean differences between groups, along with standard errors and significance levels, for each pairwise comparison conducted.
  • Example comparison results:
    • Cohort A vs. Cohort B: Mean Difference = -7.79974, significant (p = .003)
    • Cohort C vs. Cohort D: Mean Difference = -9.03038, significant (p < .001)

Reading Results

  • Focus should be placed on significantly differing group pairs. For instance:
    • If Group 1 and Group 3 differ significantly (p = .01) while others do not, this should be noted for further discussion.

Why Effect Size Matters

  • Statistical Significance ≠ Practical Importance: Just because a difference is statistically significant does not mean it has practical implications.
  • Effect size provides insight into the magnitude of group differences.
  • It communicates how much of the variation in outcomes can be explained by the independent variable (IV).

η² vs. Cohen’s d

  • η² (Eta-squared): Represents the percentage of total variance explained by the independent variable in ANOVA.
  • Cohen’s d: Represents the mean difference expressed in standard deviation units, typically used in t-tests.
  • Both metrics reflect the strength of a relationship; η² is proportion-based while Cohen’s d focuses on mean differences in relation to variability.
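A quick sketch of Cohen's d with made-up groups, using the pooled standard deviation in the denominator:

```python
import math

# Two illustrative groups; Cohen's d expresses their mean difference in SD units.
g1 = [10, 12, 11, 13, 9]
g2 = [14, 15, 13, 16, 14]

def mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Pooled SD: weighted average of the two sample variances.
pooled_sd = math.sqrt(
    ((len(g1) - 1) * sample_var(g1) + (len(g2) - 1) * sample_var(g2))
    / (len(g1) + len(g2) - 2)
)
d = (mean(g2) - mean(g1)) / pooled_sd
print(f"Cohen's d = {d:.2f}")
```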

Formula and Intuition for η²

  • The formula for computing η² is:
    η² = \frac{SS_{Between}}{SS_{Total}}
  • Where:
    • SS_{Total} = SS_{Between} + SS_{Within}
  • Interpretation of η² values:
    • 0.01 = small effect, 0.06 = medium effect, 0.14 = large effect.
    • For example, η² = 0.25 indicates that approximately 25% of the variance in scores is explained by the condition.

Example Calculation

  • Given:
    • SS_{Between} = 240
    • SS_{Within} = 720
    • Thus, SS_{Total} = 240 + 720 = 960
  • Calculation of η²:
    η² = \frac{SS_{Between}}{SS_{Total}} = \frac{240}{960} = 0.25
  • Interpretation: 25% of the total variance is attributed to group differences.
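The same arithmetic in code:

```python
# Worked example from above: SS_Between = 240, SS_Within = 720.
ss_between = 240
ss_within = 720
ss_total = ss_between + ss_within     # 960

eta_squared = ss_between / ss_total
print(eta_squared)   # 0.25 → a large effect by Cohen's benchmarks
```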

The Complete Story of ANOVA

  1. Is F significant?
  2. How large is η²?
  3. Which groups differ (post hoc)?
  • Each of these questions addresses unique aspects of the research inquiry.

Interpreting F and p Values

  • An example interpretation might include:
    • F(2, 27) = 4.63, p = .019 indicates you reject the null hypothesis.
    • This suggests that at least one group mean differs significantly from the others.
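The reported p-value can be reproduced from the F distribution:

```python
from scipy import stats

# Reproduce the reported p-value from F(2, 27) = 4.63.
f_value, df_between, df_within = 4.63, 2, 27
p = stats.f.sf(f_value, df_between, df_within)
print(round(p, 3))   # ≈ .019, so reject H0 at α = .05
```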

Interpreting η² and Post Hoc Results

  • If η² = .25, it indicates a large effect size.
  • Post hoc comparisons might reveal Group A > Group C, with significance at p = .01. This illustrates both the significance and the strength of the effect.

SPSS Output - ANOVA Results for Satisfaction

  • Sum of Squares Summary: This includes:
    • Between Groups: 1187.373 (df = 4, Mean Square = 296.843)
    • Within Groups: 9596.565 (Error df = 141, Mean Square = 68.061)
    • Total: 10783.937 (Total df = 145)
  • ANOVA results yield significant findings (p = .002) suggesting differences among groups.

ANOVA Effect Sizes in SPSS Output

  • Estimated effect sizes reported include:
    • Eta-squared: Point estimate = .110 (95% CI [0.017, 0.192])
    • Epsilon-squared: Point estimate = .085 (95% CI [-0.011, 0.169])
    • Omega-squared (Fixed effect): Point estimate = .084, Omega-squared (Random effect) = .023.
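The eta-squared point estimate, along with F and p, can be recovered from the sums of squares in the ANOVA table above:

```python
from scipy import stats

# Numbers from the satisfaction ANOVA table.
ss_between, df_between = 1187.373, 4
ss_within, df_within = 9596.565, 141
ss_total = ss_between + ss_within

f_ratio = (ss_between / df_between) / (ss_within / df_within)
p = stats.f.sf(f_ratio, df_between, df_within)
eta_squared = ss_between / ss_total   # SS_Between / SS_Total

print(f"F = {f_ratio:.3f}, p = {p:.3f}, η² = {eta_squared:.3f}")
```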

Tukey HSD - Multiple Comparisons for Satisfaction

  • Example outputs show mean differences, confidence intervals, and significance levels for pairwise comparisons between industry types:
    • Academic vs. Community: Mean Difference = 0.93041, not statistically significant (p = .993).
    • A comparison involving Nonprofit is significant (p < .001).

Repeated Measures ANOVA

  • Definition: Involves measuring the same participants across multiple conditions (e.g., pretest, posttest, follow-ups).
  • Advantages: It controls for individual differences by using the same individuals under different conditions.
  • This approach extends the capabilities of the paired t-test beyond just two timepoints.
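The repeated-measures partition can be sketched by hand for made-up data (4 participants × 3 conditions); note that subject variance is pulled out of the error term, which is how the design controls for individual differences:

```python
from scipy import stats

# rows = participants, columns = conditions (e.g., pre, post, follow-up)
scores = [
    [10, 12, 14],
    [ 8, 10, 13],
    [12, 13, 16],
    [ 9, 11, 12],
]
n = len(scores)          # participants
k = len(scores[0])       # conditions

grand = sum(x for row in scores for x in row) / (n * k)
cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
subj_means = [sum(row) / k for row in scores]

ss_total = sum((x - grand) ** 2 for row in scores for x in row)
ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
ss_error = ss_total - ss_cond - ss_subj   # subject variance removed from error

df_cond, df_error = k - 1, (k - 1) * (n - 1)
f_ratio = (ss_cond / df_cond) / (ss_error / df_error)
p = stats.f.sf(f_ratio, df_cond, df_error)
print(f"F({df_cond}, {df_error}) = {f_ratio:.2f}, p = {p:.4f}")
```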

Factorial ANOVA

  • Incorporates multiple factors (e.g., Gender × Training interaction).
  • Tests not only for main effects of each factor but also interactions between factors.

2-Way ANOVA Main Effects Analysis

  • For example, if evaluating performance based on gender and training:
    • Males with training = 82, males without training = 75.
    • Females with training = 90, females without training = 85.
  • Investigation required into whether there is a significant main effect for Gender, Training, or possible interaction where training affects men and women differently.
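The descriptive side of this question is simple arithmetic on the cell means (deciding whether the effects are significant requires the full ANOVA):

```python
# Cell means from the example: performance by Gender × Training.
cells = {
    ("male", "trained"): 82, ("male", "untrained"): 75,
    ("female", "trained"): 90, ("female", "untrained"): 85,
}

# Main effect of Gender: marginal means averaged over training levels.
male_mean = (cells[("male", "trained")] + cells[("male", "untrained")]) / 2
female_mean = (cells[("female", "trained")] + cells[("female", "untrained")]) / 2

# Main effect of Training: marginal means averaged over genders.
trained_mean = (cells[("male", "trained")] + cells[("female", "trained")]) / 2
untrained_mean = (cells[("male", "untrained")] + cells[("female", "untrained")]) / 2

# Interaction: does the training benefit differ by gender?
male_gain = cells[("male", "trained")] - cells[("male", "untrained")]        # 7
female_gain = cells[("female", "trained")] - cells[("female", "untrained")]  # 5
interaction = male_gain - female_gain                                        # 2

print(male_mean, female_mean, trained_mean, untrained_mean, interaction)
```

A nonzero difference of differences (here, 2 points) is the descriptive signature of an interaction: training helps men slightly more than women in these cell means.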

SPSS Output - Tests of Between-Subjects Effects

  • Output summary includes:
    • Corrected Model: 1541.813 (df = 3, Mean Square = 513.938, p < .001)
    • Gender: Significant (p < .001)
    • Training: Significant (p < .001)
    • Gender × Training Interaction: Significant (p < .001)
    • Overall R squared = 0.777 (Adjusted R Squared = 0.759).

Estimated Marginal Means of Performance in SPSS Output

  • The SPSS output visually presents performance data by gender under training and no training conditions, enabling easy comparison and evaluation using error bars for 95% confidence intervals.