Extend ANOVA concepts to repeated-measures and factorial designs.
Variance Partitioning
Total variance: Represents the variation among all scores in a dataset.
Between-groups variance: Variation attributed to differences in group means.
Within-groups variance: Refers to unexplained or random variation among individual scores within a group.
ANOVA relies on sums of squares to compare total variance against between- and within-groups variance.
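As a sketch of this partition (toy data in pure Python, not from the source), the three sums of squares can be computed directly and shown to add up:

```python
# Variance partitioning for a one-way design (illustrative scores).
groups = [
    [4.0, 5.0, 6.0],   # hypothetical Group 1
    [7.0, 8.0, 9.0],   # hypothetical Group 2
    [1.0, 2.0, 3.0],   # hypothetical Group 3
]

scores = [x for g in groups for x in g]
grand_mean = sum(scores) / len(scores)

# Total SS: every score's squared deviation from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in scores)

# Between-groups SS: each group mean's squared deviation from the
# grand mean, weighted by group size.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

# Within-groups SS: each score's squared deviation from its own group mean.
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

print(ss_total, ss_between, ss_within)  # 60.0 54.0 6.0 — total = between + within
```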
The F-Ratio & Significance
The F-ratio is defined as:
F = \frac{MS_{Between}}{MS_{Within}}
A large F suggests that there are significant systematic differences among group means, while a small F indicates that most variability can be attributed to random error.
A p-value < 0.05 provides evidence that not all group means are equal, indicating statistical significance.
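A minimal sketch of the F-ratio computation on illustrative data (pure Python; the groups are hypothetical, not from the source):

```python
# F-ratio for a one-way ANOVA on three small illustrative groups.
groups = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.0, 2.0, 3.0]]
scores = [x for g in groups for x in g]
grand_mean = sum(scores) / len(scores)

ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

k, n = len(groups), len(scores)
ms_between = ss_between / (k - 1)   # df_between = k - 1 = 2
ms_within = ss_within / (n - k)     # df_within = n - k = 6
f_ratio = ms_between / ms_within

print(f_ratio)  # 27.0 — between-groups variance far exceeds error variance
```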
What ANOVA Doesn’t Tell Us
A significant F value reveals that at least one group mean differs from others; however, it does not specify which group means are significantly different.
Post hoc tests are necessary to identify specific group differences after finding a significant overall ANOVA.
What Post Hoc Tests Do
Post hoc tests perform pairwise comparisons following a significant F test, allowing researchers to understand which specific means are different.
These tests adjust for multiple comparisons to mitigate the chance of Type I errors (α).
Common examples of post hoc tests include:
Tukey's HSD
Bonferroni
Scheffé
Controlling Error
Familywise α: Represents the probability of encountering at least one Type I error among all tests conducted.
Post hoc tests implement stricter thresholds for statistical significance to compensate for the increased chance of error that comes from conducting multiple comparisons.
For instance, Tukey’s HSD maintains an overall α level of 0.05.
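A quick sketch of why the correction matters, assuming independent tests (an approximation) and a hypothetical four-group study:

```python
# Familywise Type I error across m comparisons, and the Bonferroni fix.
alpha = 0.05
k = 4                                # hypothetical number of groups
m = k * (k - 1) // 2                 # all pairwise comparisons -> 6
familywise = 1 - (1 - alpha) ** m    # P(at least one false positive)
bonferroni_alpha = alpha / m         # stricter per-comparison threshold

print(round(familywise, 3), round(bonferroni_alpha, 4))  # 0.265 0.0083
```

Running six tests at α = .05 inflates the familywise error to roughly 26%; dividing α by the number of comparisons pulls it back toward .05 overall.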
Common Post Hoc Tests
Tukey’s HSD: This test is mainly used when group sizes are equal; it's the most common post hoc test.
Bonferroni: A straightforward approach that is conservative and effective, especially when dealing with unequal sample sizes.
Scheffé: Highly flexible and considered the most conservative option. It can be used for complex or unplanned contrasts.
When to Use Each Post Hoc Test
Tukey's HSD: Ideal for situations where groups are of equal size.
Bonferroni: Best suited for planned or limited comparisons.
Scheffé: Recommended for complex analyses or unplanned contrasts that require flexibility.
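As a sketch of how such pairwise comparisons can be run in software, SciPy (version 1.8 or later) provides `scipy.stats.tukey_hsd`; the groups below are illustrative, not the course data:

```python
# Tukey's HSD on three equal-sized illustrative groups (SciPy >= 1.8).
from scipy.stats import tukey_hsd

group_a = [4.0, 5.0, 6.0]
group_b = [7.0, 8.0, 9.0]
group_c = [1.0, 2.0, 3.0]

res = tukey_hsd(group_a, group_b, group_c)
# res.pvalue is a k-by-k matrix of adjusted p-values, one per pair.
print(res.pvalue[0, 1])  # adjusted p for A vs. B
```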
Reading SPSS Output
Descriptive Statistics
The output provides key statistics including:
Sample Size (N)
Mean
Standard Deviation
Standard Error
95% Confidence Interval for each cohort.
Example data:
For Cohort A: N = 26, Mean = 69.6396, Std. Deviation = 8.57670, 95% CI = [66.1754, 73.1038]
For Cohort B: N = 31, Mean = 77.4394, Std. Deviation = 6.63746, 95% CI = [75.0047, 79.8740]
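The Cohort A interval can be reproduced from N, the mean, and the standard deviation, using the t distribution for the critical value (a SciPy sketch):

```python
# Reconstructing Cohort A's 95% CI from the reported summary statistics.
from math import sqrt
from scipy.stats import t

n, mean, sd = 26, 69.6396, 8.57670
se = sd / sqrt(n)                  # standard error of the mean
t_crit = t.ppf(0.975, df=n - 1)    # two-tailed 95% critical value, df = 25
lower, upper = mean - t_crit * se, mean + t_crit * se

print(round(lower, 4), round(upper, 4))  # 66.1754 73.1038, matching the output
```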
SPSS Output - Multiple Comparisons (Tukey HSD)
This table shows the mean differences between groups along with standard errors and significance levels for the pairwise comparisons conducted:
Example comparison results show:
Cohort A vs. Cohort B: Mean Difference = -7.79974, Significant (p = .003)
Cohort C vs. Cohort D: Mean Difference = -9.03038, Significant (p < .001)
Reading Results
Focus should be placed on significantly differing group pairs. For instance:
If Group 1 and Group 3 differ significantly (p = .01) while others do not, this should be noted for further discussion.
Why Effect Size Matters
Statistical Significance ≠ Practical Importance: Just because a difference is statistically significant does not mean it has practical implications.
Effect size provides insight into the magnitude of group differences.
It communicates how much of the variation in outcomes can be explained by the independent variable (IV).
η² vs. Cohen’s d
η² (Eta-squared): Represents the proportion of total variance explained by the independent variable in ANOVA.
Cohen’s d: Represents the mean difference expressed in standard deviation units, typically used in t-tests.
Both metrics reflect the strength of a relationship; η² is proportion-based while Cohen’s d focuses on mean differences in relation to variability.
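A sketch of Cohen's d on toy two-group data (pure Python; the values are illustrative), showing the mean difference expressed in pooled-standard-deviation units:

```python
# Cohen's d for two illustrative groups.
from math import sqrt

g1 = [4.0, 5.0, 6.0]
g2 = [7.0, 8.0, 9.0]
n1, n2 = len(g1), len(g2)
m1, m2 = sum(g1) / n1, sum(g2) / n2

# Pooled SD weights each group's sample variance by its degrees of freedom.
var1 = sum((x - m1) ** 2 for x in g1) / (n1 - 1)
var2 = sum((x - m2) ** 2 for x in g2) / (n2 - 1)
pooled_sd = sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))

d = (m2 - m1) / pooled_sd
print(d)  # 3.0 — the means differ by three pooled standard deviations
```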
Formula and Intuition for η²
The formula for computing η² is:
η² = \frac{SS_{Between}}{SS_{Total}}
Where:
SS_{Total} = SS_{Between} + SS_{Within}
Interpretation of η² values:
0.01 = small effect, 0.06 = medium effect, 0.14 = large effect.
For example, η² = 0.25 indicates that approximately 25% of the variance in scores is explained by the condition.
Example Calculation
Given:
SS_{Between} = 240
SS_{Within} = 720
Thus, SS_{Total} = 240 + 720 = 960
Calculation of η²:
η² = \frac{SS_{Between}}{SS_{Total}} = \frac{240}{960} = 0.25
Interpretation: 25% of the total variance is attributed to group differences.
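The worked example translates directly into code:

```python
# Eta-squared from the worked example's sums of squares.
ss_between = 240.0
ss_within = 720.0
ss_total = ss_between + ss_within   # 960.0
eta_squared = ss_between / ss_total

print(eta_squared)  # 0.25 -> 25% of total variance explained (a large effect)
```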
The Complete Story of ANOVA
Is F significant?
How large is η²?
Which groups differ (post hoc)?
Each of these questions addresses unique aspects of the research inquiry.
Interpreting F and p Values
An example interpretation might include:
F(2, 27) = 4.63, p = .019: because p < .05, you reject the null hypothesis.
This suggests that at least one group mean differs significantly from the others.
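The reported p-value can be recovered from F and its degrees of freedom via the F distribution's survival (upper-tail) function, as a SciPy sketch:

```python
# p-value for F(2, 27) = 4.63 from the F distribution (SciPy).
from scipy.stats import f

p = f.sf(4.63, dfn=2, dfd=27)   # probability of an F this large under H0
print(round(p, 3))  # 0.019, matching the reported value
```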
Interpreting η² and Post Hoc Results
If η² = .25, it indicates a large effect size.
Post hoc comparisons might reveal Group A > Group C, with significance at p = .01. This illustrates both the significance and the strength of the effect.
SPSS Output - ANOVA Results for Satisfaction
Sum of Squares Summary: This includes:
Between Groups: 1187.373 (df = 4, Mean Square = 296.843)
Within Groups: 9596.565 (Error df = 141, Mean Square = 68.061)
Example outputs show mean differences, confidence intervals, and significance between different types of industries:
For Academic vs Community: Mean Difference = 0.93041, not statistically significant (p = 0.993).
For Nonprofit comparison, a significant result is found (p < .001).
Repeated Measures ANOVA
Definition: Involves measuring the same participants across multiple conditions (e.g., pretest, posttest, follow-ups).
Advantages: It controls for individual differences by using the same individuals under different conditions.
This approach extends the capabilities of the paired t-test beyond just two timepoints.
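A pure-Python sketch (illustrative scores, not course data) of what "controls for individual differences" means computationally: the subject sum of squares is removed from the error term before forming F.

```python
# Repeated-measures partition: rows = subjects, columns = conditions
# (e.g., pretest, posttest, follow-up). Scores are illustrative.
scores = [
    [3.0, 5.0, 7.0],
    [4.0, 6.0, 9.0],
    [5.0, 7.0, 8.0],
]
n_subj, n_cond = len(scores), len(scores[0])
flat = [x for row in scores for x in row]
grand = sum(flat) / len(flat)

ss_total = sum((x - grand) ** 2 for x in flat)
cond_means = [sum(row[j] for row in scores) / n_subj for j in range(n_cond)]
ss_cond = n_subj * sum((m - grand) ** 2 for m in cond_means)
subj_means = [sum(row) / n_cond for row in scores]
ss_subj = n_cond * sum((m - grand) ** 2 for m in subj_means)

# Error is what remains after removing both condition AND subject variance.
ss_error = ss_total - ss_cond - ss_subj
f_ratio = (ss_cond / (n_cond - 1)) / (ss_error / ((n_subj - 1) * (n_cond - 1)))

print(round(f_ratio, 1))  # 36.0 on this toy data
```

Because stable subject-to-subject differences are pulled out of the error term, the denominator shrinks and the design gains power over a between-subjects ANOVA on the same scores.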
Factorial ANOVA
Incorporates multiple factors (e.g., Gender × Training interaction).
Tests not only for main effects of each factor but also interactions between factors.
2-Way ANOVA Main Effects Analysis
For example, if evaluating performance based on gender and training:
Males with training = 82, males without training = 75.
Females with training = 90, females without training = 85.
The analysis then asks whether there is a significant main effect of Gender, a significant main effect of Training, or an interaction in which training affects men and women differently.
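The cell means in the example translate directly into main-effect and interaction contrasts (descriptive only; whether each is significant still requires the ANOVA F tests):

```python
# Main effects and interaction from the 2x2 cell means in the example.
male_tr, male_no = 82, 75
female_tr, female_no = 90, 85

# Main effect of Training: trained vs. untrained, averaged over gender.
training_effect = (male_tr + female_tr) / 2 - (male_no + female_no) / 2
# Main effect of Gender: females vs. males, averaged over training.
gender_effect = (female_tr + female_no) / 2 - (male_tr + male_no) / 2
# Interaction: does training help one gender more than the other?
interaction = (male_tr - male_no) - (female_tr - female_no)

print(training_effect, gender_effect, interaction)  # 6.0 9.0 2
```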
SPSS Output - Tests of Between-Subjects Effects
Output summary includes:
Corrected Model: 1541.813 (df = 3, Mean Square = 513.938, p < .001)
Gender: Significant (p < .001)
Training: Significant (p < .001)
Gender × Training Interaction: Significant (p < .001)
Overall R squared = 0.777 (Adjusted R Squared = 0.759).
Estimated Marginal Means of Performance in SPSS Output
The SPSS output visually presents performance data by gender under training and no training conditions, enabling easy comparison and evaluation using error bars for 95% confidence intervals.