Module 3: Statistics, Part 5
Analysis of Variance (ANOVA) - Comparing More Than Two Means
Introduction
Focus on analysis comparing more than two means.
Reminder about available resources (online quizzes, Q&A sessions, and email consultations).
Transition from t-test (for two groups) to F-test for more than two groups.
Importance of ANOVA
Comparison Context: e.g., Comparing lifespan of patients from four hospitals.
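Multiple comparisons: Comparing four groups pairwise requires 6 separate t-tests, and every additional test raises the chance of at least one false positive.
Type I Error: Probability of rejecting the null hypothesis when it is actually true.
With 6 pairwise tests, each at the 0.05 level, the overall Type I error rate rises to about 0.265 (1 - 0.95^6, assuming independent tests), well above the desired 0.05 significance level.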
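The 0.265 figure can be reproduced with a short calculation. This is a minimal sketch that assumes the 6 tests are independent; the variable names are illustrative only.

```python
from math import comb

alpha = 0.05          # significance level of each individual test
num_groups = 4        # four hospitals

# Number of distinct pairs among the groups: C(4, 2) = 6
num_pairs = comb(num_groups, 2)

# Probability of at least one false positive across all tests,
# assuming the tests are independent of one another
familywise_error = 1 - (1 - alpha) ** num_pairs

print(num_pairs)                    # 6
print(round(familywise_error, 3))   # 0.265
```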
What is ANOVA?
Definition: One-way analysis of variance (ANOVA) is used to compare the means of more than two groups.
Similarities to the t-test: Same general assumptions of normality, equal variances across groups, and independence of observations.
Hypotheses in ANOVA
Null Hypothesis (H0): All group population means are the same.
Alternative Hypothesis (H1): At least one group population mean differs from the others.
Components of Variation in ANOVA
Two main components of variation:
Between-group Variation: Differences among the group means, attributable to group membership.
Within-group Variation: Random or unexplained variation within groups.
Example: In a comparison of severe vs. mild disease groups, either the between-group difference dominates, or large within-group variation masks it.
Example of Therapy Impact on Anxiety Levels
Study Design: Three groups of students (A, B, and C) receive 5, 10, and 15 hours of therapy, respectively.
Anxiety is assessed after therapy; the data are balanced because the groups are of equal size.
Box Plot Representation: Helps visualize differences; group B scores higher than groups C and A.
Conducting ANOVA (Calculation Overview)
Total Variation: Calculated like the variance in previous modules, using squared deviations from the overall mean of all observations (150 students).
Partitioning Total Variation:
Between-group Sum of Squares (SSB)
Within-group Sum of Squares (SSW)
Degrees of Freedom: Different for each component: n-1 for the total, k-1 for between groups, and n-k for within groups, where n is the total number of observations and k is the number of groups.
Conversion to Mean Squares:
Total mean square = Total SS / (n-1)
Between Mean Square = SSB / (k-1)
Within Mean Square = SSW / (n-k)
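The partition of total variation and the conversion to mean squares can be sketched as follows. The scores below are made-up illustrative values (three groups of 5), not the study's data, and the function name `sums_of_squares` is hypothetical.

```python
def sums_of_squares(groups):
    """Return (total SS, SSB, SSW) for a list of groups of observations."""
    all_obs = [x for g in groups for x in g]
    grand_mean = sum(all_obs) / len(all_obs)
    # Total SS: squared deviations of every observation from the overall mean
    sst = sum((x - grand_mean) ** 2 for x in all_obs)
    # SSB: squared deviations of group means from the overall mean,
    # weighted by group size
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # SSW: squared deviations of observations from their own group mean
    ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return sst, ssb, ssw

# Hypothetical anxiety scores for groups A, B, C (illustration only)
groups = [[4, 5, 6, 5, 5], [7, 8, 9, 8, 8], [5, 6, 7, 6, 6]]
n = sum(len(g) for g in groups)   # total number of observations
k = len(groups)                   # number of groups

sst, ssb, ssw = sums_of_squares(groups)
total_ms = sst / (n - 1)   # Total mean square
msb = ssb / (k - 1)        # Between Mean Square
msw = ssw / (n - k)        # Within Mean Square

print(round(sst, 3), round(ssb, 3), round(ssw, 3))
print(round(total_ms, 3), round(msb, 3), round(msw, 3))
```

In this sketch the total SS equals SSB + SSW (up to floating-point rounding), which is the partition described above.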
Statistical Ratio in ANOVA
F-statistic: Ratio of the Between Mean Square to the Within Mean Square (F = MSB/MSW, where MSB = SSB/(k-1) and MSW = SSW/(n-k)).
Indicates how large the differences between groups are relative to the variation within groups; larger values are stronger evidence against the null hypothesis.
Distribution: Under the null hypothesis, the F-statistic follows an F-distribution whose shape depends on the degrees of freedom of SSB (k-1) and SSW (n-k).
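As a minimal sketch of the ratio, the function below turns sums of squares into an F-statistic with its two degrees of freedom. The name `f_statistic` and the input numbers are hypothetical and chosen only for easy arithmetic.

```python
def f_statistic(ssb, ssw, k, n):
    """Return (F, df_between, df_within) from sums of squares.

    k is the number of groups and n is the total number of observations.
    """
    df_between = k - 1
    df_within = n - k
    msb = ssb / df_between   # Between Mean Square
    msw = ssw / df_within    # Within Mean Square
    return msb / msw, df_between, df_within

# Hypothetical values: SSB = 24, SSW = 6, 3 groups, 15 observations
f_value, df1, df2 = f_statistic(ssb=24.0, ssw=6.0, k=3, n=15)
print(f_value, df1, df2)   # 24.0 2 12
```

The p-value is the upper-tail area of the F-distribution with (df1, df2) degrees of freedom beyond the observed F. If SciPy is available, `scipy.stats.f.sf(f_value, df1, df2)` is one way to obtain it.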
Results Interpretation
Example yields F-statistic (e.g., 18.