Module 3: Statistics, Part 5

Analysis of Variance (ANOVA) - Comparing More Than Two Means

Introduction

  • Focus on analysis comparing more than two means.

  • Reminder about available resources (online quizzes, Q&A sessions, and email consultations).

  • Transition from t-test (for two groups) to F-test for more than two groups.

Importance of ANOVA

  • Comparison Context: e.g., Comparing lifespan of patients from four hospitals.

  • Multiple comparisons: four groups require 6 pairwise t-tests, and every extra test raises the chance of a false positive.

    • Type I Error: Probability of rejecting the null hypothesis when it is actually true.

    • Six tests, each at the 0.05 level, give an overall Type I error of about 0.265 (1 - 0.95^6, assuming independent tests), far above the desired 0.05 significance level.
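The 0.265 figure can be reproduced with a few lines of Python. This is a minimal sketch that assumes the six tests are independent, which is the usual simplification behind this number; the function name `familywise_error` is made up for illustration.

```python
from math import comb

def familywise_error(n_groups, alpha=0.05):
    """Return (number of pairwise tests, chance of at least one false positive).

    Assumes the pairwise tests are independent of each other.
    """
    n_pairs = comb(n_groups, 2)          # number of distinct group pairs
    return n_pairs, 1 - (1 - alpha) ** n_pairs

n_pairs, fwer = familywise_error(4)      # four hospitals
print(n_pairs, round(fwer, 3))           # 6 0.265
```

With more groups the inflation grows quickly, which is the motivation for a single overall test.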

What is ANOVA?

  • Definition: One-Way Analysis of Variance (ANOVA) is used to compare the means of more than two groups.

  • Similarities to T-Test: Same general assumptions of normality, equal variances, and independence of observations.

Hypotheses in ANOVA

  • Null Hypothesis (H0): All group population means are the same.

  • Alternative Hypothesis (H1): At least one group population mean differs from the others.

Components of Variation in ANOVA

  • Two main components of variation:

    • Between-group Variation: Difference due to groups.

    • Within-group Variation: Random or unexplained variation within groups.

  • Example: Comparing severe vs mild disease groups shows how between-group differences can dominate, or be masked by, within-group variation.

Example of Therapy Impact on Anxiety Levels

  • Study Design: Three groups of students (A, B, C) receive 5, 10, and 15 hours of therapy, respectively.

  • Outcome: Anxiety level assessed after therapy; the data are balanced (equal group sizes).

  • Box Plot Representation: Helps visualize differences; group B is higher than groups A and C.

Conducting ANOVA (Calculation Overview)

  • Total Variation: Calculated as for the variance in previous modules, using squared deviations from the overall mean of all observations (150 students).

  • Partitioning Total Variation:

    • Between-group Sum of Squares (SSB)

    • Within-group Sum of Squares (SSW)

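  • Degrees of Freedom: Different for each sum of squares component: n-1 for the total, k-1 for between groups, and n-k for within groups, where n is the total number of observations and k is the number of groups.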

  • Conversion to Mean Squares:

    • Total mean square = Total SS / (n-1)

    • Between Mean Square = SSB / (k-1)

    • Within Mean Square = SSW / (n-k)
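The partition and the mean squares can be sketched in plain Python as follows. The data are small made-up numbers chosen for easy arithmetic, not the therapy study's values, and the function name `anova_partition` is hypothetical.

```python
def anova_partition(groups):
    """Split total variation into between- and within-group parts.

    groups: list of lists of numbers, one inner list per group.
    """
    values = [x for g in groups for x in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]

    ss_total = sum((x - grand_mean) ** 2 for x in values)
    # Between: each group's mean against the overall mean, weighted by group size
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within: each observation against its own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    return {
        "ss_total": ss_total,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ms_total": ss_total / (n - 1),
        "ms_between": ss_between / (k - 1),
        "ms_within": ss_within / (n - k),
    }

# Made-up illustration data: three groups of three observations
result = anova_partition([[4, 5, 6], [7, 8, 9], [1, 2, 3]])
print(result["ss_total"], result["ss_between"], result["ss_within"])  # 60.0 54.0 6.0
```

In this toy example SSB + SSW equals the total sum of squares (54 + 6 = 60), which is the partition described above.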

Statistical Ratio in ANOVA

  • F-statistic: Ratio of the between-group mean square to the within-group mean square (F = MSB/MSW).

  • Indicates the strength of the difference between groups; larger values give stronger evidence that the group means differ.

  • Distribution: Under the null hypothesis, the F-statistic follows an F-distribution with k-1 (between) and n-k (within) degrees of freedom.
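As a rough illustration, the F-statistic and its two degrees of freedom can be computed from raw group data as below. The numbers are invented for the example and are not the therapy study's data; `f_statistic` is a hypothetical helper name.

```python
def f_statistic(groups):
    """Return (F, df_between, df_within) for a one-way layout."""
    values = [x for g in groups for x in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n

    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        mean_g = sum(g) / len(g)
        ss_between += len(g) * (mean_g - grand_mean) ** 2
        ss_within += sum((x - mean_g) ** 2 for x in g)

    df_between, df_within = k - 1, n - k
    # F is a ratio of mean squares, not of raw sums of squares
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Invented data: three groups of three observations
f, df1, df2 = f_statistic([[2, 3, 4], [6, 7, 8], [4, 5, 6]])
print(f, df1, df2)  # 12.0 2 6
```

Turning F into a p-value needs the F-distribution with (df1, df2) degrees of freedom. If SciPy is available, `scipy.stats.f_oneway` reports the same statistic together with a p-value.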

Results Interpretation

  • Example yields F-statistic (e.g., 18.