ANOVA: Stands for Analysis of Variance. It is a hypothesis test used to determine if there are differences between the means of three or more groups.
2 Sample t Tests: Hypothesis tests for comparing the means of two independent groups.
Equal Variability: Use pooled t test.
Unequal Variability: Use the non-pooled t test, which accommodates unequal variances.
Scenario: Rosanna compares average mileage of two gasoline brands.
She takes a sample of 4 from each brand, records the distance traveled, and calculates the mean and standard deviation for each brand.
Null Hypothesis (Ho): µ₁ = µ₂ (no difference in means).
Alternative Hypothesis (Ha): µ₁ ≠ µ₂ (there is a difference).
Assumptions Check:
Independent Populations: Brands are independent groups.
Independent Observations: Random sampling ensures independence.
Normal Distribution: Distances follow a normal distribution.
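Equal Variability: Check the ratio of the larger standard deviation to the smaller one. Here 0.66 / 0.44 = 1.5, which is less than 2, so by this rule of thumb the equal-variance assumption is reasonable.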
Test Statistic Calculation:
Use the formula:
\[ t_0 = \frac{\bar{y}_1 - \bar{y}_2 - d_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \]
Values: \( \bar{y}_1 = 16, \ \bar{y}_2 = 18, \ d_0 = 0, \ n_1 = n_2 = 4 \)
Calculate the pooled standard deviation \( s_p \) using:
\[ s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \]
The resulting t statistic is approximately -5.0529.
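The calculation above can be sketched in Python from the summary statistics alone. This is a minimal sketch rather than the notes' own computation: the standard deviations 0.66 and 0.44 come from the equal-variability check, and which brand each belongs to is an assumption (with equal sample sizes it does not change the pooled result). Because those two values are rounded, the sketch lands near -5.04 rather than exactly on the value reported in the notes.

```python
import math

# Summary statistics from the scenario. Which standard deviation belongs
# to which brand is an assumption; the notes only give the ratio 0.66 / 0.44.
ybar1, ybar2 = 16.0, 18.0
s1, s2 = 0.66, 0.44
n1, n2 = 4, 4
d0 = 0.0  # hypothesized difference in means under the null

# Rule-of-thumb check: larger SD divided by smaller SD should be below 2.
sd_ratio = max(s1, s2) / min(s1, s2)

# Pooled standard deviation, then the pooled t statistic.
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
t0 = (ybar1 - ybar2 - d0) / (sp * math.sqrt(1 / n1 + 1 / n2))
df = n1 + n2 - 2

print(f"SD ratio = {sd_ratio:.2f}, s_p = {sp:.4f}, t0 = {t0:.4f}, df = {df}")
```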
P-value Calculation:
Two-tailed test: find the area in both tails of the t distribution with \( n_1 + n_2 - 2 = 6 \) degrees of freedom.
This gives a p-value of 0.002327, which is less than \( \alpha = 0.05 \).
Conclusion: Reject the null hypothesis. There is statistically significant evidence of a difference in average mileage between the two brands.
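The two-tailed area can be checked without a statistics library. The sketch below integrates the t density numerically with Simpson's rule, using only the standard library; the function names `t_pdf` and `two_tailed_p` are illustrative, not from the notes. Given the t statistic of -5.0529 and 6 degrees of freedom, it should agree with the p-value of about 0.0023 stated above.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p-value: 1 minus the central area between -|t| and |t|.

    The central area is approximated with Simpson's rule (steps must be even).
    """
    a = abs(t_stat)
    h = 2 * a / steps
    total = t_pdf(-a, df) + t_pdf(a, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(-a + i * h, df)
    central_area = total * h / 3
    return 1 - central_area

alpha = 0.05
p_value = two_tailed_p(-5.0529, df=6)
print(f"p-value = {p_value:.6f}, reject H0: {p_value < alpha}")
```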
Comparing more than two groups with multiple pairwise t tests increases the probability of making at least one Type I error.
ANOVA is introduced to handle such comparisons with a single test.
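A rough sketch of how quickly that error probability grows: with k groups there are k(k - 1)/2 pairwise tests, and if each is run at level alpha, the chance of at least one false rejection is about 1 - (1 - alpha)^m for m tests. The formula assumes the tests are independent, which pairwise t tests sharing groups are not, so treat the numbers as an approximation. The function name `familywise_error` is illustrative.

```python
import math

def familywise_error(k, alpha=0.05):
    """Approximate chance of at least one Type I error across all pairwise tests."""
    m = math.comb(k, 2)              # number of pairwise comparisons among k groups
    return m, 1 - (1 - alpha) ** m   # assumes the m tests are independent

for k in (3, 4, 5):
    m, fwer = familywise_error(k)
    print(f"k = {k}: {m} pairwise tests, familywise error about {fwer:.3f}")
```

For three groups this gives roughly 0.143, nearly three times the nominal 0.05.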
Subscripts (1, 2,...,k) denote different groups.
k: Total number of groups.
N: Total number of observations across all groups.
Purpose: To compare means across multiple groups.
Hypotheses:
Null Hypothesis (Ho): All group means are equal (µ₁ = µ₂ = ... = µ_k).
Alternative Hypothesis (Ha): Not all means are equal. At least one mean is different.
Assumptions: Independence, normality of distribution, and equal variances among groups.
Calculation Process:
Calculate Test Statistic: Use the F ratio of between-group variability to within-group variability.
Calculate P-value: From the F-distribution.
Decision: Reject or do not reject Ho based on P-value compared to alpha level.
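The last two steps can be sketched with the standard library alone. The upper-tail area of the F distribution equals the regularized incomplete beta function I_x(df2/2, df1/2) with x = df2 / (df2 + df1 F), and the sketch evaluates that integral with Simpson's rule. The inputs (F = 12, k = 3, N = 9) are hypothetical values chosen for illustration, and `f_p_value` is an illustrative name.

```python
import math

def f_p_value(f_stat, df1, df2, steps=2000):
    """Upper-tail area of the F(df1, df2) distribution beyond f_stat.

    Uses P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1 * f),
    integrated numerically. Intended for f_stat > 0 and df2 >= 2;
    steps must be even.
    """
    a, b = df2 / 2, df1 / 2
    x = df2 / (df2 + df1 * f_stat)
    beta = math.gamma(a) * math.gamma(b) / math.gamma(a + b)

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3 / beta

# Hypothetical inputs: F = 12 from k = 3 groups and N = 9 observations.
k, N = 3, 9
f_stat = 12.0
alpha = 0.05
p_value = f_p_value(f_stat, df1=k - 1, df2=N - k)
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"p-value = {p_value:.4f}: {decision}")
```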
Scenario: Comparing 3 brands of gasoline.
Sample Calculations: Average and standard deviations given for each brand.
Two scenarios are presented: the brand means are similar in both, but the variability within each brand differs, so the means alone cannot show whether the brands really differ.
Data Visualization: Dot plots of the data show similar means but different spread, which is why ANOVA weighs the differences among means against the variability within groups.
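To make the two scenarios concrete, here is a small sketch with made-up mileage data; the notes do not give the actual values, so every number below is hypothetical. Both scenarios share the same brand means, but the spread within each brand is very different.

```python
from statistics import mean, stdev

# Hypothetical mileage data (not from the notes). Brand means are
# 16, 18, and 20 in both scenarios; only the within-brand spread differs.
scenario_a = {
    "Brand 1": [15.9, 16.0, 16.1],
    "Brand 2": [17.9, 18.0, 18.1],
    "Brand 3": [19.9, 20.0, 20.1],
}
scenario_b = {
    "Brand 1": [12.0, 16.0, 20.0],
    "Brand 2": [14.0, 18.0, 22.0],
    "Brand 3": [16.0, 20.0, 24.0],
}

def summarize(scenario):
    """Map each brand to its (mean, sample standard deviation)."""
    return {brand: (mean(values), stdev(values)) for brand, values in scenario.items()}

for name, scenario in (("A", scenario_a), ("B", scenario_b)):
    for brand, (avg, sd) in summarize(scenario).items():
        print(f"Scenario {name}, {brand}: mean = {avg:.1f}, SD = {sd:.2f}")
```

In scenario A the brands barely overlap, so the difference in means is convincing; in scenario B the same means are buried in the within-brand spread.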
Types of Variability for ANOVA:
Within Groups Variability (Error Mean Square, MSE): Assessment of variation within each group.
MSE = SSE / (N - k)
Between Groups Variability (Treatment Mean Square, MST): Assessment of variation among group means.
MST = SST / (k - 1)
F-ratio: \[ F = \frac{MST}{MSE} \]
Distribution Characteristics: The F-distribution is characterized by two degrees of freedom: df1 = k - 1 (numerator, between groups) and df2 = N - k (denominator, within groups).
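The quantities above can be computed directly from raw data. The sketch below uses the standard definitions, with SST as the treatment (between-groups) sum of squares and SSE as the error (within-groups) sum of squares, matching the MST and MSE formulas in the notes. The data are hypothetical and `anova_f` is an illustrative name.

```python
from statistics import mean

def anova_f(groups):
    """Return (SST, SSE, MST, MSE, F, df1, df2) for a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Between groups: squared distance of each group mean from the
    # grand mean, weighted by group size.
    sst = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: squared distance of each observation from its
    # own group mean.
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df1, df2 = k - 1, n_total - k
    mst, mse = sst / df1, sse / df2
    return sst, sse, mst, mse, mst / mse, df1, df2

# Hypothetical mileage data for three brands (not from the notes).
groups = [[15, 16, 17], [17, 18, 19], [19, 20, 21]]
sst, sse, mst, mse, f_stat, df1, df2 = anova_f(groups)
print(f"SST = {sst}, SSE = {sse}, MST = {mst}, MSE = {mse}")
print(f"F = {f_stat} with df1 = {df1}, df2 = {df2}")
```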
Columns in ANOVA Table:
Source of variability (Between vs. Within)
Degrees of Freedom (df)
Sum of Squares (SS)
Mean Squares (MS) = SS / df
F-statistic = MST / MSE
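As a sketch of how these columns fit together, the snippet below builds and prints a two-row table from the sums of squares and group counts. The input values (SST = 24, SSE = 6, k = 3, N = 9) are hypothetical, and `anova_table` is an illustrative name rather than a library function.

```python
def anova_table(sst, sse, k, n_total):
    """Build the rows of a one-way ANOVA table as a list of dicts."""
    df_between, df_within = k - 1, n_total - k
    mst, mse = sst / df_between, sse / df_within
    return [
        {"source": "Between", "df": df_between, "SS": sst, "MS": mst, "F": mst / mse},
        {"source": "Within", "df": df_within, "SS": sse, "MS": mse, "F": None},
    ]

# Hypothetical inputs for illustration.
rows = anova_table(sst=24.0, sse=6.0, k=3, n_total=9)

print(f"{'Source':<10}{'df':>4}{'SS':>8}{'MS':>8}{'F':>8}")
for row in rows:
    # The F-statistic is reported only on the Between row.
    f_text = "" if row["F"] is None else f"{row['F']:.2f}"
    print(f"{row['source']:<10}{row['df']:>4}{row['SS']:>8.2f}{row['MS']:>8.2f}{f_text:>8}")
```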
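Key Observations: A large F-statistic means the variability between group means is large relative to the variability within groups, which is evidence against Ho. An F-statistic near 1 is consistent with equal means.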
ANOVA provides a robust method for comparing means across multiple groups efficiently without inflating the Type I error rate through multiple comparisons.
It also provides a clear framework for assessing statistical significance when more than two groups are involved.