LECTURE NOTES - Week 9

One-Way ANOVA

ANOVA (Analysis of Variance)

  • Same question as t-test: differences between means of groups

  • But for more than 2 groups

    • (as well as other complexities; next week)

  • Very different method from the t-test

    • But still comparing means of groups, estimating population parameters

  • Hypothesis:

    • There is a difference between (some of) the groups

  • Null hypothesis:

    • There is no difference between any of the groups

    • H0 : μ1 = μ2 = μ3 = μ4

  • There could be several alternative hypotheses

    • Will need to test for those separately 


An Example

  • Which basketball team is best?

    • Comparing 4 teams: WLU, UW, UoG, UWO

    • Using a free-throw contest:

      • Each player throws 50 times, count the successful throws

    • Each team has 15 players

  • Null hypothesis:

    • The mean free-throw success rate is the same for all 4 teams

    • H0 : μWLU = μUW = μUoG = μUWO

  • The ANOVA will tell us if there are any differences

    • But not which teams are different from which

    • Will need to test that separately, later
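
A minimal sketch of this design in Python, using scipy's f_oneway to run the one-way ANOVA. The scores below are simulated, so the team means and spread are invented for illustration only:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated free-throw scores (out of 50) for 15 players per team;
# the team means here are made up for illustration.
teams = {
    "WLU": rng.normal(30, 8, 15).clip(0, 50).round(),
    "UW":  rng.normal(25, 8, 15).clip(0, 50).round(),
    "UoG": rng.normal(22, 8, 15).clip(0, 50).round(),
    "UWO": rng.normal(20, 8, 15).clip(0, 50).round(),
}

# One-way ANOVA: tests H0 that all four team means are equal
f, p = stats.f_oneway(*teams.values())
print(f"F = {f:.2f}, p = {p:.4g}")
```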


Assumptions

  • Same assumptions as the t-test:

    • Normal distribution of values

      • Within each team, the scores are normally distributed

      • This is a parametric test (it makes assumptions about the population parameters)

    • Homogeneity of variance

      • The variance of scores for each team is about the same

    • Independent measures

      • The measurements are all independent

      • E.g.: if UW and WLU shared a coach, the measurements would not be independent

  • ANOVA is generally quite robust to violations of these

    • Still gives you meaningful answers


The Logic of ANOVA

  • We want to know if (any of) the groups differ on their means

    • We assume equal(ish) variances and normality 

  • We use sample statistics to estimate population parameters

    • We will be estimating the variance (of the population)

    • 2 methods:

      • Use the variance within each group as an estimate (average all groups)

      • Use the variance of the means of the groups as an estimate

  • If the null hypothesis is true:

    • Both methods should give about the same answer

  • If the null hypothesis is false:

    • The means will differ, so the variance in means will be larger

    • If the second method gives a much larger result, we reject the null


Process (In Theory)

  • We want to estimate the population variance, σ²

  • Method 1: using the variance within each group

    • We have the sample variance of each group, sj², for group j

    • We can pool the variances by averaging 

      • Assuming equal sample size

    • Call this Mean Square Error (MSerror, MSe)

  • Method 2: using the means of the groups

    • We have the mean of each group, x̄j, for group j

    • Variance of the means estimates the variance of the sampling distribution of the mean, SEM² = σ²/n

      • [sM² = variance of the group means; multiply it by n to estimate σ²]

    • Call this Mean Square Groups (MSgroups, MSg)

  • Then we compare these two estimates
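
A sketch of the two estimates side by side, assuming equal group sizes as the notes do (the function name is mine):

```python
import numpy as np

def two_variance_estimates(groups):
    """groups: list of equal-sized 1-D arrays, one per group."""
    n = len(groups[0])                        # per-group sample size
    # Method 1: pool (average) the within-group variances -> MS_error
    ms_error = np.mean([g.var(ddof=1) for g in groups])
    # Method 2: the variance of the group means estimates SEM^2 = sigma^2/n,
    # so multiplying by n gives a second estimate of sigma^2 -> MS_groups
    means = np.array([g.mean() for g in groups])
    ms_groups = n * means.var(ddof=1)
    return ms_error, ms_groups
```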


Another Approach

  • There is a lot of variability in the data

    • The scores are not all the same

  • Variability can come from 2 sources:

    • Random differences between people (within a group)

      • We call that error

    • Systematic differences between groups

  • ANOVA partitions the variance

    • How much of it comes from error within groups (related to MSerror)

    • How much of it comes from group differences (related to MSgroups)

  • If a lot comes from the groups

    • The null hypothesis is probably false


Actual Calculations

  • To get MSerror & MSgroups, we use sums of squares (SS)

    • Like we did for calculating variance

    • SS = sum of squared deviations from the mean

    • Easier to work with

  • If H0 is true, there is only one ‘real’ mean

    • Grand mean (gm); estimated by the average of the group means

  • Steps:

    • Calculate SS for all data, from grand mean = SStotal

    • Calculate SS of each group mean from grand mean = SSgroup

      • Same idea as MSgroups

      • Each squared deviation is multiplied by the group’s n (same reason as for MSgroups)

    • Find SSerror (same idea as MSerror)

      • By subtracting SSgroup from SStotal
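
The same partition in code, following the steps above (a sketch; the function name is mine):

```python
import numpy as np

def sums_of_squares(groups):
    """Partition the total SS into group and error components."""
    all_scores = np.concatenate(groups)
    gm = all_scores.mean()                    # grand mean
    ss_total = ((all_scores - gm) ** 2).sum()
    # Each group mean's squared deviation from gm is weighted by its n
    ss_group = sum(len(g) * (g.mean() - gm) ** 2 for g in groups)
    ss_error = ss_total - ss_group            # the leftover, within-group SS
    return ss_total, ss_group, ss_error
```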


Final Step

  • We want to know if MSgroups and MSerror are similar

  • Take the ratio:

    • F = MSgroups/MSerror

  • If F is large:

    • Suggests more variance between the groups

    • Suggests H0 is false

  • Compare F ratio to its distribution

    • F ratio has two df, one from the groups and one from the error

    • df1 = number of groups - 1

    • df2 = number of measurements - number of groups

      • = number of groups × (n − 1)

      • (we lose one df in each group)
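
For example, the degrees of freedom and the α = 0.05 cutoff for a design like the basketball one can be read off the F distribution (a sketch using scipy):

```python
from scipy import stats

k, n = 4, 15                        # number of groups, measurements per group
df1 = k - 1                         # 3
df2 = k * (n - 1)                   # 56
f_crit = stats.f.ppf(0.95, df1, df2)
print(df1, df2, round(f_crit, 2))   # 3 56 2.77 -> reject H0 if F > 2.77
```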


Back to Basketball

  • Are the teams different?

  • Get the SS:

    • Grand mean = 24.7

    • SStotal = 6596.6 (sum of squared difference from gm, for all data)

    • SSgroup = 3221.4 (sum of squared difference of group mean from gm, *n)

    • SSerror = 6596.6 - 3221.4 = 3375.2

  • Find the df:

    • 4 groups → df1 = 3

    • 15 in each group → df2 = 4(14) = 56

  • Find the F-ratio (MSg/MSe)

    • F(3,56) = (3221.4/3)/(3375.2/56) = 1073.8/60.27 = 17.8

    • Compare to F-table: P = 0.00000003

  • P < 0.05, so reject the null: the teams are different
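
These numbers can be checked directly from the SS values on the slide (a sketch using scipy's F distribution in place of an F-table):

```python
from scipy import stats

ss_group, ss_error = 3221.4, 3375.2
df1, df2 = 3, 56

ms_groups = ss_group / df1          # 1073.8
ms_error = ss_error / df2           # ~60.27
F = ms_groups / ms_error            # ~17.8

# p-value = upper-tail area of the F(3, 56) distribution
p = stats.f.sf(F, df1, df2)
print(round(F, 1), p)               # p is on the order of 3e-8
```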


Post-Hoc Tests

  • Ok, null is false

    • Which teams differ from which?

    • Could be all different, could be only WLU vs. UWO…

  • Need to compare within each pair of teams

    • Pairwise comparisons 

    • Post-hoc tests (we do them after the ANOVA)

  • Could just do an independent-samples t-test on each pair

    • Each time we do a t-test, we have a 5% chance of Type I error

    • If we do lots of comparisons, chances increase

      • With 4 teams, we have 6 comparisons

    • We need to control the familywise error rate

    • Bonferroni correction:

      • Adjust the criterion, α, so that the overall Type I error rate = 0.05

      • With 6 comparisons, α = 0.05/6 ≈ 0.008
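
A sketch of that correction applied to naive pairwise t-tests (reusing the simulated teams dict from the earlier sketch; the function name is mine):

```python
from itertools import combinations
from scipy import stats

def bonferroni_pairwise(teams, alpha=0.05):
    """teams: dict mapping team name -> 1-D array of scores."""
    pairs = list(combinations(teams, 2))
    adj_alpha = alpha / len(pairs)            # 0.05 / 6 with 4 teams
    for a, b in pairs:
        t, p = stats.ttest_ind(teams[a], teams[b])
        print(f"{a} vs {b}: t = {t:.2f}, p = {p:.4f}, "
              f"reject at {adj_alpha:.3f}: {p < adj_alpha}")
```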


Running Post-Hocs

  • Remember:

    • t-scores have a problem:

    • Need to estimate population variance

    • We use sample variances

      • Sometimes pooled from several groups

    • Have already done this: MSerror is our best estimate of σ²

    • Calculate t-score using MSerror instead of s²

  • Run independent-samples t-tests on pairs of groups

  • Compare t to our adjusted criterion

    • Based on the Bonferroni correction

  • Textbook recommends the LSD (least significant difference) test

    • Very few people use that
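
A sketch of that protected t-score: the standard form divides the difference in means by √(MSerror·(1/n1 + 1/n2)) and uses the ANOVA's error df. The team means in the example call are hypothetical:

```python
import math
from scipy import stats

def posthoc_t(mean1, mean2, n1, n2, ms_error, df_error):
    """Pairwise t-test using MS_error as the variance estimate."""
    se = math.sqrt(ms_error * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    p = 2 * stats.t.sf(abs(t), df_error)      # two-tailed, df from the ANOVA
    return t, p

# Hypothetical team means of 30 and 20, with the basketball MS_error:
print(posthoc_t(30, 20, 15, 15, 60.27, 56))
```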


Planning Comparisons

  • Goal of statistics is to find meaningful effects in the world

  • We don’t want to just run math on everything

  • We need to focus our questions

    • What do we really care about?

    • Maybe: only whether WLU is different from (any of) the other teams

    • We can run just those comparisons, not all

    • Won’t increase our familywise error rate as much

      • We can adjust α by less

      • E.g.: WLU vs. each other team = 3 comparisons

      • α = 0.05/3 ≈ 0.017

    • Planned comparisons 

  • Need to understand what we are doing and why


Effect Sizes

  • Can use Cohen’s d

    • On each pair of groups separately

    • Replace the denominator with √MSerror, a better estimate of the SD

  • Can use another measure

    • ANOVA partitions the variance

    • How much is differences between groups (SSgroup)

    • How much is differences within groups (SSerror)

    • We care about the proportion of the total variance (SStotal) that is due to groups 

    • η² = SSgroup/SStotal (Greek letter eta)

    • Or use ω² (Greek letter omega)

      • Formula not important; a variation on η²

    • JASP can do both
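
A sketch of both measures in code, using the basketball numbers. The ω² formula shown is the common one-way version, written out only for completeness:

```python
import math

def effect_sizes(ss_group, ss_error, ss_total, df1, df2):
    ms_error = ss_error / df2
    eta_sq = ss_group / ss_total      # proportion of variance due to groups
    # omega^2: a less biased variant of eta^2 (standard one-way formula)
    omega_sq = (ss_group - df1 * ms_error) / (ss_total + ms_error)
    return eta_sq, omega_sq

def cohens_d(mean1, mean2, ms_error):
    # sqrt(MS_error) is the ANOVA's pooled estimate of the SD
    return (mean1 - mean2) / math.sqrt(ms_error)

# With the basketball SS values:
print(effect_sizes(3221.4, 3375.2, 6596.6, 3, 56))   # ~(0.49, 0.46)
```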