Comparing the means of more than two groups
Steps for comparing multiple groups
Analysis of variance (ANOVA)
• Similar in concept to a t-test: compares the means of a numerical variable for data grouped by a categorical variable (factor)
• However, it can compare two or more groups (unlike a t-test, which only compares 2 groups)
• It tests whether all the groups have the same mean (or not)
An ANOVA partitions the variance in the data
Total variation = variation among groups + variation within groups
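A minimal Python sketch (with made-up numbers) showing that these two pieces add up to the total variation:

```python
import numpy as np

# Made-up measurements for three groups (illustration only)
groups = [
    np.array([4.1, 5.0, 4.8, 5.3]),
    np.array([6.2, 5.9, 6.5, 6.1]),
    np.array([5.0, 4.7, 5.5, 5.2]),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()

# Variation among groups: how far each group mean sits from the grand mean
ss_groups = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

# Variation within groups: how far each observation sits from its own group mean
ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Total variation: how far each observation sits from the grand mean
ss_total = ((all_values - grand_mean) ** 2).sum()

print(ss_groups + ss_error)  # equals ss_total (up to floating-point rounding)
print(ss_total)
```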
How to interpret this chart
Component | What It Means |
---|---|
Source of Variation | What’s causing the differences? Two sources: Groups and Error |
Sum of Squares (SS) | Measures the variability: SS<sub>groups</sub> among the group means, SS<sub>error</sub> within the groups |
Degrees of Freedom (df) | Number of values free to vary: df<sub>groups</sub> = k - 1, df<sub>error</sub> = N - k |
Mean Squares (MS) | Just the average variation: MS = SS ÷ df |
F-ratio | F = MS<sub>groups</sub> ÷ MS<sub>error</sub> |
p-value (P) | Tells you if the F-ratio is large enough to be statistically significant (i.e., likely due to real differences rather than chance) |
📐 How to Interpret Results
If p < 0.05 → reject the null hypothesis:
✅ At least one group mean is significantly different.
If p ≥ 0.05 → fail to reject the null:
❌ No evidence of a difference in means
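A rough sketch of this decision rule in Python, using SciPy's f_oneway on made-up data:

```python
from scipy import stats

# Made-up data: one numerical response measured in three groups
group_a = [4.1, 5.0, 4.8, 5.3]
group_b = [6.2, 5.9, 6.5, 6.1]
group_c = [5.0, 4.7, 5.5, 5.2]

# One-way ANOVA: tests H0 that all group means are equal
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)

if p_value < 0.05:
    print(f"F = {f_stat:.2f}, p = {p_value:.4f}: at least one group mean differs")
else:
    print(f"F = {f_stat:.2f}, p = {p_value:.4f}: no evidence of a difference in means")
```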
Null distribution of the F statistic
How to read an F-distribution table
📖 How to Read It Step-by-Step
1. Know Your Two Degrees of Freedom (df):
In ANOVA, you need:
df₁ (numerator): between-groups degrees of freedom → usually k - 1
df₂ (denominator): within-groups degrees of freedom → usually N - k
📝 Example: If you're comparing 3 groups (k = 3) with a total of 30 subjects (N = 30):
df₁ = 3 - 1 = 2
df₂ = 30 - 3 = 27
2. Decide on Your Significance Level (α):
Usually:
α = 0.05 (most common)
α = 0.01 (stricter)
You will use the table (or section) for your α-level.
3. Find Your Critical F-value in the Table:
Go to the column for df₁ (numerator)
Go down to the row for df₂ (denominator)
Look under the correct α (usually a separate table or section)
✅ This value is your critical F-value
4. Compare Your Calculated F-statistic:
If... | Then... |
---|---|
F<sub>calculated</sub> > F<sub>critical</sub> | ✅ Reject H₀ → Significant difference |
F<sub>calculated</sub> < F<sub>critical</sub> | ❌ Fail to reject H₀ → Not significant |
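A small sketch of the same lookup and comparison, using SciPy's F distribution in place of a printed table (the calculated F here is a made-up value):

```python
from scipy import stats

# Example from above: k = 3 groups, N = 30 subjects in total
k, N = 3, 30
df1 = k - 1   # numerator (between-groups) df = 2
df2 = N - k   # denominator (within-groups) df = 27
alpha = 0.05

# Critical F-value: the cutoff that leaves alpha in the right tail
f_critical = stats.f.ppf(1 - alpha, df1, df2)

f_calculated = 4.2  # made-up F-ratio from an ANOVA table
if f_calculated > f_critical:
    print(f"F = {f_calculated:.2f} > F_crit = {f_critical:.2f}: reject H0")
else:
    print(f"F = {f_calculated:.2f} <= F_crit = {f_critical:.2f}: fail to reject H0")
```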
Conclusion from the ANOVA
H₀: μ₁ = μ₂ = μ₃
Hₐ: at least one mean (μᵢ) is different
P < 0.01, therefore we reject the null hypothesis: at least one of the group means is significantly different from another
Variation explained: R² (“R-squared”)
R² = SS<sub>groups</sub> ÷ SS<sub>total</sub> (SS = sum of squares)
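A minimal sketch, assuming the sums of squares have already been read off an ANOVA table (numbers made up):

```python
# Made-up sums of squares from an ANOVA table
ss_groups = 7.85   # variation among group means
ss_error = 2.31    # variation within groups
ss_total = ss_groups + ss_error

# R-squared: fraction of the total variation explained by the grouping variable
r_squared = ss_groups / ss_total
print(f"R^2 = {r_squared:.3f}")  # here about 0.77
```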
SUMMARY OF HOW ANOVA WORKS
Example of an ANOVA when there is no significant difference between the group means
• F-ratio is small (expected value ≈ 1.0 under the null hypothesis that the means of all groups are the same)
• R² is small (groups explain only a small amount of the overall variation in the data)
• Large area to the right of the observed F under the F-distribution curve (P > 0.05)
The Tukey-Kramer test
Performed only if ANOVA shows a significant difference among group means
• Compares each of the group means against each other (pairwise comparisons)
When to use Tukey-Kramer
You run an ANOVA
The ANOVA p-value is significant (p < 0.05)
This test is appropriate when:
You’re comparing 3 or more groups
You want to compare every possible pair of group means
The sample sizes may be unequal (Tukey-Kramer adjusts for this)
Purpose of Tukey-Kramer
Identifies which specific pairs of group means differ (the ANOVA alone only says that at least one mean differs)
In R or output tables, you’ll usually see:
A mean difference for each group pair (e.g., A vs. B)
A confidence interval
A p-value
➤ Interpretation rules:
Output | Meaning |
---|---|
p < 0.05 | ✅ The two group means are significantly different |
p ≥ 0.05 | ❌ No significant difference between those two groups |
CI does not include 0 | ✅ Significant difference |
CI includes 0 | ❌ Not significant |
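One way to get this kind of output in Python is pairwise_tukeyhsd from statsmodels (a Tukey HSD routine that also handles unequal sample sizes, i.e., Tukey-Kramer); a sketch with made-up data:

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Made-up responses with their group labels (unequal sample sizes are fine)
values = np.array([4.1, 5.0, 4.8, 5.3, 6.2, 5.9, 6.5, 5.0, 4.7, 5.5, 5.2])
labels = np.array(["A", "A", "A", "A", "B", "B", "B", "C", "C", "C", "C"])

# All pairwise comparisons of group means at alpha = 0.05
result = pairwise_tukeyhsd(values, labels, alpha=0.05)
print(result)  # table of mean differences, adjusted p-values, and confidence intervals
```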
📊 Example Scenario: Tukey-Kramer
You test 3 diets: A, B, and C
ANOVA p-value = 0.003 → Significant → You do Tukey-Kramer
Tukey-Kramer results:
Group Pair | Mean Difference | p-value | 95% CI |
---|---|---|---|
A vs. B | 5.2 | 0.01 | [1.2, 9.2] |
A vs. C | 1.1 | 0.45 | [-2.0, 4.2] |
B vs. C | 4.1 | 0.03 | [0.4, 7.8] |
Interpretation:
✅ A vs. B and B vs. C are significantly different
❌ A vs. C is not significantly different
Assumptions of ANOVA
(1) Random samples
(2) Normal distributions for each population
(3) Equal variances for all populations
Meeting the assumptions of ANOVA
Normality can be assessed using the Shapiro-Wilk test; if the data are not normally distributed, they may be transformed (e.g., log-transformed) so that they meet the assumptions
Equal variance can be assessed using Levene's test
If the assumptions cannot be met, a Kruskal-Wallis test can be used
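These checks might look like the following sketch, using SciPy's shapiro and levene on made-up data:

```python
from scipy import stats

# Made-up data for three groups
group_a = [4.1, 5.0, 4.8, 5.3]
group_b = [6.2, 5.9, 6.5, 6.1]
group_c = [5.0, 4.7, 5.5, 5.2]

# Normality within each group: Shapiro-Wilk (p >= 0.05 -> no evidence against normality)
for name, g in [("A", group_a), ("B", group_b), ("C", group_c)]:
    stat, p = stats.shapiro(g)
    print(f"Shapiro-Wilk, group {name}: p = {p:.3f}")

# Equal variances across groups: Levene's test (p >= 0.05 -> no evidence variances differ)
stat, p = stats.levene(group_a, group_b, group_c)
print(f"Levene's test: p = {p:.3f}")
```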
Kruskal-Wallis test
Purpose: Compare 3 or more independent groups when data is not normal or assumptions for ANOVA are violated
Data Used: Ranks (not raw values); works with ordinal or non-normal data
H₀: All group medians are equal
Hₐ: At least one median differs
Interpretation:
p < 0.05 → Significant → At least one group differs
p ≥ 0.05 → Not significant → No evidence of difference
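A sketch using SciPy's kruskal on made-up, skewed data:

```python
from scipy import stats

# Made-up, skewed (non-normal) data for three groups
group_a = [1.2, 1.5, 2.0, 8.9]
group_b = [2.1, 2.4, 3.0, 12.5]
group_c = [1.0, 1.1, 1.6, 2.2]

# Kruskal-Wallis: rank-based comparison of 3+ independent groups
h_stat, p_value = stats.kruskal(group_a, group_b, group_c)

if p_value < 0.05:
    print(f"H = {h_stat:.2f}, p = {p_value:.4f}: at least one group differs")
else:
    print(f"H = {h_stat:.2f}, p = {p_value:.4f}: no evidence of a difference")
```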
Dunn test
Nonparametric equivalent of the Tukey-Kramer test
• Tests all the pairwise differences between groups
H₀: Median of group i = median of group j (for each pair of groups compared)
Hₐ: Median of group i ≠ median of group j
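SciPy itself does not include a Dunn test; one option (an assumption about your environment) is the third-party scikit-posthocs package, sketched here with made-up data:

```python
import scikit_posthocs as sp  # third-party package: scikit-posthocs (assumed installed)

# Made-up non-normal data for three groups (same as the Kruskal-Wallis sketch)
group_a = [1.2, 1.5, 2.0, 8.9]
group_b = [2.1, 2.4, 3.0, 12.5]
group_c = [1.0, 1.1, 1.6, 2.2]

# Dunn test: all pairwise rank-based comparisons, with a p-value adjustment
p_values = sp.posthoc_dunn([group_a, group_b, group_c], p_adjust="bonferroni")
print(p_values)  # matrix of pairwise p-values (groups labelled 1, 2, 3)
```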
Fixed effects
An explanatory variable with groups that are predetermined and of direct interest (the thing we really want to study)