BioStat Notes (Wednesday)
ANOVA Notes: Between- vs Within-Treatment Variation (MRUS example)
Purpose: Test whether three treatments differ by comparing the variation among treatment means (between-treatment variation) with the variability of observations inside each treatment (within-treatment variation).
Central question: Which source of variation is more informative for detecting treatment effects? Answer: between treatments, judged relative to the within-treatment variation.
Intuition: Large between-treatment variation suggests treatment means are far from the grand mean, indicating treatment effects; large within-treatment variation suggests observations within each treatment are dispersed, reducing our ability to detect differences.
Framework: ANOVA (Analysis of Variance) decomposition of total variability into between- and within-treatment components, leading to an F-test.
Context note: Earlier slides referenced the synchrony regression model and brute-force decomposition; this follows the same SS decomposition logic applied to treatment groups.
Key concepts and formulas
Grand mean:
\bar{y}_{..} = \frac{1}{N}\sum_{i=1}^k\sum_{j=1}^{n_i} y_{ij}
Treatment means:
\bar{y}_{i.} = \frac{1}{n_i}\sum_{j=1}^{n_i} y_{ij},\quad i=1,\dots,k
Total Sum of Squares (SST): measures overall variability around the grand mean
\text{SST} = \sum_{i=1}^k\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{..})^2
Within-Group (Error) Sum of Squares (SSW): variability within each treatment
\text{SSW} = \sum_{i=1}^k\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i.})^2
Between-Group (Treatment) Sum of Squares (SSB): variability of treatment means around the grand mean
\text{SSB} = \sum_{i=1}^k n_i\, (\bar{y}_{i.} - \bar{y}_{..})^2
Fundamental identity (ANOVA decomposition):
\text{SST} = \text{SSB} + \text{SSW}
Number of observations and groups:
Total observations: N = \sum_{i=1}^k n_i
Number of treatments (groups): k
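Illustration (not from the lecture): a minimal Python sketch of the definitions above, showing numerically that SST = SSB + SSW. The three groups are made-up toy values, not the MRUS data, and the variable names are chosen here for illustration only.

```python
# Toy data only: three hypothetical treatment groups, NOT the MRUS values.
groups = {
    0: [4.0, 6.0, 5.0],
    1: [1.0, 2.0, 3.0],
    2: [8.0, 9.0, 7.0, 8.0],
}


def mean(values):
    return sum(values) / len(values)


all_obs = [y for ys in groups.values() for y in ys]
N = len(all_obs)                      # total observations
k = len(groups)                       # number of treatments
grand_mean = mean(all_obs)            # y-bar_..
group_means = {i: mean(ys) for i, ys in groups.items()}  # y-bar_i.

# Sums of squares, term by term as in the formulas above.
sst = sum((y - grand_mean) ** 2 for y in all_obs)
ssw = sum((y - group_means[i]) ** 2 for i, ys in groups.items() for y in ys)
ssb = sum(len(ys) * (group_means[i] - grand_mean) ** 2 for i, ys in groups.items())

print(f"N={N}, k={k}, grand mean={grand_mean:.3f}")
print(f"SST={sst:.3f}  SSB={ssb:.3f}  SSW={ssw:.3f}  SSB+SSW={ssb + ssw:.3f}")
```

Unequal group sizes are used on purpose: the n_i weight in SSB is what makes the identity hold when the design is unbalanced.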
Degrees of freedom (df):
Total df: \text{df}_{T} = N - 1
Between-treatment df: \text{df}_{B} = k - 1
Within-treatment df: \text{df}_{W} = N - k
Mean Squares (MS):
\text{MS}_{B} = \frac{\text{SSB}}{\text{df}_{B}} = \frac{\text{SSB}}{k-1}
\text{MS}_{W} = \frac{\text{SSW}}{\text{df}_{W}} = \frac{\text{SSW}}{N-k}
F-statistic (testing H0: all μi are equal): F = \frac{\text{MS}_{B}}{\text{MS}_{W}}
Under H0, F follows an F-distribution with degrees of freedom $(\text{df}_{B}, \text{df}_{W}) = (k-1, N-k)$
Decision rule: Reject H0 if the observed F exceeds the critical value from the F_{k-1,N-k} distribution at the chosen significance level.
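Illustration (not from the lecture): a short Python sketch of the step from sums of squares to the F statistic and the decision rule. The SSB and SSW values are hypothetical, chosen only so the arithmetic is easy to follow. N = 21 and k = 3 match the worked example, and the critical value 3.55 is the one quoted in these notes for (2, 18) df. The function name `anova_f` is made up for this sketch.

```python
def anova_f(ssb, ssw, n_total, n_groups):
    """Return (df_B, df_W, MSB, MSW, F) for a one-way ANOVA."""
    df_b = n_groups - 1
    df_w = n_total - n_groups
    ms_b = ssb / df_b
    ms_w = ssw / df_w
    return df_b, df_w, ms_b, ms_w, ms_b / ms_w


# Hypothetical sums of squares (NOT the values behind SST = 2177).
df_b, df_w, ms_b, ms_w, f_stat = anova_f(ssb=120.0, ssw=90.0, n_total=21, n_groups=3)

f_critical = 3.55  # quoted in the notes for (2, 18) df
reject_h0 = f_stat > f_critical

print(f"df=({df_b}, {df_w})  MSB={ms_b:.2f}  MSW={ms_w:.2f}  F={f_stat:.2f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```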
Worked example (given numbers)
Sample size and groups:
Total observations: N = 21
Treatments (groups): k = 3
Degrees of freedom: \text{df}_{T} = 20, \; \text{df}_{B} = 2, \; \text{df}_{W} = 18
Sum of Squares (numerical values provided in the transcript):
Total Sum of Squares (SST): \text{SST} = 2177
Between-treatments SS (SSB) and Within-treatments SS (SSW) are computed from data (not explicitly listed in the transcript) but are used to form MSB and MSW.
Backed by SAS output: MS between and MS within are used to form F = MSB / MSW, which follows an F distribution with (2,18) df.
Critical value and conclusion:
F critical at the chosen level: approximately F_{2,18}^{\alpha} = 3.55 (from the transcript); this matches the tabulated value for \alpha = 0.05.
Observed F statistic exceeds the critical value (as stated in the transcript), leading to rejection of the null hypothesis that all treatment means are equal.
Therefore, there is a significant difference among at least two treatment means for the MRUS response variable.
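Illustration (not from the lecture): a Python check of where a critical value like 3.55 comes from. It relies on a special case: with exactly 2 numerator df (k = 3 groups), the upper-tail probability of the F distribution has the closed form P(F > f) = (1 + 2f/df_W)^(-df_W/2), which can be inverted directly. This shortcut does not apply for other numerator df, where tables or statistical software are needed. The function names are made up for this sketch.

```python
def f_upper_tail_2df(f_value, df_w):
    """P(F > f_value) for an F distribution with (2, df_w) degrees of freedom."""
    return (1.0 + 2.0 * f_value / df_w) ** (-df_w / 2.0)


def f_critical_2df(alpha, df_w):
    """Value c with P(F > c) = alpha, for (2, df_w) degrees of freedom."""
    return (df_w / 2.0) * (alpha ** (-2.0 / df_w) - 1.0)


crit = f_critical_2df(alpha=0.05, df_w=18)
print(round(crit, 2))                      # 3.55
print(round(f_upper_tail_2df(crit, 18), 4))  # 0.05
```

The same upper-tail function gives a p-value for any observed F with (2, 18) df, which is the quantity reported next to F in software output.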
Understanding the intuition behind the F-test in this context
If between-treatment variation is large, the F statistic increases, making it more likely to reject H0.
If within-treatment variation is large, it can mask between-treatment differences, lowering the F statistic.
The F-statistic compares the size of the between-group signal (how much group means differ) to the noise within groups (how spread out observations are within each treatment).
In formula form, larger MSB relative to MSW drives larger F and stronger evidence against H0.
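Illustration (not from the lecture): a small Python experiment for the intuition above. It keeps the group means fixed and stretches every observation's deviation from its own group mean by a factor of 3. SSB is unchanged, SSW grows by 3^2 = 9, so F shrinks by a factor of 9. The data are toy values and the helper names are made up for this sketch.

```python
def f_statistic(groups):
    """One-way ANOVA F = MSB / MSW for a list of groups (lists of floats)."""
    all_obs = [y for ys in groups for y in ys]
    n_total, k = len(all_obs), len(groups)
    grand = sum(all_obs) / n_total
    means = [sum(ys) / len(ys) for ys in groups]
    ssb = sum(len(ys) * (m - grand) ** 2 for ys, m in zip(groups, means))
    ssw = sum((y - m) ** 2 for ys, m in zip(groups, means) for y in ys)
    return (ssb / (k - 1)) / (ssw / (n_total - k))


def spread_out(groups, factor):
    """Scale each observation's deviation from its own group mean by `factor`."""
    stretched = []
    for ys in groups:
        m = sum(ys) / len(ys)
        stretched.append([m + factor * (y - m) for y in ys])
    return stretched


tight = [[4.0, 5.0, 6.0], [1.0, 2.0, 3.0], [7.0, 8.0, 9.0]]  # toy data
noisy = spread_out(tight, 3.0)  # same group means, 3x the within-group spread

f_tight = f_statistic(tight)
f_noisy = f_statistic(noisy)
print(f"F (tight groups) = {f_tight:.1f}")
print(f"F (noisy groups) = {f_noisy:.1f}")
```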
Connecting to practical steps you’d take (hand calculations and software)
Hand calculation outline (as described in the transcript):
1) Compute grand mean \bar{y}_{..} and treatment means \bar{y}_{i.}.
2) Compute SST, SSB, SSW using the formulas above.
3) Compute degrees of freedom: df_T = N-1, df_B = k-1, df_W = N-k.
4) Compute MSB = SSB/(k-1) and MSW = SSW/(N-k).
5) Compute F = MSB / MSW and compare to F_{k-1,N-k}.
Software workflow (as described): SAS GLM (or similar) analysis
Data format: two columns: treatment indicator (0,1,2 for three treatments) and response (e.g., MRUS difference).
PROC: GLM or similar to model response ~ treatment.
Output includes:
Total observations: N
Levels of treatment: k
Model (between-treatment) SS, error (within) SS, and total SS
Degrees of freedom for model and error
Mean squares, F-statistic, and p-value
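Illustration (not from the lecture): a rough Python analogue of the workflow above, starting from the two-column layout (treatment code, response) and ending with a Model / Error / Total table. This is not SAS code and does not reproduce SAS output formatting. The rows are hypothetical values, not the MRUS data. Note that the codes 0, 1, 2 are treated as group labels, not as a numeric predictor.

```python
from collections import defaultdict

# Hypothetical long-format rows: (treatment code, response). NOT the MRUS data.
rows = [
    (0, 2.0), (0, 4.0), (0, 3.0),
    (1, -5.0), (1, -7.0), (1, -6.0),
    (2, -4.0), (2, -2.0), (2, -3.0),
]

# Group responses by treatment code (codes act as category labels).
by_treatment = defaultdict(list)
for code, response in rows:
    by_treatment[code].append(response)

n_total = len(rows)
k = len(by_treatment)
grand = sum(r for _, r in rows) / n_total
means = {c: sum(v) / len(v) for c, v in by_treatment.items()}

ss_model = sum(len(v) * (means[c] - grand) ** 2 for c, v in by_treatment.items())
ss_error = sum((y - means[c]) ** 2 for c, v in by_treatment.items() for y in v)
ss_total = sum((r - grand) ** 2 for _, r in rows)

ms_model = ss_model / (k - 1)
ms_error = ss_error / (n_total - k)
f_value = ms_model / ms_error

print(f"Observations: {n_total}   Treatment levels: {k}")
print(f"{'Source':<8}{'DF':>4}{'SS':>10}{'MS':>10}{'F':>10}")
print(f"{'Model':<8}{k - 1:>4}{ss_model:>10.2f}{ms_model:>10.2f}{f_value:>10.2f}")
print(f"{'Error':<8}{n_total - k:>4}{ss_error:>10.2f}{ms_error:>10.2f}")
print(f"{'Total':<8}{n_total - 1:>4}{ss_total:>10.2f}")
```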
Interpretation from SAS output:
There is an overall significant difference among the three treatments for the MRUS response.
The ANOVA test tells you only that at least two treatments differ; it does not specify which pairs.
Interpreting treatment means and their significance (post-ANOVA insights)
Treatment means and standard errors are reported to show the size and direction of each treatment's effect.
P-values from the t-test on each treatment mean indicate whether that treatment's mean response is significantly different from zero (a nonzero effect).
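Illustration (not from the lecture): a Python sketch of a t statistic for each treatment mean against zero. It assumes the standard error is built from the pooled within-treatment mean square, SE_i = sqrt(MSW / n_i), with N - k df. That is one common convention and is an assumption here, since the notes do not say how the software computed the standard errors. The data are toy values shaped to mimic the pattern described (one mean near zero, two clearly negative), and `mean_t_stats` is a made-up helper name.

```python
import math


def mean_t_stats(groups):
    """Return {label: (mean, se, t)} using the pooled within-group mean square."""
    n_total = sum(len(v) for v in groups.values())
    k = len(groups)
    means = {g: sum(v) / len(v) for g, v in groups.items()}
    ssw = sum((y - means[g]) ** 2 for g, v in groups.items() for y in v)
    ms_w = ssw / (n_total - k)          # pooled variance estimate, N - k df
    stats = {}
    for g, v in groups.items():
        se = math.sqrt(ms_w / len(v))   # assumed convention: SE_i = sqrt(MSW / n_i)
        stats[g] = (means[g], se, means[g] / se)
    return stats


# Toy data only, NOT the MRUS differences.
toy = {
    0: [1.0, -1.0, 0.5, -0.5],
    1: [-6.0, -5.0, -7.0, -6.0],
    2: [-4.0, -3.0, -5.0, -4.0],
}

stats = mean_t_stats(toy)
for label, (m, se, t) in stats.items():
    print(f"treatment {label}: mean={m:6.2f}  SE={se:.3f}  t={t:7.2f}")
```

Each t value would be compared with a t distribution on N - k df (9 here) to get the p-value. A t near zero corresponds to a large p-value, and a large negative t to a small one.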
Findings from the transcript:
Treatment 0 (selective shunt) did not show a statistically significant effect (p-value not small).
Treatments 1 and 2 (non-selective shunts) showed very small p-values, indicating significant nonzero treatment effects.
Practical interpretation of the MRUS response sign:
Positive MRUS difference means MRUS increased after the treatment.
Negative MRUS difference means MRUS decreased after the treatment.
Specific pattern observed in the example:
Non-selective shunts (treatments 1 and 2) are associated with a decrease in MRUS after surgery (negative mean differences) and are statistically significant.
The selective shunt (treatment 0) did not show a significant change in MRUS.
Key takeaways and connections
Conceptual takeaway: In ANOVA with multiple groups, the between-treatment variation is the primary driver for detecting differences across treatments; large between-group differences relative to within-group variability lead to significant results.
Relationship to prior material: This builds on the same SS decomposition framework discussed in earlier lectures (e.g., the synchrony regression model and the brute-force decomposition) and mirrors the familiar ANOVA structure: SST = SSB + SSW with df_T = N-1, df_B = k-1, df_W = N-k.
Practical implications: The SAS/GLM output provides a convenient, direct way to obtain SS, MS, F, and p-values; the treatment means with their standard errors help interpret which groups differ and in what direction the effect lies, aiding clinical or practical recommendations.
Ethical, philosophical, and practical implications
Statistical significance does not imply clinical significance; examine effect sizes (means, differences, and confidence intervals) to assess practical impact.
Report both the overall test (ANOVA) and the post-hoc or treatment-mean interpretations to avoid overstating results.
Be transparent about assumptions: independence, normality, and homogeneity of variances; violations may affect F-test validity and may require alternative methods.