Module 7: Repeated-Measures ANOVA

Partitioning Variance

The total variability in a sample, measured by SS Total, is composed of variability between conditions and variability within conditions. The variability between conditions (SS Between) reflects the effect of the experimental manipulation. The remaining variability within conditions (SS Within) is further divided into variability due to individual differences (SS Subjects) and random error (SS Error).

In a repeated measures design, individual differences are controlled for, since the same participants take part in all conditions. As a result, SS Subjects is removed from the error term, reducing the amount of unrelated or "bad" variability. This leads to greater statistical power: with a smaller error term, a true effect of the manipulation is easier to detect.

Sums of Squares in Repeated Measures ANOVA

1. SS Total – Total variability in all scores

What it measures: How much all scores vary from the grand mean.
Formula:

SS_{Total} = \sum (X_{ij} - \bar{X}_{..})^2

Steps:

  • Find the grand mean (average of all scores).

  • Subtract it from each score, square the result.

  • Add all squared differences.
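
Worked example (hypothetical scores, reused throughout this module): two participants are each measured in two conditions, P1 = (4, 6) and P2 = (8, 12). The grand mean is 30 / 4 = 7.5, so

SS_{Total} = (4 - 7.5)^2 + (6 - 7.5)^2 + (8 - 7.5)^2 + (12 - 7.5)^2 = 12.25 + 2.25 + 0.25 + 20.25 = 35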

2. SS Between (Conditions) – Variability due to the IV

What it measures: How much the condition means deviate from the grand mean, weighted by number of participants.
Formula:

SS_{Between} = \sum n(\bar{X}_{i.} - \bar{X}_{..})^2

Steps:

  • Calculate the mean for each condition.

  • Subtract the grand mean, square the result.

  • Multiply each squared difference by the number of participants (n).

  • Add them all up.
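
Worked example (same hypothetical scores): the condition means are 6 and 9, the grand mean is 7.5, and n = 2, so

SS_{Between} = 2(6 - 7.5)^2 + 2(9 - 7.5)^2 = 4.5 + 4.5 = 9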

3. SS Subjects – Variability due to individual participants

What it measures: How much each participant’s average differs from the grand mean, weighted by number of conditions.
Formula:

SS_{Subjects} = \sum k(\bar{X}_{.j} - \bar{X}_{..})^2

Steps:

  • Calculate the average for each participant.

  • Subtract the grand mean, square the result.

  • Multiply each squared difference by the number of conditions (k).

  • Add them all up.
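
Worked example (same hypothetical scores): the participant means are 5 and 10, and k = 2, so

SS_{Subjects} = 2(5 - 7.5)^2 + 2(10 - 7.5)^2 = 12.5 + 12.5 = 25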

4. SS Within – Remaining variation not due to condition effect

What it measures: All variability not explained by the IV.
Formula:

SS_{Within} = SS_{Total} - SS_{Between}

Steps:

  • Subtract SS Between from SS Total.
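
Worked example (same hypothetical scores): SS_{Within} = 35 - 9 = 26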

5. SS Error – Residual noise

What it measures: What’s left after accounting for both the condition and participant effects.
Formula:

SS_{Error} = SS_{Within} - SS_{Subjects}

Steps:

  • Subtract SS Subjects from SS Within.
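
Worked example (same hypothetical scores): SS_{Error} = 26 - 25 = 1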

Degrees of Freedom

1. df Total

This represents the number of independent pieces of information available to estimate variance: all N scores, minus one because the deviations from the grand mean must sum to zero.

Formula:

df_{Total} = N - 1

Where:

  • N = total number of observations (all participants × all conditions)

2. df Between (Conditions)

This represents the number of levels of the independent variable (conditions) minus 1.

Formula:

df_{Between} = k - 1

Where:

  • k = number of conditions (groups)

3. df Subjects

This represents the number of participants minus 1.

Formula:

df_{Subjects} = n - 1

Where:

  • n = number of participants

4. df Within

This is the within degrees of freedom and represents the remaining variability after accounting for between variability.

Formula:

df_{Within} = N - k


5. df Error

This is the error degrees of freedom and represents the remaining variability after accounting for both between and subjects variability.

Formula:

df_{Error} = df_{Within} - df_{Subjects}
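
Worked example (same hypothetical 2 × 2 design, so n = 2, k = 2, N = 4): df_{Total} = 3, df_{Between} = 1, df_{Subjects} = 1, df_{Within} = 2, df_{Error} = 1. Note that df_{Error} also equals (n - 1)(k - 1), which is the form used in the next section.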

Mean Squares and F-Ratio

1. Mean Square Between (MS Between)

This is the variance due to differences between the conditions (IV).
Formula:

MS_{Between} = \frac{SS_{Between}}{df_{Between}}

Where:

  • SS_{Between} = Sum of squares between conditions

  • df_{Between} = k − 1, where k is the number of conditions

2. Mean Square Subjects (MS Subjects)

This is the variance due to individual differences between participants.
Formula:

MS_{Subjects} = \frac{SS_{Subjects}}{df_{Subjects}}

Where:

  • SS_{Subjects} = Sum of squares due to subjects

  • df_{Subjects} = n − 1, where n is the number of participants

3. Mean Square Error (MS Error; sometimes labelled MS Within)

This is the variance left unexplained by the condition and participant effects (residual error); it is the error term for the repeated-measures F-ratio.
Formula:

MS_{Error} = \frac{SS_{Error}}{df_{Error}}

Where:

  • SS_{Error} = Sum of squares error (SS Within − SS Subjects)

  • df_{Error} = (n − 1) × (k − 1), where n is the number of participants and k is the number of conditions

4. F-ratio

This is the ratio used to determine if the observed differences between conditions are statistically significant.

Formula:

F = \frac{MS_{Between}}{MS_{Error}}

Where:

  • MS_{Between} = Mean square between (from above)

  • MS_{Error} = Mean square error (from above)

Mean Squares are calculated to eliminate the bias associated with the number of scores used to calculate the Sum of Squares. Because the SS calculations are based on summed values, they are influenced by how many scores are included in the sum — the more scores, the larger the SS will be, even if the actual variability is the same. To standardise these SS values and make them comparable, we divide each SS by its degrees of freedom. The average SS is what we call MS.
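
To tie the pieces together, here is a minimal Python sketch that reproduces the hand calculations for the hypothetical 2 × 2 dataset used in the worked examples above. It simply implements the formulas in this module; it is not a general RM ANOVA routine.

```python
# Minimal sketch: reproduces this module's worked example by hand.
# The scores are the hypothetical 2-participant x 2-condition data.
import numpy as np
from scipy import stats

scores = np.array([[4.0, 6.0],    # participant 1: condition 1, condition 2
                   [8.0, 12.0]])  # participant 2: condition 1, condition 2
n, k = scores.shape               # n participants (rows), k conditions (columns)
grand_mean = scores.mean()

ss_total = ((scores - grand_mean) ** 2).sum()
ss_between = n * ((scores.mean(axis=0) - grand_mean) ** 2).sum()   # condition means
ss_subjects = k * ((scores.mean(axis=1) - grand_mean) ** 2).sum()  # participant means
ss_within = ss_total - ss_between
ss_error = ss_within - ss_subjects

df_between = k - 1
df_error = (n - 1) * (k - 1)

ms_between = ss_between / df_between
ms_error = ss_error / df_error
f_ratio = ms_between / ms_error
p_value = stats.f.sf(f_ratio, df_between, df_error)  # right-tail area of F(df_between, df_error)

print(ss_total, ss_between, ss_subjects, ss_within, ss_error)  # 35.0 9.0 25.0 26.0 1.0
print(f_ratio, p_value)                                        # F(1, 1) = 9.0
```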

Assumptions for RM ANOVA

Sphericity: the assumption that the variances of the difference scores between every pair of experimental conditions are roughly equal; in other words, that the level of dependence between experimental conditions is roughly equal

  • Falls under the banner of Compound Symmetry

  • Compound Symmetry occurs when both the variances across conditions are equal, and when the covariances between pairs of conditions are equal

  • Compound Symmetry assumes that the variation within experimental conditions is fairly similar and that no two conditions are any more dependent or related than any other two

While compound symmetry is not a required condition for the one-way RM ANOVA, sphericity is.

SPSS tests the severity of departures from sphericity using Mauchly’s test

  • Tests the hypothesis that the variances of the differences between conditions are equal

  • If the test statistic is significant, the assumption has been violated

  • This test is affected by sample size: in very large samples it flags even small deviations as significant, while in very small samples it can miss large deviations
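
In practice the test is read off software output. The sketch below assumes the third-party pingouin package (its sphericity function runs Mauchly's test on long-format data); the column names and scores are hypothetical:

```python
# Sketch only: Mauchly's test of sphericity via the pingouin package
# (assumed installed). Column names and scores are hypothetical.
import pandas as pd
import pingouin as pg

# Long-format data: 4 participants, each measured in 3 conditions.
df = pd.DataFrame({
    "subject":   [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "condition": ["A", "B", "C"] * 4,
    "score":     [5, 7, 9, 4, 6, 10, 6, 8, 8, 5, 9, 11],
})

# Returns Mauchly's W with an approximate chi-square test;
# a significant p-value indicates the sphericity assumption is violated.
result = pg.sphericity(data=df, dv="score", within="condition", subject="subject")
print(result)
```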

If Mauchly’s test is significant (the sphericity assumption is broken):

1. Greenhouse-Geisser (GG)

  • Adjusts df downward using ε

  • More conservative (safe for strong violations)

  • Use if ε < 0.75

2. Huynh-Feldt (HF)

  • Also adjusts df with ε

  • Less conservative, more power

  • Use if ε > 0.75

3. Friedman Test (non-parametric alt)

  • Use if violation is extreme
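
A minimal sketch of the Friedman test using SciPy, with hypothetical scores for three conditions measured on the same participants (each list is one condition, in the same participant order):

```python
# Friedman test: non-parametric alternative to the one-way RM ANOVA.
from scipy import stats

cond_a = [5, 4, 6, 5]    # condition A scores, participants 1-4
cond_b = [7, 6, 8, 9]    # condition B scores, same participants, same order
cond_c = [9, 10, 8, 11]  # condition C scores

chi2, p = stats.friedmanchisquare(cond_a, cond_b, cond_c)
print(chi2, p)  # a significant p suggests at least one condition differs
```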