Repeated-Measures Designs and Analyses
Between-Subjects vs. Within-Subjects Designs
Between-Subjects Design:
- Also known as an independent subjects design.
- Different groups of people are assigned to different levels of the independent variable.
- Comparisons are made between different groups.
- Example: Exposing 100 people to the experimental condition requires 200 subjects (100 for experimental, 100 for control).
Within-Subjects Design:
- The same group is assessed multiple times under different conditions.
- The dependent variable is measured multiple times with manipulation of an independent variable in between.
- Measurements can be separated by time, a procedure, or a life event.
Advantages of Within-Subjects Designs
Increased Statistical Power:
- Fewer subjects are needed since individuals are compared to themselves.
- Eliminates the problem of variability due to different people across groups.
Mixed Design:
- Combines within-subjects and between-subjects factors.
Schematic of Between-Subjects Design
- Two separate groups of people (naturally occurring or randomly assigned).
- Manipulation of the independent variable.
- One group receives the experimental level, and the other receives the control level.
- Observation and measurement of the dependent variable in both groups.
- Hypothesis testing via differences between groups.
- Requires a large number of subjects.
ANOVA and the F Ratio (Between-Subjects)
- The F ratio compares two estimates of population variance:
- Variance based on differences between group means (systematic variation + chance variation).
- Variance based on average variances within each group.
- If the manipulation has no effect, both estimates reflect only random variation.
- If there's an effect, systematic variation is included in the numerator, increasing the F ratio.
- The systematic component depends on how much the group means vary around the grand mean.
- Large mean differences lead to a large systematic variance component.
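The between-subjects F ratio described above can be sketched in a few lines of Python. The data are made up for illustration; the logic is simply the two variance estimates from the notes (between-group means in the numerator, average within-group variance in the denominator).

```python
from statistics import mean, variance

# Hypothetical scores for three independent groups (made-up data).
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]

k = len(groups)     # number of groups
n = len(groups[0])  # subjects per group (equal n assumed)
grand_mean = mean(x for g in groups for x in g)

# Numerator: variance estimate based on differences between group means.
ms_between = n * sum((mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)

# Denominator: average of the within-group (sample) variances.
ms_within = mean(variance(g) for g in groups)

f_ratio = ms_between / ms_within
```

Because the group means here differ widely relative to the within-group spread, the systematic component dominates and the F ratio is large.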
Within-Subjects Design Schematic
- A single group of people.
- All participants receive the control level of the manipulation, followed by observation.
- Experimental treatment is then administered.
- The same group is measured again on the same dependent variable.
- Hypotheses are tested by looking at differences between conditions within the same group.
Statistical Power in Within-Subjects Designs
- Increased power; fewer subjects needed.
- Example: If 100 people are desired in the experimental group, only 100 participants are needed in total.
Repeated Measures ANOVA
- Greater sensitivity to treatment differences due to eliminating a major source of unsystematic error variance (different people in each condition).
- Each participant acts as their own experimental control.
- Systematic change due to the independent variable is more directly observable.
- Systematic variation can be partialed out of the within-participant variation, isolating the systematic and error components.
- Less noise when looking for a signal.
- Evaluating systematic error requires restructuring the analysis with a new sums of squares calculation.
Sums of Squares Within Participant
- A new variance component that encapsulates how individuals vary across conditions.
- Formula: $SS_{\text{within participant}} = \sum_{i=1}^{n}\sum_{j=1}^{k}(x_{ij} - \bar{x}_i)^2$, where:
- $x_{ij}$ is the score of participant $i$ in condition $j$.
- $\bar{x}_i$ is the mean score for participant $i$ across all conditions.
- $n$ is the number of participants.
- $k$ is the number of conditions.
- Each participant has a mean across different conditions.
- Calculate differences between individual scores and their means, square the differences, and add them up across all participants.
- Degrees of freedom: $df = n(k - 1)$, where $n$ is the number of participants and $k$ is the number of experimental conditions.
- Reflects the extent to which individuals vary across conditions and includes both systematic and unsystematic factors.
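The within-participant sums of squares can be computed directly from the definition above. This is a minimal sketch with hypothetical ratings (rows are participants, columns are conditions):

```python
from statistics import mean

# Hypothetical ratings: rows are participants, columns are conditions.
scores = [
    [1, 2, 3],  # participant A
    [2, 3, 4],  # participant B
    [3, 4, 5],  # participant C
]

def ss_within_participant(scores):
    """Sum of squared deviations of each score from that participant's mean."""
    total = 0
    for row in scores:
        m = mean(row)  # this participant's mean across conditions
        total += sum((x - m) ** 2 for x in row)
    return total

n = len(scores)          # number of participants
k = len(scores[0])       # number of conditions
df_within = n * (k - 1)  # degrees of freedom: n(k - 1)
```

Each participant contributes $k - 1$ degrees of freedom, which is why the total is $n(k - 1)$.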
Between-Groups Differences in Within-Subjects Design
- Measures the systematic differences within people across conditions.
- Provides a cleaner measure of any effect that might be present.
- Sums of squares for the model are free of unsystematic error.
- Partialing out the sums of squares for the independent variable from the sums of squares within participant yields a new measure of unsystematic error variance.
Building a New Error Term
- Calculate sums of squares for the model as before but considering how people systematically change across those conditions.
- Partial the between-conditions (model) sums of squares out of the within-participant sums of squares; what is left over becomes the residual (error) term.
- Degrees of freedom for error: degrees of freedom for within-participant minus degrees of freedom for the model.
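The partialing step can be sketched numerically. The made-up data below happen to change perfectly systematically across conditions (every participant goes up by one each time), so the residual comes out to zero, which previews the "all systematic variation" scenario described next:

```python
from statistics import mean

# Hypothetical data: rows = participants, columns = conditions.
scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
n, k = len(scores), len(scores[0])

# Within-participant SS: deviations of scores from each participant's mean.
ss_within = sum((x - mean(row)) ** 2 for row in scores for x in row)

# Model (between-conditions) SS: condition means around the grand mean.
cond_means = [mean(col) for col in zip(*scores)]
grand_mean = mean(cond_means)
ss_model = n * sum((m - grand_mean) ** 2 for m in cond_means)

# Partial the model out of the within-participant SS to get the residual.
ss_residual = ss_within - ss_model
df_model = k - 1
df_residual = n * (k - 1) - df_model
```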
Schematic Example: Systematic vs. Non-Systematic Variation
- Scenario 1: All Systematic Variation
- Participants A, B, and C score 1, 2, and 3 at times 1, 2, and 3, respectively.
- Condition means differ systematically across the three conditions.
- Scenario 2: All Non-Systematic Variation
- Participants A, B, and C score differently at times 1, 2, and 3 without a systematic pattern.
- Condition means are identical across all conditions.
Calculation of Sums of Squares
- Sums of squares within participant involves getting each person's mean, subtracting it from their scores across conditions, squaring the differences, and summing across participants.
- Sums of squares between groups involves taking condition means, getting the grand mean, subtracting the grand mean from condition means, and squaring and summing those differences.
Steps for Repeated Measures ANOVA
- Calculate sums of squares within participant.
- Calculate between-groups/conditions sums of squares (using condition means).
- Subtract the sums of squares for the model from the sums of squares within participant to get the new error term.
- Divide sums of squares by corresponding degrees of freedom.
- Divide effect mean squares by error mean squares to get F ratios.
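The steps above can be run end to end on a small hypothetical data set (the numbers are invented, with a little noise so the residual is nonzero):

```python
from statistics import mean

# Hypothetical ratings (rows = participants, columns = conditions).
scores = [
    [2, 4, 6],
    [3, 4, 8],
    [1, 5, 6],
    [2, 5, 7],
]
n, k = len(scores), len(scores[0])

# Step 1: within-participant sums of squares.
ss_within = sum((x - mean(row)) ** 2 for row in scores for x in row)

# Step 2: between-conditions (model) sums of squares from condition means.
cond_means = [mean(col) for col in zip(*scores)]
grand_mean = mean(x for row in scores for x in row)
ss_model = n * sum((m - grand_mean) ** 2 for m in cond_means)

# Step 3: residual error = within-participant SS minus model SS.
ss_error = ss_within - ss_model

# Steps 4-5: mean squares and the F ratio.
df_model = k - 1
df_error = n * (k - 1) - df_model
ms_model = ss_model / df_model
ms_error = ss_error / df_error
f_ratio = ms_model / ms_error
```

Because the people-in-different-conditions noise has been removed from the error term, even this tiny sample yields a large F ratio.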
Example: Aid Attribution Based on Claimant Need
- Participants read information about claimants in need of a donor organ.
- Reasons for need fell into four categories:
- Internal Controllable: the person did something under their control to cause their need.
- Example: Despite his doctor's repeated warnings about the damaging effects on his health and the probability of severe organ damage, this person continued to eat high cholesterol foods, smoke, and not exercise. As a result, he now has severe organ failure.
- Internal Uncontrollable: a genetically defective organ caused the need.
- External Controllable: an employer knowingly exposed the patient to a chemical.
- External Uncontrollable: the person took a medication with an unknown side effect that caused organ failure.
- Participants rated the deservingness of each claimant on a scale (1-7).
- Expected outcome: Internal controllable claimants are seen as least deserving, and external uncontrollable claimants are seen as most deserving.
ANOVA Output Interpretation
- Higher scores reflect greater perceptions of deservingness.
- Differences across conditions may be relatively small but meaningful.
Sphericity
- Sphericity is an assumption about the structure of the error variance; when it holds, the analysis yields a cleaner measure of non-systematic variation, making the effect easier to isolate and size.
- Similar to homogeneity of variance in between-subjects design.
- Refers to the equality of variances of the differences between treatment levels.
- If sphericity holds, variability across people is relatively uniform across levels of the independent variable.
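Sphericity can be illustrated by computing the variances of the pairwise difference scores directly. This sketch uses hypothetical data; sphericity would imply the three variances below are roughly equal:

```python
from itertools import combinations
from statistics import variance

# Hypothetical scores: rows = participants, columns = three conditions.
scores = [[2, 4, 6], [3, 4, 8], [1, 5, 6], [2, 5, 7]]
conditions = list(zip(*scores))  # one tuple of scores per condition

# Sphericity concerns the variances of the difference scores between
# every pair of treatment levels; ideally these are roughly equal.
diff_variances = {
    (a, b): variance([x - y for x, y in zip(conditions[a], conditions[b])])
    for a, b in combinations(range(len(conditions)), 2)
}
```

Large discrepancies among these variances are what Mauchly's test detects.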
Assessing Sphericity
- SPSS automatically evaluates sphericity and reports whether it has been violated.
- Mauchly's test helps decide whether the sphericity assumption is met.
- You do not want this test to be significant; a significant result indicates that the variances of the difference scores between conditions differ significantly.
- If the assumption is not met, be cautious.
- Corrections such as Greenhouse-Geisser adjust the degrees of freedom.
What To Do if Sphericity Is Violated
- If the test for sphericity in SPSS comes back significant, researchers often use Greenhouse-Geisser, Huynh-Feldt, or lower bound corrections.
- These adjust the degrees of freedom and mean squares to account for the violation.
Contrasts and Repeated Measures Designs
- Statistical significance of contrasts is particularly sensitive to sphericity.
- If the sphericity assumption is violated, use caution when doing follow-up analyses and when relying on an omnibus error term.
- An omnibus error term can be too liberal for some comparisons and too conservative for others, because it reflects variability pooled across all conditions.
Effect Size: Partial Eta Squared
- Partial eta squared is typically reported rather than eta squared.
- Partial eta squared expresses the effect's sums of squares relative to the effect plus its associated error term, isolating the systematic portion of the variance.
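The calculation is a one-liner; the sums of squares below are hypothetical placeholders standing in for the model and residual terms from an analysis:

```python
# Partial eta squared: the effect's sums of squares over the effect
# plus its own error term (hypothetical values for illustration).
ss_effect = 45.0  # e.g., between-conditions (model) sums of squares
ss_error = 5.0    # e.g., residual sums of squares for that effect

partial_eta_sq = ss_effect / (ss_effect + ss_error)  # 0.9
```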
Carryover Effects
- Ensure there's no possibility of carryover effects in the study design (when the effect of one condition carries over to other conditions).
- Address this by counterbalancing the order of tasks when a key variable is manipulated within subjects.
- Additional ways include systematically varying or randomly presenting different stimuli in different random orders.
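Full counterbalancing can be generated mechanically. As a sketch, using the four claimant categories from the organ-donor example (labels are placeholders), every possible presentation order is enumerated so participants can be spread across them:

```python
from itertools import permutations

# Full counterbalancing: every possible presentation order of the conditions.
conditions = [
    "internal-controllable",
    "internal-uncontrollable",
    "external-controllable",
    "external-uncontrollable",
]
orders = list(permutations(conditions))
# With four conditions there are 4! = 24 orders; assigning participants
# evenly across them puts each condition equally often in each position.
```

With many conditions, a Latin square (each condition once in each position) is a common smaller alternative to all $k!$ orders.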