Two-Way Between-Subjects ANOVA Notes
Week 7: Two-Way Between-Subjects ANOVA
Learning Objectives
Appreciate the differences between one-way and two-way between-subjects ANOVA designs.
Identify the application of two-way between-subjects ANOVA and provide examples of research using this technique.
Interpret and formally report the results of two-way between-subjects ANOVA in SPSS.
Appreciate how to perform planned and unplanned follow-up tests.
Factorial Designs
ANOVA with more than one factor = factorial ANOVA.
Two-way ANOVA is an extension of one-way ANOVA with one additional factor (IV).
Two-way ANOVA involves 2 factors.
Three-way ANOVA involves 3 factors, and so on.
In factorial ANOVA, variables can interact; the effect of one factor may differ based on the level of the other factor.
Examples Using Two-Way Between-Subjects ANOVA
Example 1
DV = athletic performance
IV1 = age (young or old)
IV2 = training setting (novel or familiar)
This example represents a 2x2 between-subjects ANOVA.
Example 2
DV = weight loss %
IV1 = gender (males, females, or non-binary)
IV2 = weight loss intervention (online, in person, or control)
This example represents a 3x3 between-subjects ANOVA.
Example 3
DV = trust in healthcare professionals
IV1 = anxiety classification (mild, moderate, severe)
IV2 = depression classification (mild, moderate, severe)
This example represents a 3x3 between-subjects ANOVA.
Two-Way Between-Subjects ANOVA: Published Examples
Example 1: Luxury Car Ownership and Attractiveness
Dunn and Searle (2010) studied the effect of luxury car ownership on attractiveness ratings.
Heterosexual participants rated the attractiveness of a driver of the opposite sex seated in either a Ford Fiesta or a Bentley Continental.
Design: 2x2 between-subjects ANOVA.
Example 2: Contact with Mental Illness and Stigma
Lee and Seo (2018) hypothesized that contact with mental illness is an effective anti-stigma strategy.
Participants' contact with mental illness was measured at three levels: personal, public, indirect.
Participants were randomly allocated to vignettes of different illnesses (alcoholism, depression, schizophrenia).
Dependent Variables: dangerousness (DV1) and social distance (DV2) associated with the illness.
Design: 3x3 between-subjects ANOVAs.
Recap: One-Way vs. Two-Way Between-Subjects ANOVA
One-Way ANOVA Example
30 students divided into three groups: morning (n=10), afternoon (n=10), and evening (n=10).
Research Question: Is there a main effect of time of day of learning on recall?
Two-Way ANOVA Extension
Adding caffeine as another factor:
30 students divided into groups considering both time of day and caffeine:
Morning: Caffeine, No Caffeine
Afternoon: Caffeine, No Caffeine
Evening: Caffeine, No Caffeine
Design: 3x2 design
Research Questions:
Main effect of time of day of learning on recall.
Main effect of caffeine on recall.
An interaction between time of day and caffeine.
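The conditions of a factorial design are every combination of the factor levels, which is why a 3x2 design has 6 conditions. A minimal Python sketch of that counting follows; the equal split of the 30 students into cells of 5 is an assumption for illustration and is not stated in the notes.

```python
from itertools import product

# Factor levels from the time-of-day x caffeine example
time_of_day = ["Morning", "Afternoon", "Evening"]
caffeine = ["Caffeine", "No Caffeine"]

# Every combination of levels is one between-subjects condition (cell)
conditions = list(product(time_of_day, caffeine))
n_conditions = len(conditions)  # 3 x 2

# ASSUMPTION: the 30 students are allocated equally across the cells
n_per_cell = 30 // n_conditions

for time_level, caffeine_level in conditions:
    print(f"{time_level} / {caffeine_level}: n = {n_per_cell}")
```

The same product rule gives 4 conditions for a 2x2 design and 9 for a 3x3 design.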
Today’s Worked Example: 2x2 Between-Subjects ANOVA
Investigating the effects of training setting and age on running time
IV1 = Training setting (2 levels: Novel or Familiar)
IV2 = Age (2 levels: Young or Old)
DV = Running time (seconds)
Conditions:
Novel Young (n=3)
Familiar Young (n=3)
Novel Old (n=3)
Familiar Old (n=3)
Operationalizing the Variables:
Participants: Elite long-distance runners
DV: Time to run 100 meters (higher = slower)
IV1 = Training setting
Novel: Train in a new setting
Familiar: Train in a normal setting
IV2 = Age
Young: 18-25 years
Old: 25-35 years
Hypotheses
Main effect of age: Older long-distance runners would be faster at running 100 meters than younger long-distance runners.
Main effect of training setting: Runners training in a familiar environment would be faster at running 100 meters than runners training in a novel environment.
Interaction (age x training setting): There would be an interaction between age and training setting.
Visualizing the Data
Data points: Each represents the time taken to run 100 meters (the DV).
Total participants: 12
| Training setting (Factor 1) | Young (Factor 2) | Old (Factor 2) |
|---|---|---|
| Novel | 13.6 | 12.7 |
| | 14.0 | 12.4 |
| | 14.5 | 12.1 |
| Familiar | 11.8 | 11.4 |
| | 12.1 | 11.7 |
| | 12.1 | 11.9 |
Assumptions for ANOVA
The dependent variable (DV) is measured at the interval or ratio level.
The data are drawn from a population which is normally distributed.
There is homogeneity of variance (samples are drawn from populations with the same variance).
For independent groups designs, independent random samples must have been taken from each population.
Main Effects
Main effects look at the effect of each factor on its own (the independent effect of a factor).
In our example:
Main effect of training setting.
Main effect of age.
Main Effect – Training Setting
Compare the mean for one level of a factor with the mean of the other level(s).
Comparing the mean familiar training running time to the mean novel training running time
Main Effect – Age
Compare mean young running time to mean older running time
Consider how ‘young’ and ‘old’ were defined
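The marginal means that each main effect compares can be checked by hand. The sketch below is one way to do that in Python; the dictionary layout and the helper `marginal_mean` are illustrative names, and the scores are placed in cells on the assumption that the cell means should match Table 1.

```python
from statistics import mean

# Running times (seconds) keyed by (training setting, age).
# ASSUMPTION: scores are placed in cells so that the cell means match Table 1.
scores = {
    ("Novel", "Young"): [13.6, 14.0, 14.5],
    ("Novel", "Old"): [12.7, 12.4, 12.1],
    ("Familiar", "Young"): [11.8, 12.1, 12.1],
    ("Familiar", "Old"): [11.4, 11.7, 11.9],
}


def marginal_mean(factor_index, level):
    """Mean of all scores at one level of a factor, ignoring the other factor."""
    pooled = [x for cell, xs in scores.items() if cell[factor_index] == level for x in xs]
    return mean(pooled)


# Main effect of training setting: Novel vs. Familiar
setting_means = {lvl: marginal_mean(0, lvl) for lvl in ("Novel", "Familiar")}
# Main effect of age: Young vs. Old
age_means = {lvl: marginal_mean(1, lvl) for lvl in ("Young", "Old")}

print({k: round(v, 2) for k, v in setting_means.items()})
print({k: round(v, 2) for k, v in age_means.items()})
```

Each marginal mean pools six runners, three from each level of the other factor.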
Interactions
The combined effect of the factors.
Example: Young people training in the novel setting have high running times (i.e., are slower).
This is described as a 2-way interaction as it is between two factors.
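One way to see what an interaction measures is as a difference of differences: the effect of training setting for young runners minus the same effect for old runners. The sketch below works this out from the cell means; the variable names are illustrative, and the scores are placed in cells on the assumption that the cell means should match Table 1.

```python
from statistics import mean

# Running times (seconds) keyed by (training setting, age).
# ASSUMPTION: scores are placed in cells so that the cell means match Table 1.
scores = {
    ("Novel", "Young"): [13.6, 14.0, 14.5],
    ("Novel", "Old"): [12.7, 12.4, 12.1],
    ("Familiar", "Young"): [11.8, 12.1, 12.1],
    ("Familiar", "Old"): [11.4, 11.7, 11.9],
}
cell_mean = {cell: mean(xs) for cell, xs in scores.items()}

# Effect of training setting (novel minus familiar) within each age group
effect_young = cell_mean[("Novel", "Young")] - cell_mean[("Familiar", "Young")]
effect_old = cell_mean[("Novel", "Old")] - cell_mean[("Familiar", "Old")]

# If there were no interaction, these two effects would be about equal
interaction_contrast = effect_young - effect_old

print(f"Setting effect, young: {effect_young:.2f} s")
print(f"Setting effect, old:   {effect_old:.2f} s")
print(f"Difference of differences: {interaction_contrast:.2f} s")
```

A contrast near zero would mean the setting effect is similar at both ages; whether a non-zero contrast is significant is decided by the interaction F-test.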
Calculating Two-way ANOVA
| Source | SS | df | MS | F | p |
|---|---|---|---|---|---|
| Factor A (Training setting) | SSA | dfA = a-1 | MSA = SSA / dfA | MSA / MSerror | From SPSS |
| Factor B (Age) | SSB | dfB = b-1 | MSB = SSB / dfB | MSB / MSerror | From SPSS |
| Factor A x Factor B | SSAxB | dfAxB = (a-1)(b-1) | MSAxB = SSAxB / dfAxB | MSAxB / MSerror | From SPSS |
| Error | SSerror | dferror = N - ab | MSerror = SSerror / dferror | | |
| Total | SStotal | dftotal = N - 1 | | | |
Key for Calculations
A = Factor A (Training setting)
B = Factor B (Age)
a = number of levels in Factor A (2)
b = number of levels in Factor B (2)
N = total number of values (participants) (12)
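The table can be filled in by hand for the worked example. The sketch below does so in plain Python for a balanced design (equal n per cell); the variable names are illustrative, the scores are placed in cells on the assumption that the cell means should match Table 1, and p values are left to SPSS because the standard library has no F distribution.

```python
from statistics import mean

# Running times (seconds) keyed by (training setting, age).
# ASSUMPTION: scores are placed in cells so that the cell means match Table 1.
scores = {
    ("Novel", "Young"): [13.6, 14.0, 14.5],
    ("Novel", "Old"): [12.7, 12.4, 12.1],
    ("Familiar", "Young"): [11.8, 12.1, 12.1],
    ("Familiar", "Old"): [11.4, 11.7, 11.9],
}

a = 2  # levels of Factor A (training setting)
b = 2  # levels of Factor B (age)
all_scores = [x for xs in scores.values() for x in xs]
N = len(all_scores)   # total participants
n = N // (a * b)      # participants per cell (balanced design)
grand_mean = mean(all_scores)


def level_mean(factor_index, level):
    """Marginal mean for one level of a factor."""
    return mean([x for cell, xs in scores.items() if cell[factor_index] == level for x in xs])


# Sums of squares (these formulas assume equal cell sizes)
ss_a = n * b * sum((level_mean(0, lvl) - grand_mean) ** 2 for lvl in ("Novel", "Familiar"))
ss_b = n * a * sum((level_mean(1, lvl) - grand_mean) ** 2 for lvl in ("Young", "Old"))
ss_cells = n * sum((mean(xs) - grand_mean) ** 2 for xs in scores.values())
ss_axb = ss_cells - ss_a - ss_b
ss_error = sum((x - mean(xs)) ** 2 for xs in scores.values() for x in xs)

# Degrees of freedom
df_a, df_b = a - 1, b - 1
df_axb = df_a * df_b
df_error = N - a * b

# Mean squares and F ratios
ms_error = ss_error / df_error
f_a = (ss_a / df_a) / ms_error
f_b = (ss_b / df_b) / ms_error
f_axb = (ss_axb / df_axb) / ms_error

rows = [("Training setting", ss_a, df_a, f_a), ("Age", ss_b, df_b, f_b),
        ("Training setting x Age", ss_axb, df_axb, f_axb)]
for name, ss, df, f in rows:
    print(f"{name:<24} SS = {ss:6.3f}  df = {df}  F = {f:6.2f}")
print(f"{'Error':<24} SS = {ss_error:6.3f}  df = {df_error}")
```

With these scores the F ratios agree with the values reported later in these notes (59.39, 30.01 and 13.11, each on 1 and 8 degrees of freedom).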
SPSS Output for 2-Way Between-Subjects ANOVA
The SPSS printout contains 5 sections:
Between-Subjects Factors
Descriptive statistics
Levene’s Test of Equality of Error Variances
Tests of Between-Subjects Effects
Profile plots (the graph of the interaction)
SPSS Output: Between-Subjects Factors & Descriptive Statistics
Between-Subjects Factors
This table displays the levels of the factors.
Descriptive Statistics
This table displays the means and SDs for:
The two factors separately.
The two factors interacted.
SPSS Output: Levene’s Test & Tests of Between-Subjects Effects
Levene’s test of Equality of Error Variances
Non-significant result (e.g., p = .611) shows that the variances within the groups are not significantly different from each other.
If the assumption of homogeneity of variance is met, you can proceed to interpret the ANOVA results.
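Levene's statistic can be understood as a one-way ANOVA run on each score's absolute deviation from its own group mean. The sketch below assumes that mean-centred form and the same cell assignment as Table 1; it computes the statistic only, and the p value is still read from SPSS.

```python
from statistics import mean

# The four conditions of the worked example (running times in seconds).
# ASSUMPTION: scores are placed in cells so that the cell means match Table 1.
groups = {
    "Novel Young": [13.6, 14.0, 14.5],
    "Novel Old": [12.7, 12.4, 12.1],
    "Familiar Young": [11.8, 12.1, 12.1],
    "Familiar Old": [11.4, 11.7, 11.9],
}

# Step 1: absolute deviation of each score from its own group mean
abs_dev = {g: [abs(x - mean(xs)) for x in xs] for g, xs in groups.items()}

# Step 2: one-way ANOVA on those absolute deviations
all_dev = [z for zs in abs_dev.values() for z in zs]
grand = mean(all_dev)
k = len(abs_dev)   # number of groups
N = len(all_dev)   # total number of scores

ss_between = sum(len(zs) * (mean(zs) - grand) ** 2 for zs in abs_dev.values())
ss_within = sum((z - mean(zs)) ** 2 for zs in abs_dev.values() for z in zs)
df_between, df_within = k - 1, N - k

levene_w = (ss_between / df_between) / (ss_within / df_within)
print(f"Levene W({df_between}, {df_within}) = {levene_w:.3f}")
```

A statistic below 1 means the groups differ in spread by less than would be expected by chance, which is in line with a non-significant result.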
Tests of Between-Subjects Effects
This table displays the ANOVA statistics needed for reporting the two main effects and interaction.
Reporting Main Effects
Main Effect of Training Setting
F(df_factor, df_error) = F value, p value
Example: F(1, 8) = 59.39, p < .001
Interpretation: There is a significant effect of training setting on 100-meter running time.
Main Effect of Age
F(df_factor, df_error) = F value, p value
Example: F(1, 8) = 30.01, p < .001
Interpretation: There is a significant effect of age on 100-meter running time.
Reporting Interactions
Interaction Training Setting * Age
F(df_interaction, df_error) = F value, p value
Example: F(1, 8) = 13.11, p = .007
Interpretation: There is a significant interaction between training setting and age on 100-meter running time.
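The reporting pattern above can be written as a small formatting helper. This is only a sketch: `format_p` and `format_f` are hypothetical names, and the placeholder used for the two main effects is needed because the notes report those p values only as p < .001.

```python
def format_p(p):
    """APA style: drop the leading zero and report very small values as p < .001."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")


def format_f(df_effect, df_error, f_value, p):
    """Build a string of the form F(df_effect, df_error) = F value, p value."""
    return f"F({df_effect}, {df_error}) = {f_value:.2f}, {format_p(p)}"


# PLACEHOLDER: the notes give only "p < .001" for the main effects, so any
# value below .001 stands in for the exact figure on the SPSS printout.
P_BELOW_001 = 0.0005

report = {
    "Training setting": format_f(1, 8, 59.39, P_BELOW_001),
    "Age": format_f(1, 8, 30.01, P_BELOW_001),
    "Training setting x Age": format_f(1, 8, 13.11, 0.007),
}
for effect, text in report.items():
    print(f"{effect}: {text}")
```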
Effect Size Calculation
The effect size for ANOVA is called eta squared, or η^2
η^2 = SS_effect / SS_total
SPSS calculates partial eta squared; for one-way ANOVA, this is the same as eta squared.
Cohen’s (1988) guidelines for η^2:
Small: 0.01
Medium: 0.059
Large: 0.138
For two-way ANOVA, we need to calculate the effect size/s by hand
Effect Size: Training setting
η^2 = SS_setting / SS_total (where SS_total is the corrected total)
η^2 = 5.741 / 10.683 = 0.537
A large effect size, according to Cohen (1988).
Effect Size: Age
η^2 = SS_age / SS_total (where SS_total is the corrected total)
Effect Size: Training Setting * Age
η^2 = SS_interaction / SS_total (where SS_total is the corrected total)
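The three hand calculations can be sketched together, alongside the partial eta squared that SPSS prints, to show why the two differ in a two-way design. The sums of squares below are recomputed from the twelve raw scores (rounded to three decimals) and should be treated as assumptions rather than as the SPSS printout; `cohen_label` is a hypothetical helper applying the thresholds listed above.

```python
# Sums of squares recomputed from the twelve raw scores (rounded to 3 dp).
# ASSUMPTION: these stand in for the SPSS "Tests of Between-Subjects Effects" table.
ss = {
    "Training setting": 5.741,
    "Age": 2.901,
    "Training setting x Age": 1.268,
}
ss_error = 0.773
ss_total = sum(ss.values()) + ss_error  # corrected total


def cohen_label(eta_sq):
    """Label an eta squared value using the Cohen (1988) thresholds in these notes."""
    if eta_sq >= 0.138:
        return "large"
    if eta_sq >= 0.059:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "below small"


# Eta squared: share of the total variability attributed to each effect
eta_sq = {name: value / ss_total for name, value in ss.items()}
# Partial eta squared (SPSS): the other effects are removed from the denominator
partial_eta_sq = {name: value / (value + ss_error) for name, value in ss.items()}

for name in ss:
    print(f"{name:<24} eta^2 = {eta_sq[name]:.3f} ({cohen_label(eta_sq[name])}), "
          f"partial eta^2 = {partial_eta_sq[name]:.3f}")
```

Partial eta squared is always at least as large as eta squared, so the SPSS column cannot be compared directly against the thresholds for eta squared.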
Activity – Reporting Two-way ANOVA
Study with 2 factors: Caffeine (2 levels) and Drug (2 levels).
The DV is Alertness level.
Statistically report (with effect sizes) the following:
Main effect of caffeine.
Main effect of drug.
Interaction between caffeine and drug.
Profile Plots
Graph of the interaction, allowing you to visually inspect your data
What is an Interaction?
A significant interaction occurs when the effect of one factor differs according to the level of another factor.
Inspect the lines on the profile plot:
If the lines are parallel, it suggests that there is no significant interaction.
If the lines are not parallel, it suggests that there may be a significant interaction.
Non-Significant Interaction: Example
If the lines on the graph are parallel to each other, this suggests that the interaction is non-significant.
If the effects of caffeine and task difficulty are independent, there will be no interaction between them.
Significant Interaction: Example
If the lines on the graph are not parallel to each other, this suggests that there may be a significant interaction; the F-test for the interaction confirms whether it is.
Caffeine and task interact
The effects of caffeine differ for hard and easy tasks.
Significant Interaction Example: Crossover
This type of interaction can be referred to as a crossover interaction: the means cross over one another in different situations.
Performance was better on the easy task without caffeine and better on the hard task with caffeine.
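The three profile-plot patterns can be told apart from the four cell means alone. The sketch below assumes task is on the x-axis with one line per caffeine condition; the function name and all of the performance scores are hypothetical, invented purely to illustrate each pattern, and the result describes the plot rather than testing significance.

```python
def describe_profile(means, tol=1e-9):
    """Classify a 2x2 profile plot as 'parallel', 'crossover' or 'non-parallel'.

    means maps (task, caffeine) to a cell mean.
    """
    # Effect of caffeine (caffeine minus no caffeine) within each task
    effect_easy = means[("Easy", "Caffeine")] - means[("Easy", "No caffeine")]
    effect_hard = means[("Hard", "Caffeine")] - means[("Hard", "No caffeine")]

    if abs(effect_easy - effect_hard) <= tol:
        return "parallel"        # same caffeine effect for both tasks
    if effect_easy * effect_hard < 0:
        return "crossover"       # caffeine helps one task and hurts the other
    return "non-parallel"        # same direction, different size


# HYPOTHETICAL cell means (performance scores), not data from the lecture
parallel = {("Easy", "No caffeine"): 80, ("Easy", "Caffeine"): 85,
            ("Hard", "No caffeine"): 50, ("Hard", "Caffeine"): 55}
non_parallel = {("Easy", "No caffeine"): 80, ("Easy", "Caffeine"): 82,
                ("Hard", "No caffeine"): 50, ("Hard", "Caffeine"): 65}
crossover = {("Easy", "No caffeine"): 80, ("Easy", "Caffeine"): 70,
             ("Hard", "No caffeine"): 50, ("Hard", "Caffeine"): 65}

for label, data in [("A", parallel), ("B", non_parallel), ("C", crossover)]:
    print(label, describe_profile(data))
```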
Interpreting and Reporting Factorial ANOVA
State the ANOVA type, effect of factor/s on DV
Present means and SDs in table
Mention the assumptions
Report the ANOVA results giving df, F-ratio & p-value
Report these for all main effects and interaction/s
Report the effect size and what this size means
Report these for all main effects and interaction/s
Reporting comparisons
Formally Reporting the Results: 1&2
A two-way between-subjects ANOVA was conducted to examine the effects of training setting and age on 100-meter running time.
The means and standard deviations for running times for training setting and age group are shown in Table 1.
Table 1: The mean (and standard deviation) running times for the training setting and age groups
| Training setting | Young | Old |
|---|---|---|
| Novel | 14.03 (0.45) | 12.40 (0.30) |
| Familiar | 12.00 (0.17) | 11.67 (0.25) |
Reporting the Results: 3, 4 & 5
Initial analyses were carried out to ensure no violation of the assumptions. Levene’s test for homogeneity of variance was non-significant, p = .611, suggesting this assumption had been met.
There was a significant main effect of training setting on running time, F(1, 8) = 59.39, p < .001, η^2 = .537, a large effect size.
There was a significant main effect of age on running time, F(1, 8) = 30.01, p < .001, η^2 = .272, a large effect size.
There was a significant two-way interaction between training setting and age, F(1, 8) = 13.11, p = .007, η^2 = .119, a medium-to-large effect size.
Reporting the Results
In today’s ANOVA, both of our factors have only 2 levels:
Training setting = 2 levels.
Age = 2 levels.
Therefore, we do not need to perform follow-up tests on the main effects because we can determine from the mean values alone which level of the factor is higher or lower.
Reporting the Results: 6
Significantly slower running times were found for younger runners (M = 13.02, SD = 1.15) than older runners (M = 12.03, SD = 0.47).
Significantly slower running times were found for those training in a novel training setting (M = 13.22, SD = 0.96), compared to a familiar setting (M = 11.83, SD = 0.27).
Interpreting Factorial ANOVA
The ANOVA table will show which main effects and interaction terms are significant, but not how to interpret them:
Main effects = follow up with planned or unplanned comparisons for factors with more than 2 levels
Interactions = follow up with simple effects (using t-tests)
Following up Main Effects: Planned or Unplanned Comparisons
We use planned or unplanned comparisons when we have compared 3 or more means and we want to see which ones significantly differ from each other
Today’s example was a 2x2 ANOVA: each of our factors only has two levels (novel vs familiar; young vs old)
If we had 3 or more levels we would conduct the comparisons in the same way as for a one-way ANOVA
Following up Interactions
In our example, there was a significant interaction between training setting and age, so we can explore this interaction
Look at the condition means plotted onto a line graph
Statistically examine the interaction using simple effects
Interpreting a Two-way Interaction Visually
Step 1: examine the graph to see where differences might be occurring
In novel and familiar training settings, older participants perform faster
In familiar settings, both old and young participants perform faster compared to novel training settings
Interpreting a Two-way Interaction using Simple Effects
Step 2: statistically examine the interaction using simple effects
We still do not know exactly which points on the interaction plot are significantly different, so we need to statistically examine them
You will not be tested on this in the exam but it’s good to know that it is possible for your final year projects
Simple Effects
t-tests run between combinations of two different levels of the factor/IVs
Because we have an independent-measures ANOVA design, we use independent-samples t-tests
In a 2 x 2 design we would perform 4 t-tests:
Novel young vs. Novel old
Familiar young vs. Familiar old
Young Familiar vs. Young Novel
Old Familiar vs. Old Novel
Bonferroni Adjustment or Correction to the p value
Performing multiple tests increases the likelihood of Type 1 error (Lecture 4)
We can control for this by using a Bonferroni adjustment/ correction
It divides our acceptable probability level (p<0.05) by the number of comparisons we wish to make. E.g.:
2 comparisons = 0.05 / 2 = 0.025, so we would then adopt the more stringent probability level of 0.025
6 comparisons = 0.05 / 6 = 0.008, so we would then adopt the more stringent probability level of 0.008
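The adjustment is a single division, sketched below with a hypothetical helper name. The 4-comparison case is included because a 2x2 design calls for four simple-effects t-tests.

```python
def bonferroni_alpha(n_comparisons, alpha=0.05):
    """Per-comparison significance level after a Bonferroni correction."""
    return alpha / n_comparisons


for k in (2, 4, 6):
    print(f"{k} comparisons: adopt p < {bonferroni_alpha(k):.4f}")
```

Each t-test is then judged against the adjusted level rather than against 0.05.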
Simple Effects: what the results look like
Novel young vs. novel old
Familiar young vs. familiar old
Young novel vs. young familiar
Old novel vs. old familiar
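As a rough illustration of what lies behind each of these four comparisons, the sketch below computes an independent-samples t statistic by hand, assuming equal variances (Student's pooled form). The helper `pooled_t` is a hypothetical name, the scores are placed in cells on the assumption that the cell means should match Table 1, and the p values, which must be compared against the Bonferroni-adjusted level, are left to SPSS.

```python
from math import sqrt
from statistics import mean, variance

# Running times (seconds) keyed by (training setting, age).
# ASSUMPTION: scores are placed in cells so that the cell means match Table 1.
scores = {
    ("Novel", "Young"): [13.6, 14.0, 14.5],
    ("Novel", "Old"): [12.7, 12.4, 12.1],
    ("Familiar", "Young"): [11.8, 12.1, 12.1],
    ("Familiar", "Old"): [11.4, 11.7, 11.9],
}


def pooled_t(x, y):
    """Student's independent-samples t (equal variances assumed) and its df."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    se = sqrt(pooled_var * (1 / nx + 1 / ny))
    return (mean(x) - mean(y)) / se, df


comparisons = {
    "Novel young vs. novel old": (("Novel", "Young"), ("Novel", "Old")),
    "Familiar young vs. familiar old": (("Familiar", "Young"), ("Familiar", "Old")),
    "Young novel vs. young familiar": (("Novel", "Young"), ("Familiar", "Young")),
    "Old novel vs. old familiar": (("Novel", "Old"), ("Familiar", "Old")),
}

results = {label: pooled_t(scores[c1], scores[c2])
           for label, (c1, c2) in comparisons.items()}
for label, (t_value, df) in results.items():
    print(f"{label}: t({df}) = {t_value:.2f}")
```

A positive t here means the first-named condition had the longer (slower) mean running time.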
Concluding and Interpreting the Results
The results show that training setting has an effect on running time, suggesting that running time is faster when training in a familiar setting, compared to a novel setting.
The results also showed that age has an effect on running time, with older runners (25-35 years) faster than younger runners (18-25 years).
Training setting and age also interacted.
Summary of Between-Subjects Factorial ANOVA Method
Run a factorial between-subjects ANOVA
Check assumptions
Check homogeneity of variance test
Interpret ANOVA table for main effect and interactions
Calculate effect size for main effect and interactions
Conduct any necessary follow up comparisons (if any factor has >2 levels)
Report results clearly and concisely
Dunn & Searle (2010) Findings
2 x 2 between-subjects ANOVA:
Car status (neutral/ high) x sex of rater (male/ female)
Main effect of car status
Main effect of sex
Car status x sex interaction
Females rated males as more attractive when driving a high-status car: females were affected by car status, but males were not.
Summary
Two-way between-subjects ANOVA
Two factors manipulated between-subjects
>1 factor: factorial ANOVA
Report:
The main effect of Factor 1
The main effect of Factor 2
The two-way interaction between Factor 1 and Factor 2
Non-parallel lines on an interaction graph indicate there may be a significant interaction
You can interpret a significant interaction visually and by using simple effects (t-tests: you won’t be examined on this).
Activity Questions
Study Description
Researchers were interested in the effect of types of words (negative and neutral) and anxiety (anxious or not anxious) on recall of words.
It was predicted that participants diagnosed with anxiety would have higher levels of recall of negative words, compared to participants without anxiety.
Poll Questions
How many factors are there?
What is the number of levels in each factor?
How many conditions are there?
Is there a significant main effect of word type?
Is there a significant main effect of anxiety group?
Is there a significant interaction between word type and anxiety group?
Is the interaction graph of the means consistent with the prediction made by the researcher?
Key Reading
Harrison, V., Kemp, R., Brace, N., & Snelgar, R. (2021). SPSS for Psychologists (and everybody else) (7th Ed.) pp. 193-200 (Chapter 9).
Coolican, H. (2019). Research Methods and Statistics in Psychology (7th Ed.). London: Psychology Press. pp. 653-665.
Note: please ignore the “Calculation of a two-way unrelated ANOVA” section on page 663
Additional Reading (Papers Cited in the Lecture)
Dunn, M.J., & Searle, R. (2010). Effect of manipulated prestige-car ownership on both sex attractiveness ratings. British Journal of Psychology, 101, 69-80. http://dx.doi.org/10.1348/000712609X417319
Lee, M., & Seo, M. (2018). Effect of direct and indirect contact with mental illness on dangerousness and social distance. International Journal of Social Psychiatry, 64(2), 112-119. doi:10.1177/0020764017748181