What type of research design is appropriate for a factorial ANOVA analysis?
What kind of research question(s) led us to a factorial ANOVA analysis?
What is the difference between a one-way ANOVA and a factorial ANOVA?
How many DV(s) and IV(s) does a two-way ANOVA design have?
How many DV(s) and IV(s) does a four-way ANOVA design have?
What type of DV and IV are those?
How many levels / groups / categories does each IV have?
In a 2 × 4 factorial design:
In a 2 × 5 × 2 factorial design:
The purpose of factorial ANOVA is still to compare the means of the DV between groups
Main effect of college year: Does the mean confidence score differ among the four college years (100, 105, 112.5, and 122.5)?
Main effect of gender: Is there a difference in the mean confidence score between males and females (i.e., 107.5 vs. 112.5)?
Interaction: Do the average confidence scores differ between males and females across freshmen (i.e., 100-100), sophomores (110-100), juniors (115-110), and seniors (125-120)?
Interaction: Is the difference in mean confidence scores between males and females consistent across freshmen, sophomores, juniors, and seniors?
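The three questions above can be checked by hand from the cell means. The sketch below is only an illustration under stated assumptions: the first number in each pair is read as the male mean, the junior cell is taken as 115 and 110 (the pair that reproduces the quoted marginal means of 112.5 for juniors and 107.5 for one gender), and cell sizes are assumed equal so that marginal means are simple averages.

```python
# Cell means of confidence by college year, as (male, female) pairs.
# ASSUMPTIONS: first value = male; junior cell = (115, 110), the pair that
# matches the marginal means quoted in the slides; equal cell sizes.
cell_means = {
    "freshmen":   (100, 100),
    "sophomores": (110, 100),
    "juniors":    (115, 110),
    "seniors":    (125, 120),
}

# Main effect of college year: average over gender within each year.
year_means = {year: (m + f) / 2 for year, (m, f) in cell_means.items()}

# Main effect of gender: average over the four college years.
male_mean = sum(m for m, _ in cell_means.values()) / len(cell_means)
female_mean = sum(f for _, f in cell_means.values()) / len(cell_means)

# Interaction: is the male-female gap the same in every college year?
gender_gap = {year: m - f for year, (m, f) in cell_means.items()}

print("Year means:", year_means)
print("Gender means:", male_mean, female_mean)
print("Male - female gap by year:", gender_gap)
```

If the gap were identical in every year, the lines in an interaction plot would be parallel; unequal gaps are what the interaction term tests.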
The main effect of college year: Do students from different college years vary in their average confidence, regardless of gender?
The main effect of gender: Do female and male students differ in their average confidence, regardless of college year?
Factorial ANOVA still compares group means by calculating and comparing variances between and within the groups (like one-way ANOVA)
Between-group variance can be divided into three parts: between-group variance under factor A, between-group variance under factor B, and the moderating effect (interaction) between A and B
We’ll introduce some similar concepts (but with different names):
DV: One numerical DV – children’s number of injuries in three months
IVs / Factors: 1) costume type; 2) age
Design: 3 (costume type: Superman, Batman, Mickey) × 2 (age: 2-4 years; 5-8 years) factorial design
RQ: Do costume type and age influence the frequency of children getting injured over three months?
Main effect of the costume type
Main effect of age
Costume × Age interaction
The total variability in the DV (deviations of all observations from the grand mean) is represented by total sums of squares, SS_{Total}
SS_{Total} = ∑(x_{ijk} − \bar{x}_{GM})^2
Total Sum of Squares
SS_A = b × n × ∑(\bar{x}_A − \bar{x}_{GM})^2
SS_A = 3 × 5 × ((11.2-9.3)^2+(7.4-9.3)^2) ≈ 108.3
SS_B = a × n × ∑(\bar{x}_B − \bar{x}_{GM})^2
SS_B = 2 × 5 × ((4.8-9.3)^2+(9.9-9.3)^2+(13.2-9.3)^2) ≈ 358.2
Between-group Sum of Squares
Sum Square Costume, Age, Costume × Age
The variability due to group differences (deviations of the respective group means from the grand mean) for factor A, factor B, and the AB interaction is represented by three between-group sums of squares: SS_A, SS_B, and SS_{AB}
SS_{AB} = n × ∑(\bar{x}_{ij} − \bar{x}_A − \bar{x}_B + \bar{x}_{GM})^2
Or
SS_{Cell} = n × ∑(\bar{x}_{ij} − \bar{x}_{GM})^2
Then SS_{AB} = SS_{Cell} − SS_A − SS_B
SS_{Cell}= 5 × ((5.0-9.3)^2+ (12.4-9.3)^2 +(16.2-9.3)^2 + (4.6-9.3)^2 + (7.4-9.3)^2 + (10.2-9.3)^2) ≈ 511.1
SS_{AB} =511.1 – 108.3 – 358.2 ≈ 44.6
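The between-group sums of squares above can be reproduced from the six cell means alone. This is a minimal sketch, assuming equal cell sizes (n = 5) so that marginal and grand means are plain averages of the cell means; the variable names are illustrative, and the rows are labeled only as "first" and "second" age group because the slides do not say which is which.

```python
# Cell means from the worked example: rows = age groups (factor A),
# columns = costume types (factor B); n = 5 children per cell.
cell_means = [
    [5.0, 12.4, 16.2],   # first age group
    [4.6, 7.4, 10.2],    # second age group
]
n = 5
a = len(cell_means)       # number of levels of factor A (age)
b = len(cell_means[0])    # number of levels of factor B (costume)

# With equal cell sizes, marginal and grand means are simple averages.
mean_a = [sum(row) / b for row in cell_means]
mean_b = [sum(cell_means[i][j] for i in range(a)) / a for j in range(b)]
grand = sum(sum(row) for row in cell_means) / (a * b)

ss_a = b * n * sum((m - grand) ** 2 for m in mean_a)
ss_b = a * n * sum((m - grand) ** 2 for m in mean_b)
ss_cell = n * sum((x - grand) ** 2 for row in cell_means for x in row)
ss_ab = ss_cell - ss_a - ss_b

# Direct interaction formula, as a cross-check on the subtraction route.
ss_ab_direct = n * sum(
    (cell_means[i][j] - mean_a[i] - mean_b[j] + grand) ** 2
    for i in range(a)
    for j in range(b)
)

print(round(ss_a, 1), round(ss_b, 1), round(ss_cell, 1), round(ss_ab, 1))
```

Both routes to the interaction sum of squares agree, and the results match the hand calculations of 108.3, 358.2, 511.1, and 44.6.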
Between-group Sum of Squares Contd.
SS_{Error} = ∑(x_{ijk} − \bar{x}_{ij})^2
SS_{Error}= (4-5)^2+(8-5)^2+… (9-12.4)^2+(15-12.4)^2+… (18-16.2)^2+(15-16.2)^2+… (4-4.6)^2+(7-4.6)^2+… (4-7.4)^2+(10-7.4)^2+…(12-10.2)^2+(9-10.2)^2+… ≈ 155.2
Within-group Sum of Squares
SUM SQUARE RESIDUAL (ERROR)
The variability due to within-group individual differences (deviations of observations within a group from their corresponding group mean) is represented by the within-group sum of squares, SS_{Within-group} (or SS_{Error})
SS_{Total} = SS_{Model/Between-group} + SS_{Error}
SS_{Total} = SS_A + SS_B + SS_{AB} + SS_{Error} = SS_{Model} + SS_{Error}
666.3 = 108.3 + 358.2 + 44.6 + 155.2
Variance Partitioning Summary
IN THE CASE OF A TWO-WAY MODEL
A two-way factorial ANOVA model equation can be written as:
x_{ijk} = µ + α_i + β_j + αβ_{ij} + ε_{ijk}
x_{ijk} = individual observation
µ = grand population mean
ε_{ijk} = the error term, the extent to which individual observations within a population differ from each other
3 sources of variation between groups (New!)
α_i = the effect of factor A (or the degree that a particular level mean in A differs from the population mean)
β_j = the effect of factor B (or the degree that a particular level mean in B differs from the population mean)
αβ_{ij} = the effect of interaction (any “leftover” variation between a particular cell mean and the grand population mean once the effects of factor A and factor B have been subtracted)
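To make the three between-group terms concrete, the sketch below estimates them from the cell means of the costume example. It assumes equal cell sizes, so sample marginal means stand in for the population quantities; the names `alpha`, `beta`, and `alpha_beta` are illustrative labels for the estimated effects, not output from any package.

```python
# Cell means: rows = age groups (factor A), columns = costume types (factor B).
cell_means = [
    [5.0, 12.4, 16.2],
    [4.6, 7.4, 10.2],
]
a, b = len(cell_means), len(cell_means[0])

grand = sum(sum(row) for row in cell_means) / (a * b)
mean_a = [sum(row) / b for row in cell_means]
mean_b = [sum(cell_means[i][j] for i in range(a)) / a for j in range(b)]

# Estimated effects: how far each level mean sits from the grand mean.
alpha = [m - grand for m in mean_a]
beta = [m - grand for m in mean_b]

# Interaction: what is left of each cell mean after removing the grand mean
# and both main effects.
alpha_beta = [
    [cell_means[i][j] - grand - alpha[i] - beta[j] for j in range(b)]
    for i in range(a)
]

# Adding the pieces back together recovers every cell mean.
rebuilt = [
    [grand + alpha[i] + beta[j] + alpha_beta[i][j] for j in range(b)]
    for i in range(a)
]

print([round(x, 1) for x in alpha])
print([round(x, 1) for x in beta])
print([[round(x, 1) for x in row] for row in alpha_beta])
```

The estimated effects within each factor sum to zero, which is why each term can be read as a deviation from the grand mean.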
Factorial ANOVA Model Equation
In the degree of freedom (DF) column:
SS_{Model} = SS_A + SS_B + SS_{AB}
SS_{Total} = SS_A + SS_B + SS_{AB} + SS_{Error} = SS_{Model} + SS_{Error}
Signal-to-noise ratio: When H0 is true (means of groups do not differ), F-ratio ≈ 1; When H0 is not true (group means are different), F-ratio becomes bigger.
Cohen’s (1988) effect size rule-of-thumb: small (0.01); medium (0.06); large (0.14)
More often used in factorial ANOVA
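As a rough illustration, the sketch below computes two common effect sizes from the sums of squares in the costume example. It assumes the usual definitions, eta-squared = SS_effect / SS_Total and partial eta-squared = SS_effect / (SS_effect + SS_Error); the write-up later in these slides reports the partial version.

```python
# Sums of squares from the worked example.
ss_effects = {"age": 108.3, "costume": 358.2, "age x costume": 44.6}
ss_error = 155.2
ss_total = sum(ss_effects.values()) + ss_error

# Eta-squared: share of the total variability attributed to each effect.
eta_sq = {name: ss / ss_total for name, ss in ss_effects.items()}

# Partial eta-squared: the effect relative to itself plus error only,
# so the other effects in the model are left out of the denominator.
partial_eta_sq = {name: ss / (ss + ss_error) for name, ss in ss_effects.items()}

for name in ss_effects:
    print(name, round(eta_sq[name], 2), round(partial_eta_sq[name], 2))
```

Because the partial version leaves the other effects out of the denominator, it is never smaller than eta-squared for the same effect.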
MS = mean square (sums of squares divided by degrees of freedom)
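The ANOVA table entries follow mechanically from the sums of squares. The sketch below is a hedged reconstruction for the costume example, assuming a balanced 2 × 3 design with n = 5 per cell and the standard degrees of freedom: a − 1, b − 1, (a − 1)(b − 1), and ab(n − 1) for error. No p-values are computed here.

```python
a, b, n = 2, 3, 5   # levels of age, levels of costume, children per cell

ss = {"age": 108.3, "costume": 358.2, "age x costume": 44.6, "error": 155.2}
df = {
    "age": a - 1,
    "costume": b - 1,
    "age x costume": (a - 1) * (b - 1),
    "error": a * b * (n - 1),
}

# Mean square = sum of squares divided by its degrees of freedom.
ms = {source: ss[source] / df[source] for source in ss}

# F-ratio = effect mean square (signal) over error mean square (noise).
f_ratio = {
    source: ms[source] / ms["error"] for source in ss if source != "error"
}

for source in f_ratio:
    print(f"{source}: df = {df[source]}, MS = {ms[source]:.2f}, "
          f"F({df[source]}, {df['error']}) = {f_ratio[source]:.2f}")
```

The F-ratios agree with the values in the reported results: 16.75 for age, 27.70 for costume, and 3.45 for the interaction, each on 24 error degrees of freedom.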
For the rest of the statistical analyses in this unit (regression, ANOVA, non-parametric analyses), we’ll follow a standard process:
Before getting into the data, we must understand (design steps):
1. The research questions and hypotheses we are trying to address with our data
2. Our sampling population
3. How our variables are measured (type and scale)
Once we get into the data analysis, we (statistics steps):
4. Describe variables using appropriate UNIVARIATE numeric and graphical summaries
5. Describe variables using appropriate BIVARIATE numeric and graphical summaries
6. Formally test assumptions
7. Fit appropriate statistical model(s)
8. Interpret results + draw conclusions
egen cos_age = group(cos age), label
tabstat freq, by(cos_age) stat(n mean sd skewness kurtosis)
5. Describe variables using appropriate BIVARIATE numeric and graphical summaries
tab age cos, summarize(freq)
Marginal means of age group; marginal means of costume type; cell means
Margins commands must follow the ANOVA command!
Note: The IV for the x-axis should come first! Reversing the order of the IVs in the command generates an alternate plot (see next slide)
LINEAR GRAPH
Only need to use one plot!
Typically, put factor(s) with more levels on the x-axis!
BAR GRAPH
5. Describe variables using appropriate BIVARIATE numeric and graphical summaries
net install cibar.pkg
cibar freq, over(age cos)
histogram freq, by(cos age)
Shapiro–Wilk W test for normal data
6. Formally test assumptions
NORMALITY
In line with the descriptive stats and histograms
Summary of freq
6. Formally test assumptions
HOMOGENEITY OF VARIANCE
A 2 × 3 between-subjects ANOVA was conducted to examine the effects of age (2-4-year-old, 5-8-year-old) and costume type (Mickey, Superman, Batman) on the frequency of children getting injured over three months. Results indicated a significant main effect of the kind of costume children wore, F(2, 24) = 27.70, p < .001, ηp² = .70, and a significant main effect of age, F(1, 24) = 16.75, p < .001, ηp² = .41. However, the main effects were qualified by a significant interaction between age and costume type, F(2, 24) = 3.45, p = .048, ηp² = .22. Follow-up analyses indicated that …
Factorial ANOVA is an extension of one-way ANOVA, but we are dealing with more than one IV (factor), and all the levels of each IV (factor) are fully crossed with all levels of other IVs (factors)
These factors can affect the DV independently or in a combined manner (moderating effect)
Main effects – What is the effect of a factor across all levels of another factor(s)? How do one factor's level means differ, ignoring all other factors?
Interaction effects – Is the effect of one factor the same or different at various levels of another factor(s)? Are the differences between all levels of one factor the same or different at each level of another factor(s)?
Factorial ANOVA still compares group means via comparing between- vs. within-group variances; however, the between-group variance can be further partitioned into factor(s) variances and the interaction variance (more F-values than the one-way ANOVA!)
As an extension of one-way ANOVA, factorial ANOVA has assumptions similar to those of one-way ANOVA
We will continue with the follow-up analyses in Week 8!
After this week’s lecture, you know: