What is the problem with running multiple tests (all possible pairwise comparisons)?
running multiple tests inflates the probability of getting at least one Type I error
• the more tests you run, the more opportunities for a Type I error
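
As a quick illustration (toy numbers, not from the notes): the chance of at least one Type I error across m independent tests at α = 0.05 is 1 - (1 - α)^m, which grows quickly with m.

    # hypothetical illustration: familywise Type I error rate for m independent tests
    alpha = 0.05
    for m in (1, 3, 6, 10):
        familywise = 1 - (1 - alpha) ** m   # P(at least one false positive)
        print(m, round(familywise, 3))      # 1 -> 0.05 ... 10 -> ~0.4
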
ANOVA
Analysis of variance (ANOVA) compares the means of multiple groups simultaneously in a single analysis
• null assumption that all groups have the same true mean is equivalent to saying that each group sample is drawn from the same population
• but each group sample is bound to have a different mean due to sampling error
• ANOVA determines if there is more variance among sample means than we would expect from sampling error alone
Hypotheses of an ANOVA
H₀: µ₁ = µ₂ = µ₃ = ... = µₖ (the means in all k groups are equal)
Ha: mean of at least one group is different from at least one other group
ANOVA: two measures of variation
1. Group mean square (MSgroups)
2. Error mean square (MSerror)
Group mean square (MSgroups)
Is proportional to the observed amount of variance among group sample means
• variation among groups
Error mean square (MSerror)
Estimates the variance among subjects that belong to each group
• variation within groups
• how much spread is within the group?
- more spread = more sampling error
What kind of test statistic does an ANOVA give, and what does it mean?
Test statistic is a ratio
• if the null is true: MSgroups / MSerror ≈ 1
• if the null is false: MSgroups / MSerror > 1
- this means there's more difference among groups than within the groups (so group designation does matter)
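
A minimal simulation sketch (my own toy numbers, assuming numpy and scipy) of why the ratio sits near 1 when the null is true: every group is drawn from the same population, so the among-group and within-group variance estimates agree on average.

    # hypothetical simulation: with one shared true mean, F = MSgroups/MSerror averages near 1
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    f_values = []
    for _ in range(2000):
        # three groups of 10 observations, all drawn from the SAME population (null is true)
        groups = [rng.normal(loc=50, scale=5, size=10) for _ in range(3)]
        f, p = stats.f_oneway(*groups)
        f_values.append(f)

    print(round(float(np.mean(f_values)), 2))   # typically close to 1
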
Sums of squares
quantify two sources of variation: among groups (SSgroups) and within groups (SSerror)
SSgroups equation
SSgroups = Σᵢ nᵢ(Yibar - Ybar)²
i = group
nᵢ = number of observations in group i
Yibar = mean of group i
Ybar = grand mean (mean of all observations)

Grand mean
the mean of all the scores across the groups (AKA pooled mean, combined mean); used as the baseline for estimating total variability

SSerror equation
SSerror = Σᵢ sᵢ²(nᵢ - 1), equivalent to Σᵢ Σⱼ (Yij - Yibar)²
i = group, j = observation within group i
nᵢ = number of observations in group i
sᵢ² = variance within group i (sd²)

SStotal
SSgroups + SSerror
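
A small numeric sketch (made-up data, assuming numpy) that just checks the partition: SSgroups + SSerror reproduces SStotal.

    # hypothetical data: verifying that SSgroups + SSerror = SStotal
    import numpy as np

    groups = [np.array([4.0, 5.0, 6.0]),
              np.array([7.0, 8.0, 9.0]),
              np.array([5.0, 6.0, 10.0])]

    all_obs = np.concatenate(groups)
    grand_mean = all_obs.mean()

    ss_groups = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)   # among groups
    ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)              # within groups
    ss_total = ((all_obs - grand_mean) ** 2).sum()

    print(ss_groups + ss_error, ss_total)   # the two totals match (up to rounding): 32.0
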
partitioning sums of squares (graphs for SStotal, SSgroups, SSerror)

MSgroups equation
MSgroups = SSgroups / dfgroups
k = number of groups
dfgroups = k - 1

MSerror equation
MSerror = SSerror / dferror
N = total number of observations
k = number of groups
dferror = N - k
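
Continuing with made-up numbers (the sums of squares from the sketch above: 3 groups, 9 observations in total), the mean squares and the F-ratio follow directly:

    # hypothetical numbers: mean squares and F-ratio from the sums of squares
    ss_groups, ss_error = 14.0, 18.0   # toy values for k = 3 groups, N = 9 observations
    k, N = 3, 9

    ms_groups = ss_groups / (k - 1)    # dfgroups = k - 1 = 2
    ms_error = ss_error / (N - k)      # dferror = N - k = 6
    f_ratio = ms_groups / ms_error

    print(ms_groups, ms_error, round(f_ratio, 2))   # 7.0, 3.0, 2.33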

ANOVA test statistic
F-ratio

F statistic (df)
• the F statistic has a pair of degrees of freedom
- numerator and denominator
- ex: F₂,₁₉ (2 = dfgroup, 19 = dferror)

How to calculate a P-value using the F-distribution
• stats table
• computer
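
A minimal sketch of the "computer" route, assuming scipy is available: scipy.stats.f.sf gives the upper-tail probability of the F-distribution, and scipy.stats.f_oneway runs the whole ANOVA from the raw data.

    # P-value for an F-ratio from the F-distribution (toy numbers)
    from scipy import stats

    f_ratio = 2.33
    df_groups, df_error = 2, 6          # e.g. 3 groups, 9 observations total
    p_value = stats.f.sf(f_ratio, df_groups, df_error)   # upper-tail area beyond F
    print(round(p_value, 3))            # ~0.18 here, so H0 would not be rejected

    # or let scipy compute F and P directly from the raw data
    f, p = stats.f_oneway([4, 5, 6], [7, 8, 9], [5, 6, 10])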

R²
the fraction of the total variation of scores around the grand mean that is accounted for by variation among the group means
• R² = SSgroups / SStotal
• R² measures the fraction of variation in Y that is explained by group differences
• ex: R² = 0.43, 43% of the total variation can be explained by the variation between group means
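
Tiny sketch with the same made-up sums of squares as above:

    # hypothetical numbers: R² as the fraction of total variation explained by groups
    ss_groups, ss_error = 14.0, 18.0
    r_squared = ss_groups / (ss_groups + ss_error)   # R² = SSgroups / SStotal
    print(round(r_squared, 2))   # 0.44 → 44% of the variation is among group means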

Assumptions of ANOVA
1. measurements in every group represent a random sample from the corresponding population
2. variable is normally distributed in each of the k populations
- robust to deviations, particularly when sample size is large
3. Variance is the same in all k-populations
- robust to departures if sample sizes are large and balanced, and there is no more than a ~10-fold difference in variance among groups
What happens if the ANOVA assumptions are not met?
1. test normality with Shapiro-Wilk and test equal variance with Levene's test
2. data transformations can make the data more normal and the variances more equal → ln() only
3. Nonparametric alternative
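
A minimal sketch of step 1, assuming scipy: shapiro tests each group for normality and levene tests whether the group variances are equal.

    # hypothetical data: checking ANOVA assumptions before trusting the F-test
    from scipy import stats

    group1, group2, group3 = [4, 5, 6, 5, 7], [7, 8, 9, 6, 8], [5, 6, 10, 7, 9]

    for g in (group1, group2, group3):
        w, p = stats.shapiro(g)          # H0: this group is normally distributed
        print("Shapiro-Wilk p =", round(p, 3))

    stat, p = stats.levene(group1, group2, group3)   # H0: the group variances are equal
    print("Levene p =", round(p, 3))
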
Nonparametric alternative to ANOVA
Kruskal-Wallis test
• similar principle to the Mann-Whitney U test
• looks at the distribution of ranks
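
Minimal sketch, assuming scipy; kruskal compares the distribution of ranks across the groups.

    # hypothetical data: nonparametric alternative when normality fails
    from scipy import stats

    group1, group2, group3 = [4, 5, 6, 5, 7], [7, 8, 9, 6, 8], [5, 6, 10, 7, 9]
    h, p = stats.kruskal(group1, group2, group3)   # H0: all groups have the same rank distribution
    print(round(h, 2), round(p, 3))
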
Planned comparison
a comparison between means planned during the design of the study, identified before the data are examined
• like a two-sample t-test (looks at differences in means)
• before you run the study you know what you're going to compare
• in circadian clock follow-up study, the planned comparison was difference in means between knee and control group

Unplanned comparisons
• comparisons are unplanned if you test for differences among all means
• problem of multiple tests (the increased probability of a Type I error must be accounted for)
• Tukey-Kramer method
Tukey-Kramer method
The probability of making at least one Type I error throughout the course of testing all pairs of means is no greater than the significance level α
• P-values are adjusted upward a little → lowers the number of Type I errors you get
• works like a series of two-sample t-tests, but with a higher critical value to limit the Type I error rate
- because multiple tests are done, the adjustment makes it harder to reject the null
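
One way to run it in practice (a sketch, assuming the statsmodels package; the data and group labels are made up):

    # hypothetical data: Tukey-Kramer (Tukey HSD) comparisons of all pairs of group means
    import numpy as np
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    values = np.array([4, 5, 6, 7, 8, 9, 5, 6, 10], dtype=float)
    labels = np.array(["A"] * 3 + ["B"] * 3 + ["C"] * 3)

    result = pairwise_tukeyhsd(values, labels, alpha=0.05)
    print(result)   # one row per pair of groups, with the adjusted decision for each difference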

Hypotheses of a Tukey-Kramer
H₀: The mean of group i equals the mean of group j for all pairs of means j > i
Ha: the mean of group i does not equal the mean of group j for at least one pair of means
Tukey-Kramer significant difference (letters on graph)
share a letter = no significant difference from each other

Kruskal-Wallis post-hoc test
• suppose that your data...
- fail normality even after transformation
- generate a significant Kruskal-Wallis result
• so the interpretation is that the distribution of ranks differs for at least one group, but which one?
• should NOT use Tukey-Kramer, which is a parametric test
• should use Dunn's test
Dunn's test
• post-hoc test for significant Kruskal-Wallis result
• will compare all possible pairs of groups while controlling for multiple tests
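
One practical option (a sketch, assuming the third-party scikit-posthocs package; data are made up):

    # hypothetical data: Dunn's test on all pairs after a significant Kruskal-Wallis result
    import scikit_posthocs as sp

    group1, group2, group3 = [4, 5, 6, 5, 7], [7, 8, 9, 6, 8], [5, 6, 10, 7, 9]

    # returns a matrix of pairwise P-values; p_adjust corrects for the multiple comparisons
    p_matrix = sp.posthoc_dunn([group1, group2, group3], p_adjust="bonferroni")
    print(p_matrix)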