Analysis of Variance
What does ANOVA stand for?
Analysis of Variance (ANOVA)
a test used to determine differences between research results from three or more unrelated samples or groups.
Analysis of Variance (ANOVA)
serves the same purpose as the t tests: it tests for differences in group means
Analysis of Variance (ANOVA)
more flexible in that it can handle any number of groups, unlike t tests, which are limited to two groups (independent samples) or two time points (dependent samples).
Systematic Variance
best understood as the variation arising from differences between the groups, i.e., the effect of the independent variable
Unsystematic Variance
variability within individuals and/or groups of individuals
Unsystematic Variance
essentially random; some individuals change in one direction, others in an opposite direction, and some do not change at all
Random Error
a chance difference between the observed and true values of something.
ANOVA
all about looking at the different sources of variability (i.e. the reasons that scores differ from one another) in a dataset.
Grouping Variable
the predictor or, in experimental terms, the independent variable; it is made up of k groups, with k being any whole number 2 or greater.
Outcome Variable
the variable on which people differ, and we try to explain or account for those differences based on group membership.
ANOVA
it requires two or more groups to work and is usually conducted with three or more.
Individual Group Means
the means of the groups in ANOVA, usually represented with subscripts
Grand Mean
the single mean representing the average of all participants across all groups, represented with M_G.
Individual Group Means and Overall Grand Mean
the values we use to calculate our sums of squares
Sums of Squares
used to calculate the sources of variability
Between-Group Variation
refers to the differences between the groups; for example in sampling, we are talking about the deviation between different samples drawn from different locations of a consignment.
Within-Group Variation
refers to variations caused by differences within individual groups (or levels). In other words, not all the values within each group (e.g. means) are the same.
Total Sums of Squares
An important feature of the sums of squares in ANOVA is that they all fit together. We could work through the algebra to demonstrate that if we added together the formulas for SSB and SSW, we would end up with the formula for this
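As a sketch of how the pieces fit together, using x_ij for the i-th score in group j, M_j for the group means, M_G for the grand mean, and n_j for the group sizes:

```latex
SS_B = \sum_{j=1}^{k} n_j (M_j - M_G)^2
SS_W = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - M_j)^2
SS_T = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - M_G)^2 = SS_B + SS_W
```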
Source
the first column of the ANOVA table; it indicates which of our sources of variability we are using: between groups (B), within groups (W), or total (T).
SS
the second column in the ANOVA table; it contains our values for the sum of squared deviations, also known as the sum of squares
Sum of Squared Deviations
other term for sum of squares
degrees of freedom
df meaning
different
There is a ____ df for each source of variability (between, within, and total) in the ANOVA table
N in df
refers to the overall sample size, not a specific group sample size
Mean Squared Deviation
MS stands for
Mean Square
another way to say variability; it is calculated by dividing the sum of squares by its corresponding degrees of freedom.
F
the last column in the ANOVA table; it is our test statistic for ANOVA
F statistic
compared to a critical value to see whether we can reject or fail to reject a null hypothesis.
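A minimal sketch of filling in the whole table by hand, using made-up scores for three hypothetical groups, with MS = SS/df and F = MS_B/MS_W; scipy.stats.f_oneway is used only as a cross-check:

```python
# Sketch: a one-way ANOVA table computed by hand (hypothetical data)
import numpy as np
from scipy import stats

groups = [np.array([4.0, 5.0, 6.0, 5.0]),
          np.array([7.0, 8.0, 6.0, 7.0]),
          np.array([9.0, 8.0, 10.0, 9.0])]

all_scores = np.concatenate(groups)
N, k = len(all_scores), len(groups)
grand_mean = all_scores.mean()

# Sums of squares: between (SSB), within (SSW), total (SST = SSB + SSW)
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ss_total = ((all_scores - grand_mean) ** 2).sum()

# Degrees of freedom and mean squares (MS = SS / df)
df_between, df_within = k - 1, N - k
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F is the ratio of the two mean squares
f_obt = ms_between / ms_within
print(f"F({df_between}, {df_within}) = {f_obt:.3f}")
print(stats.f_oneway(*groups))  # should match the hand calculation
```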
Type I Error
a false positive
Type I Error
the chance of committing this error is equal to our significance level, α.
Type I Error = significance level
This is true if we are only running a single analysis (such as a t test with only two groups) on a single data set
Type I Error
increases when we start running multiple analyses on the same dataset
Increased Type I Error rate
raises the probability that we are capitalizing on random chance and rejecting a null hypothesis when we should not.
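A quick worked example of the inflation: if we run m independent tests, each at significance level α, the familywise chance of at least one false positive is

```latex
P(\text{at least one Type I error}) = 1 - (1 - \alpha)^m
```

so three t tests at α = .05 give 1 − (.95)³ ≈ .14, nearly triple the error rate we intended.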
ANOVA
keeps our error rate at the α we set
Null Hypothesis
still the idea of “no difference” in our data. Because we have multiple group means, we simply list them out as equal to each other
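Written out for k groups, the null hypothesis is:

```latex
H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k
```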
At least one mean is different.
alternative hypothesis for ANOVA
none
mathematical statement of the alternative hypothesis in ANOVA
alternative hypothesis in ANOVA
there is no directional hypothesis
Between
numerator df in the ANOVA table
Within
denominator df in the ANOVA table
Effect Size
In ANOVA, it is the ratio of the between-groups sum of squares to the total sum of squares (η² = SS_B / SS_T)
eta-squared
The effect size η² is called _____
effect size
represents variance explained
.01
small effect size
.09
medium effect size
post hoc test
used after ANOVA to find which means are different
Reject H0
F_obt is larger than F_crit
Fail to reject H0
F_obt is smaller than F_crit
post hoc test
used only after we find a statistically significant result and need to determine where our differences truly came from.
after the event
translation of the Latin term “post hoc”
Bonferroni test
perhaps the simplest post hoc analysis; a series of t tests performed on each pair of groups.
Bonferroni correction
To avoid the inflation of Type I error rates, it divides our significance level α by the number of comparisons we are making so that when they are all run, they sum back up to our original Type I error rate.
Once we have our new significance level, we simply run independent samples t tests to look for differences between our pairs of groups.
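A minimal sketch of the procedure on the same hypothetical three-group data; with three groups there are three pairwise comparisons, so the adjusted level is .05 / 3:

```python
# Sketch: Bonferroni-corrected pairwise t tests (hypothetical data)
from itertools import combinations
import numpy as np
from scipy import stats

groups = [np.array([4.0, 5.0, 6.0, 5.0]),
          np.array([7.0, 8.0, 6.0, 7.0]),
          np.array([9.0, 8.0, 10.0, 9.0])]

alpha = 0.05
pairs = list(combinations(range(len(groups)), 2))
alpha_adjusted = alpha / len(pairs)  # .05 / 3 comparisons ≈ .0167

for i, j in pairs:
    t, p = stats.ttest_ind(groups[i], groups[j])
    verdict = "significant" if p < alpha_adjusted else "not significant"
    print(f"group {i} vs group {j}: t = {t:.2f}, p = {p:.4f} -> {verdict}")
```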
Tukey’s Honestly Significant Difference
a popular post hoc analysis that, like Bonferroni’s, makes adjustments based on the number of comparisons; however, it makes adjustments to the test statistic when running the comparisons of two groups.
Tukey’s Honestly Significant Difference
gives us an estimate of the difference between the groups and a confidence interval for the estimate.
Tukey’s Honestly Significant Difference
a confidence interval containing 0.00 means the groups are not significantly different
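A sketch using statsmodels' pairwise_tukeyhsd (assuming statsmodels is installed) on the same hypothetical data; the summary table shows each pair's estimated difference and confidence interval:

```python
# Sketch: Tukey's HSD via statsmodels (hypothetical data)
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

scores = np.array([4, 5, 6, 5, 7, 8, 6, 7, 9, 8, 10, 9], dtype=float)
labels = np.repeat(["group1", "group2", "group3"], 4)

# Each row reports the estimated mean difference and its confidence
# interval; an interval containing 0.00 means that pair does not differ.
print(pairwise_tukeyhsd(scores, labels, alpha=0.05).summary())
```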
Scheffe Test
adjusts the test statistic for how many comparisons are made, but it does so in a slightly different way
Scheffe Test
The result is a test that is “conservative,” which means that it is less likely to commit a Type I error, but this comes at the cost of less power to detect effects.
no difference
the post hoc test confidence interval contains zero
with difference
the post hoc test confidence interval does not contain zero
Factorial ANOVA
uses multiple grouping variables, not just one, to look for group mean differences.
Factorial ANOVA
there is no limit to the number of grouping variables, but it becomes very difficult to find and interpret significant results with many factors, so usually they are limited to two or three grouping variables with only a small number of groups in each.
Repeated Measures ANOVA
an extension of a related samples t test, but in this case we are measuring each person three or more times to look for a change.
Repeated Measures ANOVA
We can combine factorial and repeated measures ANOVAs into mixed designs to test very specific and valuable questions
Correlation
a statistical measure that expresses the extent to which two variables are linearly related
Correlation
they change together at a constant rate
Correlation
a common tool for describing simple relationships without making a statement about cause and effect
correlation coefficient
unit-free measure used to describe correlations
correlation coefficient
ranges from -1 to +1, denoted by r. Statistical significance is indicated with a p-value
Form, Direction, Magnitude
three characteristics of correlation
Form
the shape of the relationship in a scatter plot; a scatter plot is the only way to assess it
Linear Relationship
a statistical term used to describe a straight-line relationship between two variables.
Linear Relationship
the form that will always be assumed when calculating correlations.
Curvilinear Relationship
a type of relationship between two variables where as one variable increases, so does the other variable, but only up to a certain point, after which, as one variable continues to increase, the other decreases.
Curvilinear Relationship
A form in which a line through the middle of the points in a scatter plot will be curved rather than straight.
Curvilinear Relationship
This is important to keep in mind, because the math behind our calculations of correlation coefficients will only ever produce a straight line—we cannot create a curved line with the techniques used in correlations.
No Relationship
indicates that there is no relationship between the two variables.
No Relationship
This form shows no consistent relationship between the variables
Direction
tells whether the variables change in the same way at the same time or in opposite ways at the same time.
Positive Relationship
variables X and Y change in the same direction: as X goes up, Y goes up, and as X goes down, Y also goes down and the slope of the line moves from bottom left to top right.
Negative Relationship
variables X and Y change together in opposite directions: as X goes up, Y goes down, and vice versa, and the slope of the line moves from top left to bottom right.
No Relationship
represented by the number 0 as its correlation coefficient, and its line has no slope, which means that it is flat
Magnitude
the number being calculated as the correlation coefficient. It shows how strong or how consistent the relationship between the variables is.
greater magnitude
higher numbers mean ____
stronger relationship
higher numbers mean greater magnitudes, which means a ______
magnitude
when judging strength, the only thing that matters is the magnitude, or absolute value, of the correlation coefficient
very weak correlation
0-0.19
weak correlation
0.2-0.39
moderate correlation
0.4-0.59
strong correlation
0.6-0.79
very strong correlation
0.8-1.0
Pearson’s r
the most popular correlation coefficient for assessing linear relationships, which serves as both a descriptive statistic (like M) and a test statistic (like t).
Pearson’s r
It is descriptive because it describes what is happening in the scatter plot; r will have both a sign (+/−) for the direction and a number (0 to 1 in absolute value) for the magnitude
Pearson’s r
The coefficient r also works as a test statistic because the magnitude of r will correspond directly to a t value at the specific degrees of freedom, which can then be compared to a critical value.
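The standard conversion, with df = N − 2, is:

```latex
t = \frac{r\sqrt{N - 2}}{\sqrt{1 - r^2}}
```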
test statistic
the coefficient r also works as a ______
Covariance
a measure of the relationship between two random variables and the extent to which they change together
formula for r
the covariance of X and Y divided by the product of the standard deviations of X and Y
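A minimal sketch with made-up x and y values, computing r from the covariance and the standard deviations and cross-checking against numpy:

```python
# Sketch: Pearson's r = cov(X, Y) / (s_X * s_Y), hypothetical data
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

# Sample covariance and sample standard deviations (ddof=1)
cov_xy = np.cov(x, y, ddof=1)[0, 1]
r = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

print(f"r = {r:.3f}")
print(np.corrcoef(x, y)[0, 1])  # should match
```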
rho (ρ)
our population parameter for the correlation that we estimate with r, just like μ and M for means.
N-2
df for correlations
one-tailed test
used when we expect the relationship to go in only one direction (e.g., expecting only a positive relationship)