ANOVA
analysis of variance; compares means between three or more groups
F value (ANOVA)
Mean Square of groups divided by Mean Square of error: F = MS_groups / MS_error
Mean Squares
sum of squares divided by degrees of freedom
Degrees of freedom in ANOVA table
group = g - 1
error = n - g
total = n - 1
(g = number of groups, n = total sample size)
ANOVA assumptions and how to check
1. random samples: look at the study design
2. normal distribution of measurements within groups: histogram or QQ plot
3. variability in the groups is about the same (10%): Levene's test (see the sketch below)
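A minimal sketch of the Levene's test check in R, using R's built-in PlantGrowth data (30 plant weights in 3 groups); `leveneTest()` comes from the car package:

```r
library(car)  # provides leveneTest()
# H0: the group variances are equal; a large p-value supports assumption 3
leveneTest(weight ~ group, data = PlantGrowth)
```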
Steps for an ANOVA test
1) Set-up and Assumptions
2) Complete ANOVA table
3) Statistical conclusion (reject or fail to reject the null)
4) Plain English conclusion
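A minimal sketch of steps 1 and 2 in R, again using the built-in PlantGrowth data; `aov()` fits the one-way ANOVA and `summary()` prints the completed table:

```r
fit <- aov(weight ~ group, data = PlantGrowth)  # one numerical, one grouping variable
summary(fit)  # ANOVA table: Df, Sum Sq, Mean Sq, F value, Pr(>F)
```

Compare Pr(>F) to alpha for step 3, then state the plain-English conclusion for step 4.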
Sum of squares between groups
the sum over groups of the group size times the squared difference between the group mean and the overall mean: SS_groups = Σ nᵢ (ȳᵢ - ȳ)², where ȳ is the grand mean
Sum of squares total
the sum of squared differences between each individual score and the grand mean of all scores
F distribution
The distribution that models the ratio of two variance estimates; used in ANOVA for obtaining the P-value for testing equality of three or more means
r code for p value (F)
1 - pf(test_stat, df1, df2)
r code for F critical value
qf(1 - alpha, df1, df2)
r code for p value (slope)
2 * (1 - pt(abs(test_stat), df))
r code for t critical value (slope)
qt(1 - alpha/2, df) (two-sided)
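A worked sketch of these four calls with made-up numbers (F = 4.85 on df1 = 2, df2 = 27; t = 2.10 on df = 27; alpha = 0.05):

```r
1 - pf(4.85, df1 = 2, df2 = 27)   # p-value for the F test (upper tail)
qf(1 - 0.05, df1 = 2, df2 = 27)   # F critical value (~3.35)
2 * (1 - pt(abs(2.10), df = 27))  # two-sided p-value for the slope
qt(1 - 0.05 / 2, df = 27)         # two-sided t critical value (~2.05)
```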
What to do if null is rejected (ANOVA)?
Figure out which means differ
- Bonferroni correction
- Tukey-Kramer
Bonferroni correction
suggests that a more stringent significance level is more appropriate for these tests: α* = α/K, where K is the number of comparisons being considered.
K = g(g - 1)/2
After computing the new α*, run the pairwise two-sample t-tests at that level (see the sketch below)
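A minimal sketch in R: `pairwise.t.test()` with `p.adjust.method = "bonferroni"` reports Bonferroni-adjusted p-values, which is equivalent to comparing raw p-values against α* = α/K:

```r
# built-in PlantGrowth data: g = 3 groups, so K = 3(3 - 1)/2 = 3 comparisons
pairwise.t.test(PlantGrowth$weight, PlantGrowth$group,
                p.adjust.method = "bonferroni")
```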
Planned comparison
A comparison of differences across levels of an independent variable when the researcher decides during the design of the study to make the comparison rather than waiting until after preliminary data analysis.
Use the t distribution after ANOVA
Unplanned comparisons
Comparison between means that is not directed by your hypothesis and is made after finding statistical significance with an overall statistical test (such as ANOVA).
Use Tukey-Kramer after ANOVA (see the sketch below)
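A minimal Tukey-Kramer sketch in R on the built-in PlantGrowth data:

```r
fit <- aov(weight ~ group, data = PlantGrowth)
TukeyHSD(fit)  # adjusted CIs and p-values for every pair of group means
# pairs whose interval excludes zero have significantly different means
```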
Standard error for planned comparisons
The square root of MS_error × (1/n1 + 1/n2), where MS_error is the error mean square from the ANOVA table
Planned comparison 95% confidence interval
Ȳ2 - Ȳ1 ± t0.05(2), N-k × SE (two-tailed α = 0.05; df = N - k, where N = total sample size and k = number of groups)
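A minimal sketch of this interval computed by hand in R, using the built-in PlantGrowth data and the (arbitrarily chosen) trt1 vs. trt2 comparison:

```r
fit    <- aov(weight ~ group, data = PlantGrowth)
mse    <- deviance(fit) / df.residual(fit)  # MS_error from the ANOVA fit
df_err <- df.residual(fit)                  # N - k error degrees of freedom
m <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)
n <- tapply(PlantGrowth$weight, PlantGrowth$group, length)
se <- sqrt(mse * (1 / n["trt1"] + 1 / n["trt2"]))            # SE of the comparison
(m["trt2"] - m["trt1"]) + c(-1, 1) * qt(0.975, df_err) * se  # 95% CI
```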
r
Correlation coefficient; describes the strength and direction of the linear relationship between two numerical variables
Correlation assumptions
1. randomly sampled
2. bivariate normal distribution
- linear relationship between X and Y
- cloud of points in scatterplot of X and Y has circular or elliptical shape
- frequency distributions of X and Y are normal
Common deviations from a bivariate normal distribution
funnel shape, outliers, nonlinear pattern (all visible in scatterplots)
Spearman's rank correlation
measures the strength and direction of the linear association between the ranks of two variables
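A minimal sketch in R contrasting the two correlations, using the built-in cars data (speed vs. stopping distance); `cor.test()` is base R:

```r
cor.test(cars$speed, cars$dist)                       # Pearson's r, CI, p-value
cor.test(cars$speed, cars$dist, method = "spearman")  # Spearman's rank correlation
# (the Spearman call warns about tied ranks; the estimate is still reported)
```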
residual plot
a scatterplot of the regression residuals against the explanatory variable
residuals
the difference between an observed value of the response variable and the value predicted by the regression line
Interpretation of slope
For every one-unit increase in x, the predicted y increases or decreases, on average, by the slope.
Interpretation of the intercept
when x = 0, the predicted y equals the intercept
Conditions for least-squares line and how to check
1. random sample: look at the study design
2. linearity between x and y: scatterplot shows an elliptical cloud; residual plot shows no curved pattern
3. normally distributed residuals: QQ plot or histogram of the residuals
4. constant variance: no funnel/hourglass shape in the scatterplot or residual plot (see the sketch below)
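A minimal sketch of checks 2 through 4 in R on the built-in cars data (dist ~ speed):

```r
fit <- lm(dist ~ speed, data = cars)
plot(fitted(fit), resid(fit))           # checks 2 and 4: no curve, no funnel
abline(h = 0, lty = 2)
qqnorm(resid(fit)); qqline(resid(fit))  # check 3: points near the line
summary(fit)$r.squared                  # R^2, covered in the next card
```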
Homoscedasticity
A regression in which the variances in y for the values of x are equal or close to equal
R^2
the proportion (percent) of the variation in the values of y that can be accounted for by the least squares regression line
Interpretation of R^2
Approximately (R² × 100)% of the variability in y can be explained by its linear relationship with x
confounding variable
a factor other than the independent variable that might produce an effect in an experiment
prospective study
identifies individuals and collects information as events unfold
retrospective study
collects data after events have taken place
Ways to eliminate bias in experiments
- Controls
- Random assignment to treatments
- Blinding
- Random sample
Ways to reduce sampling error
replication, balance, blocking
Randomization
a process of randomly assigning subjects to different treatment groups
Blinding
a technique where the subjects do not know whether they are receiving a treatment or a placebo
Replication
Carry out a study on multiple independent objects
Balance
Nearly equal sample sizes in each treatment
Blocking
grouping of experimental units that have similar properties; within each group, different experimental treatments are applied to different units
When to use test of single proportion?
one categorical variable with two categories
is the proportion equal to a hypothesized value?
When to use goodness of fit test?
one categorical variable with more than two categories
do the categories follow hypothesized (e.g., equal) proportions?
When to use CI for single mean
one numerical variable
estimates the population mean; does it differ from a hypothesized value?
When to use test of association?
Two categorical variables
Does one categorical variable influence the other?
When to use linear regression?
Two numerical variables
Does one numerical variable influence the other?
When to use Levene's Test?
Comparing variances
When to use paired t-test?
Comparing two means that aren't independent
When to use two-sample t-test
Comparing two means that are independent
When to use ANOVA?
Comparing three or more means
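A quick R reference tying each card above to a base-R call (car::leveneTest for variances). The names y, x, g, g1, g2, before, after, y1, y2, and dat are hypothetical placeholders for your own data:

```r
# all variables below are placeholders; substitute your own data
prop.test(x = 18, n = 50, p = 0.25)         # test of a single proportion
chisq.test(c(10, 20, 30), p = rep(1/3, 3))  # goodness of fit
t.test(y)                                   # CI for a single mean
chisq.test(table(g1, g2))                   # test of association
summary(lm(y ~ x))                          # linear regression
car::leveneTest(y ~ g, data = dat)          # comparing variances
t.test(before, after, paired = TRUE)        # paired t-test
t.test(y1, y2)                              # two-sample t-test (Welch by default)
summary(aov(y ~ g, data = dat))             # ANOVA: three or more means
```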