one-sided hypotheses
H0: µ1 = µ2 (there is no difference between the two groups)
H1: µ1 > µ2 or µ1 < µ2 (difference exists in a specific direction based on expectation or interest)
why use a one-sided alternative hypothesis
More powerful for detecting a difference in a specific direction, but ignores potential differences in the opposite direction
how is alternative hypothesis for one-sided tests chosen
when the study question or prior evidence suggests a change in one specific direction
one-sided test assumptions
sample means follow the same direction as the alternative hypothesis (if H1: µ1 < µ2, then we expect ȳ1 < ȳ2)
critical value one-sided test
Significance level (α) is concentrated in one tail, which increases the test's ability to detect a difference in that direction
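To make the one-tail idea concrete, here is a minimal sketch comparing critical values. It assumes a standard normal (z) reference distribution and α = 0.05 purely for illustration; a t-based test would use the t distribution with the appropriate degrees of freedom instead.

```python
from statistics import NormalDist

alpha = 0.05               # significance level (assumed for illustration)
std_normal = NormalDist()  # standard normal: mean 0, sd 1

# One-sided (upper tail): all of alpha sits in one tail.
z_one_sided = std_normal.inv_cdf(1 - alpha)

# Two-sided: alpha is split evenly between the two tails.
z_two_sided = std_normal.inv_cdf(1 - alpha / 2)

print(f"one-sided critical z: {z_one_sided:.3f}")  # about 1.645
print(f"two-sided critical z: {z_two_sided:.3f}")  # about 1.960
```

The one-sided cutoff is smaller, so a test statistic in the expected direction reaches it more easily.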
categorical data
data recorded as categories and summarized by frequencies (how often each category occurs)
null hypothesis of goodness of fit test
The observed frequencies match the expected frequencies
alternative hypothesis of goodness of fit test
at least one of the expected proportions is incorrect
how is chi-square different from two-sample test
chi-square uses categorical variables and two-sample uses continuous numerical data
assumptions of chi-square test
data is random, smallest expected value is >= 5
sample size and chi square test
Larger samples increase the likelihood of flagging small, practically unimportant differences as statistically significant
Smaller samples may fail to meet the assumption that expected cell frequencies are at least 5, which leads to inaccurate p-values
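The sample-size effect can be sketched with made-up counts. The data below are hypothetical: the observed split is 52% / 48% against an expected 50% / 50% in both cases, and only the sample size changes. The cutoff 3.841 is the chi-square critical value for d.f. = 1 at α = 0.05.

```python
def chi_square_stat(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

CRITICAL_DF1_ALPHA_05 = 3.841  # chi-square critical value, d.f. = 1, alpha = 0.05

# Hypothetical data: same 52/48 split, n = 100 versus n = 10,000.
small_stat = chi_square_stat([52, 48], [50, 50])
large_stat = chi_square_stat([5200, 4800], [5000, 5000])

print(round(small_stat, 2), small_stat > CRITICAL_DF1_ALPHA_05)  # 0.16 False
print(round(large_stat, 2), large_stat > CRITICAL_DF1_ALPHA_05)  # 16.0 True
```

The same two-point departure from 50/50 is not significant at n = 100 but is at n = 10,000.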
X²* = 0
perfect fit between observed and expected frequencies; fail to reject H0 (FTR); in a test of independence, the observed counts are exactly what independence predicts
contingency table
comparison of two categorical factors, each with two or more levels
usage of contingency tables
comparing proportions and testing association/independence
comparing proportions hypotheses
H0: The population proportions are equal across groups (p1 = p2)
H1: The proportions differ (p1 ≠ p2, p1 > p2, or p1 < p2)
testing independence hypotheses
H0: The variables are independent (there is no relationship or association)
H1: The variables are not independent (there is a relationship or association)
assumptions of contingency tables
data is random, all expected values are >= 5
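As a rough sketch of how the expected-count assumption can be checked, the example below uses a hypothetical 2x2 table. It relies on the standard formulas expected = (row total × column total) / grand total and d.f. = (rows − 1)(columns − 1), which are not stated on the cards.

```python
# Hypothetical 2x2 table of observed counts: rows = groups, columns = outcomes.
observed = [
    [30, 20],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: (row total * column total) / grand total.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

# Degrees of freedom: (number of rows - 1) * (number of columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumption check: every expected count should be at least 5.
assumption_met = all(e >= 5 for row in expected for e in row)

print(expected)            # [[25.0, 25.0], [25.0, 25.0]]
print(df, assumption_met)  # 1 True
```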
parameter of X²*
degrees of freedom (d.f.); as d.f. increases, the distribution becomes more spread out and less concentrated near 0
proportion
shows how much something is in relation to total
parameter for proportions and what estimates it
p = part/whole, estimated by sample proportion
X²*
measures the total difference between the observed and expected frequencies across all categories, large value = reject null
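Written out, the statistic is usually defined as below. The symbols are notation assumed here rather than taken from the cards: O_i is the observed count in category i, E_i is the expected count, and k is the number of categories (or cells).

```latex
X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
```

Each term is non-negative, so X² equals 0 only when every observed count matches its expected count.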
one-sided alternative conditions for chi-square
only when d.f. = 1
correlation
measures the relationship between two continuous variables; describes its direction and strength
sample correlation coefficient
describes direction and strength, estimates population correlation coefficient
positive correlation coefficient
As one variable tends to increase/decrease, the other variable also tends to increase/decrease - same direction
negative correlation coefficient
As one variable tends to increase, the other tends to decrease and vice versa – opposite direction
0 correlation coefficient
no linear relationship
strength of relationship
close to +1 or -1: strong; close to 0: weak
correlation coefficient range
-1 to +1
null hypothesis correlation test
H0: ρ = 0 (there is no linear relationship between X and Y)
alternative hypothesis correlation test
H1: ρ ≠ 0 (there is a relationship between X and Y)
If one-sided: ρ > 0 (X and Y have a positive relationship) or ρ < 0 (X and Y have a negative relationship)
assumptions of correlation test
data are random, the relationship between X and Y is linear; for a one-sided test, r follows the direction of H1 (if H1: ρ < 0, then r < 0)
scatterplots
show correlations on a graph
does order of variables matter
No, the formula gives exactly the same value regardless of which variable is placed on the x-axis or the y-axis
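The symmetry can be checked directly. This is a minimal sketch using the usual formula for the sample correlation coefficient; the function name `pearson_r` and the data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Sample correlation coefficient r for paired lists x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired measurements.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_xy = pearson_r(x, y)  # X on the x-axis
r_yx = pearson_r(y, x)  # variables swapped

print(round(r_xy, 4), round(r_yx, 4))  # 0.7746 0.7746
```

Swapping the arguments only swaps the roles of the two sums of squares, so r is unchanged.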
Why is neither variable in correlation tests considered independent or dependent?
We are only measuring association; the variables may or may not affect one another
regression test
compares two continuous variables where one variable is used to predict the other
least squares line
relationship between two continuous variables represented using a straight line, provides a model to predict Y from X
center of least squares line
(xbar , ybar)
residuals
Distance between each observed value and its predicted value on the regression line, observed value - predicted value
b1
slope, estimates direction and rate of change
b0
y-intercept, the predicted value of Y when X = 0
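A minimal sketch of how b1 and b0 can be computed, using the standard least squares formulas b1 = Sxy / Sxx and b0 = ȳ − b1·x̄. The function name `least_squares_line` and the data are hypothetical.

```python
def least_squares_line(x, y):
    """Return (b0, b1): intercept and slope of the least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx           # slope: change in predicted Y per unit of X
    b0 = y_bar - b1 * x_bar  # intercept: predicted Y when X = 0
    return b0, b1

# Hypothetical data with x-bar = 3 and y-bar = 4.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares_line(x, y)
print(round(b0, 2), round(b1, 2))  # 2.2 0.6

# The line passes through (x-bar, y-bar): predicting at x = 3 gives 4.
print(round(b0 + b1 * 3, 2))       # 4.0
```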
SSr/sum of squared residuals
Small SSr: line fits well (overall error is low)
Large SSr: line fits poorly (overall error is high)
coefficient of determination
how much variation in the dependent variable is explained by the least squares line using the independent variable
high r² (close to 1): least squares line explains large portion of the variation in Y
low r² (close to 0): most of the variation in Y is unexplained
range of R²
0 to 1
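The link between residuals, SSr, and r² can be sketched as follows. The data are hypothetical, and b0 = 2.2, b1 = 0.6 are the least squares estimates for these five points. The formula r² = 1 − SSr / SStotal is the standard definition, not one given on the cards.

```python
# Hypothetical data and its least squares line y-hat = 2.2 + 0.6 * x.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

predicted = [b0 + b1 * xi for xi in x]

# Residual = observed value - predicted value.
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# SSr: sum of squared residuals (variation the line leaves unexplained).
ss_resid = sum(e ** 2 for e in residuals)

# Total variation of Y around its mean.
y_bar = sum(y) / len(y)
ss_total = sum((yi - y_bar) ** 2 for yi in y)

# Coefficient of determination: share of variation in Y explained by the line.
r_squared = 1 - ss_resid / ss_total

print(round(ss_resid, 2), round(ss_total, 2), round(r_squared, 2))  # 2.4 6.0 0.6
```

Here the line explains 60% of the variation in Y; a smaller SSr would push r² closer to 1.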
significance of slope tested
helps determine whether the observed relationship is statistically meaningful rather than due to random chance
null hypothesis regression test
no relationship between X and Y, slope is equal to zero (β1 = 0)
alternative hypothesis
there is a relationship between X and Y, slope is not equal to 0 (β1 ≠ 0)
assumptions of regression test
data must be random, the relationship between X and Y is linear, residuals are normally distributed, residuals are independent, variance of residuals is constant; if one-sided, b1 follows the direction of H1 (if H1: β1 < 0, then b1 < 0)
residual plots
show predicted values/residuals, X values on x-axis, residuals on y-axis, reference line at y = 0
normal residual plot
balanced below and above 0 line, no trend, if Q-Q plot is made, points would align closely
curved residual plots
U shape or inverted U, shows the relationship is not linear
funnel shaped residual plot
residuals widen or narrow in a funnel-like pattern (left to right), violates the constant variance assumption
when to use one-sample test
compare data to pre-set value
when to use two-sample test
compare two independent groups of data
when to use correlation test
to measure the association between two continuous variables
when to use regression test
to model/predict outcomes
when to use Welch's t-test
independent groups, unequal variances, unequal sample sizes
when to use classic t-test
independent groups, equal variances, equal sample sizes
when to use paired t-test
dependent data
when to use Mann-Whitney U test
when there are extreme outliers
when to use chi-squared test
when testing proportions or frequencies of categorical data