how do you decide to do a one sided test
the problem tells you the expected direction of the effect
e.g. a med meant to reduce blood pressure
use your own prior knowledge
what has more power 1 or 2 sided tests
1 sided
why can you not say that since ybar1 > ybar2 then H1: u1 > u2
you should decide on the hypotheses (including the direction of H1) FIRST, THEN look at the y bars
How does our alternative hypothesis change for a one sided test
it is now directional
greater/less than
What extra step do you need to do before calculating your test statistic in one sided tests?
Make sure your data agree with your alternative hypothesis!
How do you pick the table value for a one sided test
use the one-sided (one-tailed) alpha heading on the table instead of the two-sided one
What are your conclusions for a one sided test
same as for a 2 sided
t*>t table = reject
U* > table = reject
p < alpha = reject
Why does a one sided test have more power?
a one-sided test concentrates all of alpha in one direction, so the cutoff in that direction is smaller and a real effect in that direction is easier to detect
instead of splitting alpha between two tails, it is all in one tail
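A rough numerical sketch of the card above. It uses the standard normal distribution from Python's standard library as a large-sample stand-in for the t distribution (an assumption; a real t test would use the t table). The alpha level and the size of the true effect are made-up illustration values.

```python
from statistics import NormalDist

z = NormalDist()   # standard normal; assumed stand-in for t with large df
alpha = 0.05       # illustrative significance level

# Two-sided: alpha is split between both tails. One-sided: all in one tail.
two_sided_cutoff = z.inv_cdf(1 - alpha / 2)
one_sided_cutoff = z.inv_cdf(1 - alpha)

# Hypothetical true effect, in standard errors, in the direction of H1.
true_effect = 2.5

# Power = chance the test statistic lands in the rejection region.
power_one_sided = 1 - z.cdf(one_sided_cutoff - true_effect)
power_two_sided = (1 - z.cdf(two_sided_cutoff - true_effect)
                   + z.cdf(-two_sided_cutoff - true_effect))

print(f"cutoffs: one-sided {one_sided_cutoff:.3f}, two-sided {two_sided_cutoff:.3f}")
print(f"power:   one-sided {power_one_sided:.3f}, two-sided {power_two_sided:.3f}")
```

The one-sided cutoff is the smaller of the two, which is why the same effect is detected more often.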
What is categorical data?
data that fall into distinct categories (nominal categories have no natural order or ranking)
direct counts
colors
χ² goodness of fit test.
What is your null hypothesis?
Your alternative hypothesis?
Ho: Pr(x) = the stated %
H1: Pr(x) is not the stated %, i.e. at least one of the probabilities in Ho is wrong
How do you calculate χ²*?
sum ((obs-exp)²/exp)
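How do you calculate your expected values for a χ²?
goodness of fit: total * % given in Ho
contingency table: (row total * column total) / grand total
What are your degrees of freedom in χ²
n - 1, where n = number of categories (not the sample size)
What is your comparison for χ²
chi square table
What is your decision for χ²
χ²* > table = reject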
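The goodness of fit steps above can be sketched as follows. The observed counts and Ho proportions are hypothetical, invented only for illustration; the table value is the standard χ² critical value for df = 2 at alpha = 0.05.

```python
# Hypothetical counts and Ho proportions (illustration only).
observed = [50, 30, 20]
h0_props = [0.50, 0.25, 0.25]

total = sum(observed)
expected = [total * p for p in h0_props]   # total * % given in Ho

# chi-square* = sum((obs - exp)^2 / exp)
chi2_star = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1                     # number of categories - 1
assumption_ok = min(expected) >= 5         # smallest expected value >= 5

table_value = 5.991                        # chi-square table, df = 2, alpha = 0.05
decision = "reject Ho" if chi2_star > table_value else "fail to reject Ho"

print(expected, chi2_star, df, assumption_ok, decision)
```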
When can you use a one-sided alternative for χ²
when there are only 2 categories (df = 1)
What are the assumptions of a χ² goodness of fit test?
data is random
smallest expected value must be greater than or equal to 5
How does sample size help with the “smallest expected value must be greater than or equal to 5” assumption of the χ²
larger sample sizes make this assumption easier to achieve
you need to have a large enough sample that this assumption is not violated
Ho for contingency tables
Comparing proportions OR Determining independence
Ho: p1 = p2
or
Ho: 1 and 2 are independent
How do you calculate χ²* for contingency tables
sum ((obs-exp)²/exp)
degrees of freedom for χ²* for contingency tables
(rows - 1) * (columns - 1)
How do you calculate your expected values for χ²* for contingency tables
( row total * column total ) / grand total
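A minimal sketch of the contingency table calculation described above. The 2×2 counts are hypothetical; the table value is the standard χ² critical value for df = 1 at alpha = 0.05.

```python
# Hypothetical 2x2 table of observed counts (rows = groups, columns = outcomes).
table = [[30, 20],
         [10, 40]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

chi2_star = 0.0
min_expected = float("inf")
for i, row in enumerate(table):
    for j, obs in enumerate(row):
        # expected = (row total * column total) / grand total
        exp = row_totals[i] * col_totals[j] / grand_total
        min_expected = min(min_expected, exp)
        chi2_star += (obs - exp) ** 2 / exp

df = (len(table) - 1) * (len(table[0]) - 1)   # (rows - 1) * (columns - 1)
table_value = 3.841                           # chi-square table, df = 1, alpha = 0.05
decision = "reject Ho" if chi2_star > table_value else "fail to reject Ho"

print(round(chi2_star, 3), df, min_expected, decision)
```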
If you reject, know how to figure out which direction things are going in (particularly for tests of independence)
if you reject Ho: A and B are independent, it means that A and B are dependent (associated)
but does that mean A likes or dislikes B?
compare conditional probabilities to find this out
Pr(A | B present) = (count with A and B) / (total with B)
Pr(A | B absent) = (count with only A) / (total without B)
if Pr(A | B absent) is higher, then they "dislike" each other (A is less common where B is)
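One way to work out the direction, sketched with hypothetical presence/absence counts for two species A and B (the counts and variable names are invented for illustration). It compares how often A occurs where B is present against how often A occurs where B is absent.

```python
# Hypothetical counts of sites (illustration only).
both_present = 10   # A and B
only_a = 40         # A without B
only_b = 30         # B without A
neither = 20

# Conditional probabilities of A, given B present or absent.
pr_a_given_b = both_present / (both_present + only_b)
pr_a_given_not_b = only_a / (only_a + neither)

if pr_a_given_b > pr_a_given_not_b:
    direction = "A is more common where B is present (they 'like' each other)"
elif pr_a_given_b < pr_a_given_not_b:
    direction = "A is less common where B is present (they 'dislike' each other)"
else:
    direction = "no difference in the sample"

print(pr_a_given_b, round(pr_a_given_not_b, 3), direction)
```

This only describes the direction; whether the association is significant still comes from the χ² test itself.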
what is an “r x k” table
contingency table with r rows and k columns (can be larger than 2×2)
relative risk = RR hat
p hat 1 / p hat 2
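A small sketch of the relative risk calculation, using hypothetical group sizes and outcome counts (none of these numbers come from the original cards).

```python
# Hypothetical data: number with the outcome / group size.
outcome_1, n_1 = 30, 100   # group 1
outcome_2, n_2 = 15, 100   # group 2

p_hat_1 = outcome_1 / n_1
p_hat_2 = outcome_2 / n_2

# RR hat = p hat 1 / p hat 2
rr_hat = p_hat_1 / p_hat_2

print(p_hat_1, p_hat_2, rr_hat)
```

An RR hat above 1 means the outcome was more common in group 1 in the sample; an RR hat of 1 means the sample proportions were equal.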
for a contingency table what size table will always result in a degrees of freedom =1
2×2 table
χ² contingency test assumptions
random
smallest expected value is greater than or equal to 5
why do we use a correlation
see if there is a relationship between x and y
What do we mean by comparing a continuous variable with a continuous variable?
numerical data vs numerical data
height vs weight
continuous data
numerical variable that can take on any value within a given range
What is r, and what does it estimate?
r = Pearson's correlation coefficient
estimates ρ (rho, the true unknown correlation coefficient)
has a value of -1 ≤ r ≤ 1
what does a r= 1 mean
what does a r=0 mean
r equal to +1 or -1 means a perfect linear correlation (all points fall exactly on a line)
r equal to 0 means there is no linear correlation
What determines if r is positive? negative?
r is positive when there's a tendency for both variables to increase or decrease together
r is negative when one variable tends to increase as the other decreases.
What is the range for r?
-1 ≤ r ≤ 1
What is SScp?
what does it tell you
formula
its sign tells you if the data trend up (+) or down (-)
sum of cross products: measures how x and y vary together (like a covariance)
sum (xi - xbar)(yi - ybar)
If we're doing a correlation hypothesis test, what is our H0? our H1?
Ho: ρ = 0
H1: ρ ≠ 0 (or one-sided: ρ > 0 or ρ < 0)
ρ = true correlation coefficient estimated by r
What is our t * for correlation
r * sqrt ( (n-2) / (1 - r²) )
degrees of freedom in correlation test
n-2
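The correlation test can be sketched end to end as below. The five (x, y) pairs are hypothetical; r is computed as SScp / sqrt(SSx * SSy), and the table value is the standard two-sided t critical value for df = 3 at alpha = 0.05.

```python
from math import sqrt

# Hypothetical paired data (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

ss_x = sum((xi - x_bar) ** 2 for xi in x)
ss_y = sum((yi - y_bar) ** 2 for yi in y)
ss_cp = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

r = ss_cp / sqrt(ss_x * ss_y)                 # Pearson's r
t_star = r * sqrt((n - 2) / (1 - r ** 2))     # test statistic
df = n - 2

table_value = 3.182                           # t table, two-sided, df = 3, alpha = 0.05
decision = "reject Ho" if abs(t_star) > table_value else "fail to reject Ho"

print(round(r, 4), round(t_star, 4), df, decision)
```

With so few points, even a fairly large r is not significant here.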
Why does a significant correlation not imply that one variable causes another?
correlation simply indicates a relationship exists, but doesn't prove which variable, if either, is influencing the other
there can be hidden variables
correlation assumptions
random data
relationship is approx straight/linear
x/y are both approx normal
how do you check that relationship is approx straight/linear and x/y are both approx normal for the correlation test
scatter plot for linearity; histograms or QQ plots of x and y for normality
how is regression different from correlation?
Correlation measures the strength and direction of the relationship between two variables
regression analyzes how one variable affects another
Beta 1
what is it
formula
slope
SScp / SSx
Beta 0
what is it
formula
intercept
y bar - beta1(xbar)
What is a least squares line, and how do we fit one to our data?
the straight line that best represents a set of data points in a scatter plot
determined by minimizing the sum of the squared differences between the actual data points and the corresponding points on the line
What is your equation for the least squares line
y hat = beta0 + beta1 x
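A minimal sketch of fitting the least squares line from the slope and intercept formulas on the cards (slope = SScp / SSx, intercept = ybar - slope * xbar). The data and the helper name `predict` are hypothetical.

```python
# Hypothetical paired data (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

ss_x = sum((xi - x_bar) ** 2 for xi in x)
ss_cp = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = ss_cp / ss_x            # slope
b0 = y_bar - b1 * x_bar      # intercept


def predict(x_value):
    """Predicted y (y hat) from the fitted line."""
    return b0 + b1 * x_value


print(f"y hat = {b0:.2f} + {b1:.2f} x")
print(predict(3))            # prediction at x = x bar equals y bar
```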
error or residual
the difference between the observed value and the predicted value from a regression model.
distance from the line
account for variability not represented by linear relationship
yi = beta0 + beta1 xi + ei
What is the difference between Yhati and Yi ?
yhat i represents the predicted value of the dependent variable for the ith observation
yi represents the actual observed value of the dependent variable for the ith observation
What are we estimating with b0 and b1?
b0 estimates the true unknown population intercept (beta0)
b1 estimates the true unknown population slope (beta1)
How do we figure out if our slope is significant in a regression
hypothesis test
test statistic for regression
t* = b1 / SEb1, where SEb1 = S residuals / sqrt (SSx) and S residuals = sqrt (SS residuals / (n-2))
What is our H0? our H1? for regression
Ho: B1 = 0
H1: B1 is not 0 (or one-sided: B1 > 0 or B1 < 0)
What is SSresid?
sum (yi - yhat i)²
represents the unexplained variation in a dataset
What is SEb1?
sqrt (SS residuals / (n-2)) / sqrt (SSx)
standard error of the slope coefficient
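A sketch of the slope test, assuming the usual textbook form of the standard error: residual standard deviation sqrt(SSresid / (n - 2)) divided by sqrt(SSx). The data are hypothetical, and the table value is the standard two-sided t critical value for df = 3 at alpha = 0.05.

```python
from math import sqrt

# Hypothetical paired data (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
ss_x = sum((xi - x_bar) ** 2 for xi in x)
ss_cp = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = ss_cp / ss_x
b0 = y_bar - b1 * x_bar

# Residuals: observed y minus predicted y hat.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
ss_resid = sum(e ** 2 for e in residuals)

s_resid = sqrt(ss_resid / (n - 2))   # residual standard deviation
se_b1 = s_resid / sqrt(ss_x)         # standard error of the slope
t_star = b1 / se_b1                  # test statistic, df = n - 2

table_value = 3.182                  # t table, two-sided, df = 3, alpha = 0.05
decision = "reject Ho" if abs(t_star) > table_value else "fail to reject Ho"

print(round(ss_resid, 3), round(se_b1, 4), round(t_star, 4), decision)
```

For a simple regression this t* matches the t* from the correlation test on the same data.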
What does Y^ represent
the predicted value of y from the least squares line
regression assumptions
how do you check these assumptions
random data
relationship between x and y must be straight
look at residual plot
residuals at each level of x are normally distributed
QQ plot of residuals
variance of the residuals is constant
residual plot
each residual is independent of every other residual
residual plot
how do you make a residual plot
graph the residuals (y - yhat) on the vertical axis against x (or yhat) on the horizontal axis
what does it mean if your residuals show you have a curve?
violates linear assumption of regression
what should you see in a good residual plot
totally random data in a scatter plot
How do residual plots help you in assessing linearity and constant variance?
residual plots highlight problems in the data
easier to see curves / funnels
How do q-q plots help you in validation the regression assumptions
compares the quantiles of the residuals with the quantiles expected from a normal distribution
points falling close to a straight line support the normality assumption
What is R² ?
measure of how good x is at explaining y (the proportion of the variation in y explained by the line)
R² formula
R² = r², where r = Pearson's correlation coefficient
= { SScp / sqrt (SSx * SSy) } ²
or SS regression / SSy
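A quick check, on hypothetical data, that the two R² formulas above agree. SS regression is taken here as the sum of squared differences between each predicted value and ybar, which is an assumption about the notation.

```python
from math import sqrt

# Hypothetical paired data (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
ss_x = sum((xi - x_bar) ** 2 for xi in x)
ss_y = sum((yi - y_bar) ** 2 for yi in y)
ss_cp = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Formula 1: square of Pearson's r.
r = ss_cp / sqrt(ss_x * ss_y)
r_squared_from_r = r ** 2

# Formula 2: SS regression / SSy.
b1 = ss_cp / ss_x
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]
ss_regression = sum((yh - y_bar) ** 2 for yh in y_hat)
r_squared_from_ss = ss_regression / ss_y

print(round(r_squared_from_r, 4), round(r_squared_from_ss, 4))
```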
outline for doing regressions
calculate the least squares line
hypothesis
check assumptions
if everything is ok → do hypothesis test
is R² significant?
what do you do if your data do not agree with your alternative hypothesis
you automatically Fail to reject Ho
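The one-sided procedure can be sketched as a small decision function. The function name, the assumed direction (H1: u1 > u2), and the sample numbers are all hypothetical; the table value is the standard one-sided t critical value for df = 10 at alpha = 0.05.

```python
def one_sided_decision(ybar1, ybar2, t_star, t_table):
    """Decision for H1: u1 > u2, with the direction chosen before seeing the data.

    t_star is taken as a magnitude, as when comparing against a t table.
    """
    # Extra step: do the data agree with the direction of H1?
    if ybar1 <= ybar2:
        return "fail to reject Ho"
    # Only then compare the test statistic with the one-sided table value.
    return "reject Ho" if t_star > t_table else "fail to reject Ho"


t_table = 1.812  # t table, one-sided, df = 10, alpha = 0.05

print(one_sided_decision(12.0, 10.0, 2.30, t_table))  # data agree with H1
print(one_sided_decision(9.0, 10.0, 2.30, t_table))   # data disagree with H1
```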