TUNA TEA
1. think up a research question
2. state the null hypothesis
3. state the alternative hypothesis
4. test
- test name
- test statistic formula
- test statistic value
5. evaluate extremeness
- null distribution
- p-value or critical value
- decision
- state conclusion in English
6. assess the next step
hypothesis testing
asks how unusual it is to get the data we got when the null hypothesis is true
null hypothesis (H0)
a specific statement about a population parameter made for the purposes of argument, usually the simplest statement
alternative hypothesis
represents all other possible parameter values except that stated in the null hypothesis, usually the statement of greatest interest
null distribution
the probability distribution of possible values of the test statistic when the null hypothesis is true; state it as chi-squared with df = _ OR binomial with N = __ and p = __
P-value
the probability of getting the given result, or something as extreme or more extreme, when the null hypothesis is true
significance level (alpha)
a probability used as a criterion for rejecting the null hypothesis; if the P-value for a test is less than or equal to alpha, then we reject the null hypothesis
test of a single proportion
TU: does a population proportion equal a specific number?
H0: p = #
HA: p != # (or p > # or p < #)
T: test of a single proportion, find the number of successes, X
E: binomial distribution is null distribution
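As a rough illustration of the T and E steps, the sketch below computes a two-sided P-value from the binomial null distribution in Python (standard library only). The function name and the example numbers (X = 14 successes in N = 18 trials, H0: p = 0.5) are made up for illustration. Doubling the smaller tail is one common convention among several, so other software may report a slightly different two-sided value.

```python
from math import comb

def binomial_two_sided_p(x, n, p0):
    """Two-sided P-value for H0: p = p0, given x successes in n trials."""
    # Null distribution: binomial probabilities for 0..n successes.
    pmf = [comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(n + 1)]
    lower = sum(pmf[: x + 1])  # P(X <= x) under H0
    upper = sum(pmf[x:])       # P(X >= x) under H0
    # Assumption: two-sided P-value = 2 * smaller tail, capped at 1.
    return min(1.0, 2 * min(lower, upper))

p_value = binomial_two_sided_p(14, 18, 0.5)
print(round(p_value, 3))  # about 0.031, so H0 is rejected at alpha = 0.05
```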
type I error
rejecting a true null hypothesis (false positive); the probability of a type I error is alpha
type II error
not rejecting a false null hypothesis, false negative
power
the ability of a test to reject a false null hypothesis; the probability of a correct decision when the null hypothesis is false. A larger sample gives more power to reject a false null hypothesis
p
the population parameter for the true proportion of individuals in the "success" category
p hat
sample proportion, X/N = # subjects with attribute / total subjects
binomial distribution
describes the probability of a given number of "successes" from a fixed number of independent trials each with the same probability of "success" -- the null distribution for a test of a single proportion
binomial distribution assumptions
-the number of trials (N) is fixed
-individual trials are independent
-the probability of success (p) is the same for every trial
binomial probability formula
$P(x) = \binom{n}{x}\, p^{x} (1-p)^{n-x}$
x = number of "successes"
n = number of trials
p = probability of success
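The formula on this card can be checked numerically with a few lines of Python. This is only a sketch. The function name and the example values (3 successes in 5 trials with p = 0.5) are illustrative, not from the card.

```python
from math import comb

def binomial_probability(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    # comb(n, x) is "n choose x", the number of ways to place x successes.
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(binomial_probability(3, 5, 0.5))  # 10 * 0.5**5 = 0.3125
```

The probabilities for x = 0, 1, ..., n add up to 1, which is a quick sanity check on any hand calculation.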
confidence interval
approximate 95% CI: sample estimate +/- 2 * SE of the sample estimate
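The 2 SE rule is simple enough to sketch directly. The helper name and the numbers below (an estimate of 10.0 with a standard error of 1.5) are hypothetical and only show the arithmetic.

```python
def two_se_interval(estimate, standard_error):
    """Rough confidence interval: estimate plus or minus 2 standard errors."""
    margin = 2 * standard_error
    return estimate - margin, estimate + margin

print(two_se_interval(10.0, 1.5))  # (7.0, 13.0)
```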
Agresti-Coull method
4th formula on the sheet
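The formula sheet itself is not reproduced on this card. A commonly taught form of the Agresti-Coull 95% interval for a proportion adds 2 successes and 2 failures before computing the interval. Whether this matches the sheet's fourth formula exactly is an assumption. The function name and the example (14 successes in 18 trials) are illustrative.

```python
from math import sqrt

def agresti_coull_95(x, n):
    """Approximate 95% CI for a proportion (add 2 successes and 2 failures)."""
    p_adj = (x + 2) / (n + 4)                     # adjusted proportion
    half_width = 1.96 * sqrt(p_adj * (1 - p_adj) / (n + 4))
    return p_adj - half_width, p_adj + half_width

low, high = agresti_coull_95(14, 18)
print(round(low, 3), round(high, 3))  # roughly 0.541 and 0.913
```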
duality of confidence intervals and hypothesis testing
reject null, HA favored = CI does not include hypothesized value
do not reject null, H0 favored = CI includes hypothesized value
3 types of statistical inference
1. interval estimate (confidence interval)
2. point estimate (phat = X/N)
3. hypothesis test (H0: p=#, HA: p!=#)
test of association
are two categorical variables independent or not? shown on mosaic plot
independent ("no association")
two variables are independent if the probability of a particular outcome of one variable is about the same for both levels of the other variable
relative risk
ratio of probabilities for the event of two groups
pr(worse outcome in group 1)/pr(worse outcome in group 2)
The probability of a (focus event) is (RR) times greater for the (numerator group) vs. the (denominator group).
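A minimal sketch of the relative risk calculation, using made-up counts (30 of 100 with the worse outcome in group 1, 10 of 100 in group 2). The function and argument names are hypothetical.

```python
def relative_risk(events_1, total_1, events_2, total_2):
    """Pr(event in group 1) divided by Pr(event in group 2)."""
    risk_1 = events_1 / total_1  # numerator group
    risk_2 = events_2 / total_2  # denominator group
    return risk_1 / risk_2

rr = relative_risk(30, 100, 10, 100)
print(round(rr, 2))  # 0.30 / 0.10 = 3.0
```

With these numbers the sentence template reads: the probability of the event is 3 times greater for group 1 vs. group 2.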
odds ratio
ratio of odds of the event for two groups
OR = O1/O2 = odds(event in group 1) / odds(event in group 2) = AD/BC
The odds of (event) are (OR) times greater for the (numerator group) vs. the (denominator group).
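The AD/BC shortcut can be sketched as follows. The card does not define the cell labels, so the layout here is an assumption: a = event in group 1, b = event in group 2, c = no event in group 1, d = no event in group 2. The counts are made up.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    Assumed layout: a = event, group 1    b = event, group 2
                    c = no event, group 1 d = no event, group 2
    """
    odds_1 = a / c  # odds of the event in group 1
    odds_2 = b / d  # odds of the event in group 2
    return odds_1 / odds_2  # algebraically equal to (a * d) / (b * c)

print(round(odds_ratio(30, 10, 70, 90), 2))  # (30 * 90) / (10 * 70), about 3.86
```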
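test statistic (X²)
$X^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$, summed over all categories; for a test of a single proportion the test statistic is just X, the number of successes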
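The chi-squared sum is easy to compute by hand for small tables, and a short Python sketch can serve as a check. The function name and the counts are illustrative.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Three categories with an expected count of 20 each (made-up numbers).
print(chi_squared_statistic([18, 22, 20], [20, 20, 20]))  # (4 + 4 + 0) / 20 = 0.4
```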
contingency analysis
estimates and tests for an association between two or more categorical variables
residuals
observed - expected
sample size requirements for Chi-squared test
- all cells must have expected counts greater than or equal to 1
- at least 80% of cells must have expected counts greater than or equal to 5
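These requirements apply to expected counts, not observed counts. The sketch below assumes the usual independence formula for a contingency table (row total * column total / grand total), which is not stated on the card, and then checks both rules. Function names and the example tables are hypothetical.

```python
def expected_counts(table):
    """Expected counts under independence for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def meets_chi_squared_requirements(expected):
    """True if every expected count is >= 1 and at least 80% are >= 5."""
    cells = [e for row in expected for e in row]
    all_at_least_1 = all(e >= 1 for e in cells)
    share_at_least_5 = sum(e >= 5 for e in cells) / len(cells)
    return all_at_least_1 and share_at_least_5 >= 0.8

expected = expected_counts([[30, 70], [10, 90]])
print(expected)                                  # [[20.0, 80.0], [20.0, 80.0]]
print(meets_chi_squared_requirements(expected))  # True
```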
options to evaluate extremeness
compare P-value to alpha OR compare test statistic value to critical value.
P-value <= alpha, reject null
X² > critical value, reject null
finding P-value in R
1 - pchisq(test_stat, df), or equivalently pchisq(test_stat, df, lower.tail = FALSE)
degrees of freedom for test of association
df = (columns-1)(rows-1)
critical value
the value of the test statistic that cuts off an upper tail of area alpha in the null distribution; when the null hypothesis is true, the probability of a test statistic this large or larger is alpha
finding critical value in R
qchisq(1-alpha, df)
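For readers without R at hand, the two R calls on these cards can be approximated with Python's standard library. This is a sketch, not a replacement for `pchisq` or `qchisq`. The function names are made up, df is assumed to be a positive integer, and the critical value is found by simple bisection rather than by an exact inverse.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-squared with df degrees of freedom >= x), like 1 - pchisq(x, df)."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)               # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))    # closed form for df = 1
    # Step up two degrees of freedom at a time using the gamma recurrence.
    while a < df / 2.0:
        q += y**a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def chi2_critical_value(alpha, df):
    """Value with upper-tail area alpha, like qchisq(1 - alpha, df)."""
    lo, hi = 0.0, 1.0
    while chi2_upper_tail(hi, df) > alpha:     # widen until the tail is small enough
        hi *= 2.0
    for _ in range(100):                       # bisection on the tail area
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

print(round(chi2_upper_tail(3.841, 1), 3))     # about 0.05
print(round(chi2_critical_value(0.05, 2), 3))  # about 5.991
```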
Agresti-Caffo interval
95% CI for true difference of two proportions (p1-p2)
big difference = large association
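The card does not give the formula. The usual statement of the Agresti-Caffo interval adds one success and one failure to each group before computing the difference and its standard error, and the sketch below assumes that form. The function name and the counts (30 of 100 vs. 10 of 100) are illustrative.

```python
from math import sqrt

def agresti_caffo_95(x1, n1, x2, n2):
    """Approximate 95% CI for p1 - p2 (add 1 success and 1 failure per group)."""
    p1 = (x1 + 1) / (n1 + 2)
    p2 = (x2 + 1) / (n2 + 2)
    se = sqrt(p1 * (1 - p1) / (n1 + 2) + p2 * (1 - p2) / (n2 + 2))
    diff = p1 - p2
    return diff - 1.96 * se, diff + 1.96 * se

low, high = agresti_caffo_95(30, 100, 10, 100)
print(round(low, 3), round(high, 3))  # roughly 0.088 and 0.304
```

Here the interval excludes 0, which by the duality of confidence intervals and hypothesis tests favors an association between group and outcome.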
uniform distribution
probability model in which the probability of each outcome category is the same, p1=p2=p3=…
proportional distribution
probability model in which the expected proportion in each category comes from a larger reference entity; the data should match the number of opportunities for each category
goodness-of-fit test question
do the collected categorical data fit a hypothesized probability distribution? compares observed counts to expected counts according to a specified probability distribution
H0: the data have a ___ distribution (uniform, proportional, or Poisson)
discrete distribution
probability distribution describing a discrete random variable
degrees of freedom for goodness-of-fit
df = # of categories - 1 (uniform, proportional)
df = # of categories - 2 (Poisson)
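For the uniform case, the expected counts and degrees of freedom follow directly from the observed counts. A minimal sketch with made-up counts, where the function name is hypothetical:

```python
def uniform_goodness_of_fit(observed):
    """Chi-squared statistic and df for H0: all categories equally likely."""
    k = len(observed)
    expected = sum(observed) / k  # same expected count in every category
    x2 = sum((o - expected) ** 2 / expected for o in observed)
    return x2, k - 1              # df = number of categories - 1

print(uniform_goodness_of_fit([10, 20, 30]))  # (10.0, 2)
```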
Poisson distribution
describes the probability that a certain number of events occur in a block of time or space, when those events happen independently of each other and occur with equal probability at every point in time or space. No upper bound
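The card describes the distribution in words. The standard Poisson probability formula, which is not written on the card, is $P(X) = e^{-\mu} \mu^{X} / X!$, where $\mu$ is the mean number of events per block. A small Python sketch with an illustrative mean of 2 events per block:

```python
from math import exp, factorial

def poisson_probability(x, mu):
    """Probability of exactly x events in a block when the mean count is mu."""
    return exp(-mu) * mu**x / factorial(x)

# Probabilities for 0..4 events when the mean is 2 per block (made-up value).
for count in range(5):
    print(count, round(poisson_probability(count, 2.0), 4))
```

Because there is no upper bound on the count, the probabilities only sum to 1 over all non-negative integers, although the terms shrink quickly.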
clumped events
lots of low and high counts, presence of one individual attracts others
dispersed events
lots of middle counts, presence of one individual repels others