Type I and Type II errors
A small p-value indicates either
Rejecting the null is a good call (the null hypothesis being true was a bad assumption)
But a small p-value could also occur by chance; rejecting the null in that case is an error
Type I error: Incorrectly reject H0 when, in fact, it is true
“False positive” result
We never know whether the null hypothesis is really true or false
α = P(Type I error). Our p-value has to be less than α for us to reject the null
When the null is true, the p-value falls below α with probability exactly α (e.g., 5% if α = 0.05)
If we select α = 0.05 and the null is actually true, there is a 5% chance that we make a mistake (reject the null)
We control the probability of committing a Type I error by carefully selecting α
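A quick way to see this property is to simulate many experiments in which the null really is true and check how often p falls below α. A minimal sketch (the normal data, the one-sample t-test, n = 30, and 10,000 repetitions are illustrative assumptions, not part of the notes above):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n, n_sims = 0.05, 30, 10_000

# Simulate data for which H0 is TRUE (the population mean really is 0)
rejections = 0
for _ in range(n_sims):
    sample = rng.normal(loc=0.0, scale=1.0, size=n)
    p_value = stats.ttest_1samp(sample, popmean=0.0).pvalue
    rejections += p_value < alpha

# Prints roughly 0.05: a true null is rejected about 5% of the time (Type I error rate = alpha)
print(rejections / n_sims)
```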
A large p-value indicates either
Failing to reject is a good call (null hypothesis being true was a good assumption)
But a large p-value could also occur by chance when the null is false; failing to reject in that case is an error
If we fail to reject H0, then either
Correct decision because the null hypothesis is true
Type II error (the null hypothesis is false, but we fail to reject it)
β represents the probability of a Type II error and is defined as β = P(Type II error) = P(fail to reject H0 even though H0 is false)
We cannot directly choose β to be small (e.g., 0.05) to control the probability of committing a Type II error, because β depends on several factors (see the sketch at the end of this section), including
The sample size n
The level of significance (α)
The research hypothesis
If we increase α, say from 0.05 to 0.1, we are making it easier to reject the null
That means, we are more likely to get Type I errors too
We are less likely to make Type II errors
Because power = 1 - β, we increase our power
To reduce both types of errors, increase the sample size
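A sketch of both points (β depends on n, α, and the effect size, and shrinks as n grows), using the normal-approximation formula for a two-sided one-sample z-test with known σ; the null value, true mean, and σ below are made-up numbers:

```python
import numpy as np
from scipy import stats

alpha = 0.05
mu0, mu1, sigma = 0.0, 0.5, 1.0           # hypothetical null value, true mean, known SD
z_crit = stats.norm.ppf(1 - alpha / 2)    # two-sided critical value

for n in (10, 30, 100):
    shift = (mu1 - mu0) * np.sqrt(n) / sigma   # standardized effect at sample size n
    power = stats.norm.cdf(-z_crit + shift) + stats.norm.cdf(-z_crit - shift)
    print(f"n = {n:3d}   beta = {1 - power:.3f}   power = {power:.3f}")
# beta shrinks (and power = 1 - beta grows) as n increases, for fixed alpha and effect size
```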
Power and tradeoff between α and β
(Statistical) power = 1 - β
Statistical Power = Probability that the test correctly rejects a false null hypothesis = P(reject H0 when H0 is false)
From now on, we must recognize that when we do not reject H0, we may very well be committing a Type II error
Type II errors happen very often with small sample sizes
For a given sample size, if α goes up, β goes down and vice versa
Tradeoff between α and β
The only way to reduce both types of errors is to conduct the experiment/collect data with a larger sample size
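The same normal-approximation sketch, now holding n and the assumed effect fixed and varying α, shows the tradeoff directly (all numbers are illustrative assumptions):

```python
import numpy as np
from scipy import stats

n, mu0, mu1, sigma = 30, 0.0, 0.4, 1.0    # fixed sample size and assumed effect
shift = (mu1 - mu0) * np.sqrt(n) / sigma

for alpha in (0.01, 0.05, 0.10):
    z_crit = stats.norm.ppf(1 - alpha / 2)
    power = stats.norm.cdf(-z_crit + shift) + stats.norm.cdf(-z_crit - shift)
    print(f"alpha = {alpha:.2f}   beta = {1 - power:.3f}")
# Raising alpha lowers beta (and vice versa) when the sample size is fixed
```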
Connection between CIs and hypothesis tests
Statistical hypothesis testing answers this question: Is there significant (statistical) evidence of a difference? Or, how likely is it that this difference is not due to chance?
What a conclusion that a result is statistically significant does NOT mean:
Does NOT mean the difference is large enough to be interesting
Does NOT mean the results are intriguing enough to be worthy of further investigation
Does NOT mean that the finding is scientifically or clinically significant
Confidence intervals can yield the size of an effect, i.e., can help think about clinical significance
A hypothesis test and a confidence interval can give equivalent results:
If a 95% CI does not contain the value of the null hypothesis, then the result must be statistically significant, with p < 0.05
If a 95% CI does contain the null hypothesis value, then the result must not be statistically significant, with p > 0.05
Applies to other confidence levels as well
If the 99% CI does not contain the null hypothesis value, then the p value must be less than 0.01
If the 90% CI does not contain the null hypothesis value, then the p value must be less than 0.1
An X% CI yields the same conclusion as a hypothesis test conducted with α = (100 − X)%
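A minimal sketch of this duality for a one-sample t-test (the sample values and the null value of 2.0 are hypothetical): the 95% CI excludes the null value exactly when p < 0.05.

```python
import numpy as np
from scipy import stats

x = np.array([2.3, 1.9, 2.8, 2.5, 3.1, 2.2, 2.7, 2.4])   # hypothetical sample
null_value = 2.0

t_stat, p_value = stats.ttest_1samp(x, popmean=null_value)

# 95% CI for the mean built from the t critical value
n, mean, se = len(x), x.mean(), stats.sem(x)
t_crit = stats.t.ppf(0.975, df=n - 1)
ci_low, ci_high = mean - t_crit * se, mean + t_crit * se

print(f"p = {p_value:.4f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
# p < 0.05 exactly when the CI excludes null_value; p > 0.05 exactly when it contains it
```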
Assumptions needed for:
One-sample test for mean
Random sample
Sample size condition
Independent samples
Continuous variable approximately normally distributed
One-sample test for proportion
Random sample
Sample size condition
Independent samples
Binary variable
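Hedged sketches of both tests under those assumptions (the sample values, counts, and null values are made up); the proportion test uses the normal approximation with the standard error computed under the null:

```python
import numpy as np
from scipy import stats

# One-sample test for a mean: H0: mu = 5.0 (continuous, approximately normal variable)
x = np.array([5.4, 4.8, 5.9, 5.1, 6.2, 5.6, 4.9, 5.7])
t_stat, p_mean = stats.ttest_1samp(x, popmean=5.0)

# One-sample test for a proportion: H0: p = 0.50 (binary variable, normal approximation)
successes, n = 34, 50
p_hat = successes / n
se0 = np.sqrt(0.5 * (1 - 0.5) / n)        # standard error under the null
z = (p_hat - 0.5) / se0
p_prop = 2 * stats.norm.sf(abs(z))

print(f"mean test p = {p_mean:.4f}, proportion test p = {p_prop:.4f}")
```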
Nonparametric Tests
Parametric tests involve specific probability distributions (e.g., the normal distribution and the t distribution) and the tests require estimation of parameters of the distribution (e.g., the mean) from the sample data
Nonparametric tests are based on fewer assumptions and do not assume a particular form of distribution (e.g., normal) for the population distribution
Appropriate when the approximate normality assumption of the parametric test fails
Comments
If data are approximately normally distributed, then parametric tests are more appropriate
We can still use non-parametric tests but parametric tests are generally more powerful
The techniques we describe here apply to outcomes that are
Ordinal
Ranked
Continuous and are not normally distributed
Ranking Data
Similar to how the median uses ranks (the order rather than the values themselves), nonparametric tests use ranks
Rank data
Perform analysis on ranks rather than on original data
Follow same process for hypothesis testing
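A sketch of what ranking the data looks like in practice (the five observations are made up); tied values share the average of their ranks, which is the convention most rank-based tests use:

```python
from scipy import stats

data = [12, 7, 7, 30, 15]            # hypothetical observations
ranks = stats.rankdata(data)         # tied values get the average of their ranks
print(ranks)                         # [3.  1.5 1.5 5.  4. ]
```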
Nonparametric analog of the one-sample test for a mean
Wilcoxon test for one sample
One sample of independent continuous/ordinal observations
H0: Location parameter of distribution is null value
H1: Location parameter of distribution is NOT null value
Some investigators interpret this test as comparing the median between the population of interest and some null value
Analogous to one sample t-test (or z-test) for means (H0: one population mean is some null value)
H0: location = somevalue
H1: location ≠ somevalue
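A minimal sketch of the one-sample Wilcoxon signed-rank test with scipy (the sample and the null location of 10 are hypothetical); the test is run on the differences from the null value:

```python
import numpy as np
from scipy import stats

x = np.array([8.2, 9.1, 12.5, 7.8, 11.3, 10.4, 9.6, 13.0])   # hypothetical sample
null_value = 10.0

# H0: the location (often read as the median) equals 10; H1: it does not
stat, p_value = stats.wilcoxon(x - null_value)
print(f"W = {stat}, p = {p_value:.4f}")
```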
Parametric tests vs non-parametric tests: Tradeoffs
The cost of fewer assumptions is that nonparametric tests are generally less powerful than their parametric counterparts
Power = 1 - β. Probability that the test correctly rejects a false null hypothesis
When the null is false, nonparametric tests may be less likely to reject H0 than parametric tests in situations where parametric tests could be used
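A simulation sketch of this power gap when the data really are normal (the sample size, effect size, α, and number of repetitions are all illustrative assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, n, n_sims, true_mean = 0.05, 20, 2000, 0.5

t_rejects = w_rejects = 0
for _ in range(n_sims):
    # H0: location = 0 is FALSE here; the true mean is 0.5
    x = rng.normal(loc=true_mean, scale=1.0, size=n)
    t_rejects += stats.ttest_1samp(x, popmean=0.0).pvalue < alpha
    w_rejects += stats.wilcoxon(x).pvalue < alpha

print(f"t-test power   ~ {t_rejects / n_sims:.3f}")
print(f"Wilcoxon power ~ {w_rejects / n_sims:.3f}")
# With normal data, the t-test typically rejects the false null somewhat more often
```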