Type I and Type II errors
A small p-value indicates either
Rejecting the null is a good call (the null hypothesis being true was a bad assumption)
But a small p-value could also occur by chance; rejecting the null in that case is an error
Type I error: Incorrectly reject H0 when, in fact, it is true
“False positive” result
We never know whether the null hypothesis is really true or false
α = P(Type I error). Our p-value has to be less than α for us to reject the null
When the null is true, the p-value falls below α with probability exactly α (e.g., 5% if α = 0.05)
If we select α = 0.05 and the null is actually true, there is a 5% chance that we make a mistake (reject the null)
We control the probability of committing a Type I error by carefully selecting α
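A quick way to see this property is to simulate many experiments in which the null really is true and check how often p falls below α. A minimal sketch (the normal data, the one-sample t-test, n = 30, and 10,000 repetitions are illustrative assumptions, not part of the notes above):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n, n_sims = 0.05, 30, 10_000

# Simulate data for which H0 is TRUE (the population mean really is 0)
rejections = 0
for _ in range(n_sims):
    sample = rng.normal(loc=0.0, scale=1.0, size=n)
    p_value = stats.ttest_1samp(sample, popmean=0.0).pvalue
    rejections += p_value < alpha

# Prints roughly 0.05: a true null is rejected about 5% of the time (Type I error rate = alpha)
print(rejections / n_sims)
```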
A large p-value indicates either
Failing to reject is a good call (null hypothesis being true was a good assumption)
But a large p-value could also occur by chance when the null is false; failing to reject in that case is an error
If we fail to reject H0, then either
Correct decision because the null hypothesis is true
Type II error (the null hypothesis is false, but we fail to reject it)
β represents the probability of a Type II error and is defined as β = P(Type II error) = P(fail to reject H0 even though H0 is false)
We cannot directly choose β to be small (e.g., 0.05) to control the probability of committing a Type II error, because β depends on several factors (see the sketch at the end of this section), including
The sample size n
The level of significance (α)
The research hypothesis
If we increase α, say from 0.05 to 0.1, we are making it easier to reject the null
That means, we are more likely to get Type I errors too
We are less likely to make Type II errors
Because power = 1 - β, we increase our power
To reduce both types of errors, increase the sample size
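A sketch of both points (β depends on n, α, and the effect size, and shrinks as n grows), using the normal-approximation formula for a two-sided one-sample z-test with known σ; the null value, true mean, and σ below are made-up numbers:

```python
import numpy as np
from scipy import stats

alpha = 0.05
mu0, mu1, sigma = 0.0, 0.5, 1.0           # hypothetical null value, true mean, known SD
z_crit = stats.norm.ppf(1 - alpha / 2)    # two-sided critical value

for n in (10, 30, 100):
    shift = (mu1 - mu0) * np.sqrt(n) / sigma   # standardized effect at sample size n
    power = stats.norm.cdf(-z_crit + shift) + stats.norm.cdf(-z_crit - shift)
    print(f"n = {n:3d}   beta = {1 - power:.3f}   power = {power:.3f}")
# beta shrinks (and power = 1 - beta grows) as n increases, for fixed alpha and effect size
```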
Power and tradeoff between α and β
(Statistical) power = 1 - β
Statistical Power = Probability that the test correctly rejects a false null hypothesis = P(reject H0 when H0 is false)
From now on, we must recognize that when we do not reject H0, we may very well be committing a Type II error
Type II errors happen very often with small sample sizes
For a given sample size, if α goes up, β goes down and vice versa
Tradeoff between α and β
The only way to reduce both types of errors is to conduct the experiment/collect data with a larger sample size
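The same normal-approximation sketch, now holding n and the assumed effect fixed and varying α, shows the tradeoff directly (all numbers are illustrative assumptions):

```python
import numpy as np
from scipy import stats

n, mu0, mu1, sigma = 30, 0.0, 0.4, 1.0    # fixed sample size and assumed effect
shift = (mu1 - mu0) * np.sqrt(n) / sigma

for alpha in (0.01, 0.05, 0.10):
    z_crit = stats.norm.ppf(1 - alpha / 2)
    power = stats.norm.cdf(-z_crit + shift) + stats.norm.cdf(-z_crit - shift)
    print(f"alpha = {alpha:.2f}   beta = {1 - power:.3f}")
# Raising alpha lowers beta (and vice versa) when the sample size is fixed
```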
Connection between CIs and hypothesis tests
Statistical hypothesis testing answers this question: Is there significant (statistical) evidence of a difference? Or, how likely is it that this difference is not due to chance?
What a conclusion that a result is statistically significant does NOT mean:
Does NOT mean the difference is large enough to be interesting
Does NOT mean the results are intriguing enough to be worthy of further investigation
Does NOT mean that the finding is scientifically or clinically significant
Confidence intervals can yield the size of an effect, i.e., can help think about clinical significance
A hypothesis test and a confidence interval can give equivalent results:
If a 95% CI does not contain the value of the null hypothesis, then the result must be statistically significant, with p < 0.05
If a 95% CI does contain the null hypothesis value, then the result must not be statistically significant, with p > 0.05
Applies to other confidence levels as well
If the 99% CI does not contain the null hypothesis value, then the p value must be less than 0.01
If the 90% CI does not contain the null hypothesis value, then the p value must be less than 0.1
An X% CI yields the same conclusion as a hypothesis test conducted with α = (100 − X)%
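A minimal sketch of this duality for a one-sample t-test (the sample values and the null value of 2.0 are hypothetical): the 95% CI excludes the null value exactly when p < 0.05.

```python
import numpy as np
from scipy import stats

x = np.array([2.3, 1.9, 2.8, 2.5, 3.1, 2.2, 2.7, 2.4])   # hypothetical sample
null_value = 2.0

t_stat, p_value = stats.ttest_1samp(x, popmean=null_value)

# 95% CI for the mean built from the t critical value
n, mean, se = len(x), x.mean(), stats.sem(x)
t_crit = stats.t.ppf(0.975, df=n - 1)
ci_low, ci_high = mean - t_crit * se, mean + t_crit * se

print(f"p = {p_value:.4f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
# p < 0.05 exactly when the CI excludes null_value; p > 0.05 exactly when it contains it
```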
Assumptions needed for:
One-sample test for mean
Random sample
Sample size condition
Independent samples
Continuous variable approximately normally distributed
One-sample test for proportion
Random sample
Sample size condition
Independent samples
Binary variable
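Hedged sketches of both tests under those assumptions (the sample values, counts, and null values are made up); the proportion test uses the normal approximation with the standard error computed under the null:

```python
import numpy as np
from scipy import stats

# One-sample test for a mean: H0: mu = 5.0 (continuous, approximately normal variable)
x = np.array([5.4, 4.8, 5.9, 5.1, 6.2, 5.6, 4.9, 5.7])
t_stat, p_mean = stats.ttest_1samp(x, popmean=5.0)

# One-sample test for a proportion: H0: p = 0.50 (binary variable, normal approximation)
successes, n = 34, 50
p_hat = successes / n
se0 = np.sqrt(0.5 * (1 - 0.5) / n)        # standard error under the null
z = (p_hat - 0.5) / se0
p_prop = 2 * stats.norm.sf(abs(z))

print(f"mean test p = {p_mean:.4f}, proportion test p = {p_prop:.4f}")
```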
Nonparametric Tests
Parametric tests involve specific probability distributions (e.g., the normal distribution and the t distribution) and the tests require estimation of parameters of the distribution (e.g., the mean) from the sample data
Nonparametric tests are based on fewer assumptions and do not assume a particular form of distribution (e.g., normal) for the population distribution
Appropriate when the approximate normality assumption of the parametric test fails
Comments
If data are approximately normally distributed, then parametric tests are more appropriate
We can still use non-parametric tests but parametric tests are generally more powerful
The techniques we describe here apply to outcomes that are
Ordinal
Ranked
Continuous and are not normally distributed
Ranking Data
Similar to how the median uses ranks (the order rather than the values themselves), nonparametric tests use ranks
Rank data
Perform analysis on ranks rather than on original data
Follow same process for hypothesis testing
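A sketch of what ranking the data looks like in practice (the five observations are made up); tied values share the average of their ranks, which is the convention most rank-based tests use:

```python
from scipy import stats

data = [12, 7, 7, 30, 15]            # hypothetical observations
ranks = stats.rankdata(data)         # tied values get the average of their ranks
print(ranks)                         # [3.  1.5 1.5 5.  4. ]
```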
Nonparametric analog of the one-sample test for a mean
Wilcoxon test for one sample
One sample of independent continuous/ordinal observations
H0: Location parameter of distribution is null value
H1: Location parameter of distribution is NOT null value
Some investigators interpret this test as comparing the median between the population of interest and some null value
Analogous to one sample t-test (or z-test) for means (H0: one population mean is some null value)
H0: location = somevalue
H1: location ≠ somevalue
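A minimal sketch of the one-sample Wilcoxon signed-rank test with scipy (the sample and the null location of 10 are hypothetical); the test is run on the differences from the null value:

```python
import numpy as np
from scipy import stats

x = np.array([8.2, 9.1, 12.5, 7.8, 11.3, 10.4, 9.6, 13.0])   # hypothetical sample
null_value = 10.0

# H0: the location (often read as the median) equals 10; H1: it does not
stat, p_value = stats.wilcoxon(x - null_value)
print(f"W = {stat}, p = {p_value:.4f}")
```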
Parametric tests vs non-parametric tests: Tradeoffs
The cost of fewer assumptions is that nonparametric tests are generally less powerful than their parametric counterparts
Power = 1 - β. Probability that the test correctly rejects a false null hypothesis
When the null is false, nonparametric tests may be less likely to reject H0 than parametric tests in situations where parametric tests could be used
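A simulation sketch of this power gap when the data really are normal (the sample size, effect size, α, and number of repetitions are all illustrative assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, n, n_sims, true_mean = 0.05, 20, 2000, 0.5

t_rejects = w_rejects = 0
for _ in range(n_sims):
    # H0: location = 0 is FALSE here; the true mean is 0.5
    x = rng.normal(loc=true_mean, scale=1.0, size=n)
    t_rejects += stats.ttest_1samp(x, popmean=0.0).pvalue < alpha
    w_rejects += stats.wilcoxon(x).pvalue < alpha

print(f"t-test power   ~ {t_rejects / n_sims:.3f}")
print(f"Wilcoxon power ~ {w_rejects / n_sims:.3f}")
# With normal data, the t-test typically rejects the false null somewhat more often
```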