Example: Find a 95% confidence interval for the difference in song lengths.
Conditions are met.
Calculate the interval.
Interpretation: We are 95% confident that the difference in mean song length between folk songs and classic pop and rock songs is between -8.07 and 27.57. Because this interval contains 0, the data are consistent with no difference in mean length.
Power
Framework: testing whether there is a difference between two population means, using the difference between the sample means.
Illustrative Example: Pharmaceutical Company Testing New Drug
Pharmaceutical company develops a new drug for lowering blood pressure.
They conduct a clinical trial.
Recruit people who are taking a standard blood pressure medication.
Treatment group receives the new drug.
Control group continues its current medication (given as generic-looking pills).
Researchers want to run the trial on patients with systolic blood pressures between 140 and 180 mmHg.
Previous studies suggest:
Standard deviation of patients’ blood pressures will be about 12 mmHg.
The distribution of patient blood pressures will be approximately symmetric.
If we had 100 patients per group, the approximate standard error would be: SE = \sqrt{\frac{12^2}{100} + \frac{12^2}{100}} = 1.70.
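The standard error arithmetic above can be checked with a short sketch. The helper name `two_sample_se` is made up for illustration; the inputs (standard deviation of 12 mmHg, 100 patients per group) come from the notes.

```python
import math

def two_sample_se(sd1, n1, sd2, n2):
    """Standard error of the difference between two independent sample means."""
    return math.sqrt(sd1**2 / n1 + sd2**2 / n2)

# Values from the notes: SD of 12 mmHg and 100 patients in each group.
se = two_sample_se(12, 100, 12, 100)
print(round(se, 2))  # prints 1.7
```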
Detecting a Difference
Determine values of \bar{x}_{treatment} - \bar{x}_{control} that would lead to rejecting the null hypothesis.
Assume \alpha = 0.05 (two-sided test).
Reject if the difference is in the lower 2.5% or upper 2.5%.
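Assuming the sampling distribution of the difference is approximately Normal and centered at 0 under the null hypothesis, any difference below -1.96 \times 1.70 = -3.332 or above 1.96 \times 1.70 = 3.332 would be in the rejection region.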
Suppose the new drug reduces blood pressure by 3 mmHg relative to the standard medication.
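As a rough check of the rejection cutoffs, the sketch below recomputes them with Python's standard-library `statistics.NormalDist`. It uses the rounded standard error of 1.70 from the notes; the variable names are illustrative only.

```python
from statistics import NormalDist

alpha = 0.05   # two-sided significance level from the notes
se = 1.70      # rounded standard error from the notes

# Critical value that leaves alpha/2 in each tail of the standard Normal.
z_star = NormalDist().inv_cdf(1 - alpha / 2)

# Under the null hypothesis the difference in sample means is centered at 0.
lower_cutoff = -z_star * se
upper_cutoff = z_star * se

print(round(z_star, 2), round(lower_cutoff, 2), round(upper_cutoff, 2))
# prints 1.96 -3.33 3.33
```

Any observed difference outside the interval from `lower_cutoff` to `upper_cutoff` falls in the rejection region.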
Finding the Probability of Detecting a Difference
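The probability of rejecting the null hypothesis when a real difference exists is called the "power of a test."
Power depends on the size of the difference we want to detect, the sample size, and the standard deviation.
The effect size is the difference we are looking for.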
Connection with Type II Error
Type I error: Reject the null hypothesis when it’s actually true.
Type II error: Fail to reject the null hypothesis when it’s not true.
\alpha = probability of making Type I error.
\beta = probability of making a Type II error.
Power of a test = 1 - \beta
We can set the probability of making a Type I error using the alpha level.
We have less control over the probability of making a Type II error, but we can measure it and account for it using the power.
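One way to see both error rates is a small simulation, sketched here under several assumptions that are not in the notes: patient blood pressures are drawn from a Normal distribution, the baseline mean of 160 mmHg is a hypothetical value (it cancels out in the difference), and the test is a two-sided z-test with the standard deviation treated as known. The function name `rejection_rate` is made up for illustration.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the run is repeatable

SIGMA = 12.0      # patient-level standard deviation from the notes
N = 100           # patients per group from the notes
BASELINE = 160.0  # hypothetical mean systolic pressure (assumption)
SE = math.sqrt(SIGMA**2 / N + SIGMA**2 / N)
Z_STAR = NormalDist().inv_cdf(0.975)  # two-sided test at alpha = 0.05

def rejection_rate(true_effect, trials=2000):
    """Fraction of simulated trials in which the null hypothesis is rejected."""
    rejections = 0
    for _ in range(trials):
        control = fmean(random.gauss(BASELINE, SIGMA) for _ in range(N))
        treatment = fmean(
            random.gauss(BASELINE + true_effect, SIGMA) for _ in range(N)
        )
        z = (treatment - control) / SE
        if abs(z) > Z_STAR:
            rejections += 1
    return rejections / trials

# Null hypothesis true: every rejection is a Type I error.
type1_rate = rejection_rate(0.0)

# True effect of -3 mmHg: every failure to reject is a Type II error.
power_est = rejection_rate(-3.0)
type2_rate = 1 - power_est

print(type1_rate, type2_rate, power_est)
```

The Type I rate should land near the chosen \alpha, while the Type II rate depends on the effect size, sample size, and standard deviation.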
Using Power Calculations
Use power calculations before collecting data to find a sample size that gives enough power to detect a minimum effect size of interest.
Power in Blood Pressure Medication Test
With a sample size of 100 in each group to detect an effect size of 3 mmHg, the power of the test was 0.42.
To detect this effect more reliably, increase the sample size.
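The power of 0.42 can be reproduced with a short sketch. It assumes the true sampling distribution of the difference is Normal, centered at -3 mmHg, with the rounded standard error of 1.70 from the notes; variable names are illustrative.

```python
from statistics import NormalDist

se = 1.70            # rounded standard error with 100 patients per group
effect = -3.0        # true difference (treatment - control) in mmHg
cutoff = 1.96 * se   # rejection cutoff under the null, about 3.332

# Sampling distribution of the difference if the drug really lowers
# blood pressure by 3 mmHg.
true_dist = NormalDist(mu=effect, sigma=se)

# Power = probability the observed difference lands in either rejection tail.
power = true_dist.cdf(-cutoff) + (1 - true_dist.cdf(cutoff))
print(round(power, 2))  # prints 0.42
```

Almost all of the power comes from the lower tail; the chance of landing above +3.332 when the true difference is -3 is negligible.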
Finding Sample Sizes for Blood Pressure Medication
Find the sample size that gives a power of 80% with an effect size of 3 mmHg.
Find the z-score in the true sampling distribution (centered at -3) that has 80% of the distribution below it: z = 0.84.
We need a standard error such that a z-score of 0.84 in the true sampling distribution falls at the same difference as a z-score of -1.96 in the null sampling distribution.
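The condition above can be turned into a sample size with a little algebra, sketched here using the rounded z-scores from the notes. Setting -3 + 0.84 \times SE = -1.96 \times SE gives SE = 3 / (0.84 + 1.96), and SE = \sqrt{2 \times 12^2 / n} can then be solved for n. Equal group sizes and the 12 mmHg standard deviation are taken from the notes; the variable names are illustrative.

```python
import math

effect = 3.0     # effect size to detect, in mmHg
sd = 12.0        # patient-level standard deviation from the notes
z_power = 0.84   # z-score with 80% below it (rounded, as in the notes)
z_alpha = 1.96   # two-sided critical value at alpha = 0.05 (rounded)

# -effect + z_power * SE = -z_alpha * SE  =>  SE = effect / (z_power + z_alpha)
se_needed = effect / (z_power + z_alpha)

# SE = sqrt(sd^2 / n + sd^2 / n)  =>  n = 2 * sd^2 / SE^2
n_exact = 2 * sd**2 / se_needed**2
n_per_group = math.ceil(n_exact)  # round up to a whole number of patients

print(round(se_needed, 3), round(n_exact, 2), n_per_group)
# prints 1.071 250.88 251
```

Under these assumptions, about 251 patients per group would be needed. Using unrounded z-scores shifts the answer slightly, so treat the result as approximate.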