Week 5 (biostats) - CI & Hypothesis Testing
Confidence Intervals (CI)
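- A range of plausible values for a population parameter, built around a sample statistic. At the 95% level, about 95% of intervals constructed this way would capture the true parameter.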
- Common level: 95% (also 90%, 99%).
- General form: \text{CI} = \text{sample statistic} \pm \text{multiplier} \times \text{SE}
- Lower and upper limits: the statistic minus and plus the margin (multiplier \times SE).
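The general form above can be sketched in a few lines of Python. This is only an illustration: the function name confidence_interval is made up here, and the inputs are the mean difference, SE, and t-multiplier from the birth-weight example in these notes.

```python
def confidence_interval(statistic, multiplier, se):
    """Return (lower, upper) limits: statistic ± multiplier × SE."""
    margin = multiplier * se
    return statistic - margin, statistic + margin

# Values taken from the birth-weight example in these notes:
# difference in means -0.4524 kg, SE 0.15317, t-multiplier 2.05
lower, upper = confidence_interval(-0.4524, 2.05, 0.15317)
print(f"95% CI: [{lower:.2f}, {upper:.2f}] kg")  # 95% CI: [-0.77, -0.14] kg
```

A larger multiplier (e.g., for a 99% level) or a larger SE widens the interval.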
Hypothesis Testing Overview
- Hypothesis: a statement about the population; use sample data to infer about the population.
- Purpose: determine if observed sample results could be due to chance.
- Two main hypotheses:
  - Null: H_0: \mu_1 - \mu_2 = 0 (no difference).
  - Alternative: H_a: \mu_1 - \mu_2 \neq 0 (two-sided) or one-sided variants.
- Errors (when making a decision about H_0):
  - Type I error: reject H_0 when it is true.
  - Type II error: fail to reject H_0 when it is false.
- The p-value measures the strength of evidence against H_0; it does not measure the size or practical importance of an effect.
- Typical threshold: p < 0.05 is called statistically significant (reject H_0); if p \ge 0.05, there is insufficient evidence to reject H_0. Do not say "accept" the null.
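The decision rule can be written out as a tiny function. This is a sketch of the wording convention only; the name decide and the default alpha = 0.05 are assumptions chosen to match the typical threshold in these notes.

```python
def decide(p_value, alpha=0.05):
    """Return the decision about H_0 at significance level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < alpha:
        return "reject H_0"
    # Never "accept H_0": a large p-value only means insufficient evidence.
    return "fail to reject H_0"

print(decide(0.03))  # reject H_0
print(decide(0.36))  # fail to reject H_0
```

Rejecting a true H_0 would be a Type I error; failing to reject a false H_0 would be a Type II error. The function cannot tell which case applies, because the truth about H_0 is unknown.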
Steps of Hypothesis Testing
- State the study, objectives, and design.
- State hypotheses (null and alternative); decide on one- or two-sided; justify.
- State assumptions and check them.
- Analyze the data: compute test statistic, obtain the p-value, and calculate a 95% CI.
- Discuss results and infer about the population.
Two-Sample (Independent) t-Test: Equal vs. Unequal Variances
- Data: two independent groups, continuous outcome.
- Difference in means: \Delta = \bar{x}_1 - \bar{x}_2
Equal variances (pooled SD)
- Degrees of freedom: \text{df} = n_1 + n_2 - 2
- Pooled SD squared: S_p^2 = \frac{(n_1-1)SD_1^2 + (n_2-1)SD_2^2}{n_1 + n_2 - 2}
- Standard Error: SE = \sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}
- t-statistic: t = \frac{\bar{x}_1 - \bar{x}_2}{SE}
- 95% CI: (\bar{x}_1 - \bar{x}_2) \pm t^* \cdot SE where t^* is the 2-sided critical value for the given df.
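The pooled formulas above can be chained together as follows. This is a minimal sketch working from summary statistics; the function name pooled_t_test and the input numbers are made up for illustration and do not come from any data set in these notes.

```python
import math

def pooled_t_test(n1, mean1, sd1, n2, mean2, sd2):
    """Equal-variance two-sample t-test from summary statistics."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df  # pooled variance S_p^2
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))             # SE of the difference
    diff = mean1 - mean2
    return {"diff": diff, "df": df, "sp2": sp2, "se": se, "t": diff / se}

# Hypothetical groups: n = 10 each, means 5.0 and 3.0, both SDs 2.0
result = pooled_t_test(10, 5.0, 2.0, 10, 3.0, 2.0)
print(result["df"], round(result["se"], 4), round(result["t"], 3))  # 18 0.8944 2.236
```

The t-multiplier t^* for the CI still has to come from a t table or software for the computed df; it is not calculated here.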
Unequal variances (Welch)
- Standard Error: SE = \sqrt{\frac{SD_1^2}{n_1} + \frac{SD_2^2}{n_2}}
- Degrees of freedom: Welch–Satterthwaite approximation (df not equal to n_1 + n_2 - 2; use appropriate table/software).
- 95% CI: use the corresponding t^* with the Welch df.
- Decision: obtain the p-value from the t-statistic using the Welch df.
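For reference, the Welch–Satterthwaite df that software reports can be computed from the summary statistics. The sketch below uses the standard form of the approximation; the function name welch_se_and_df is made up, and the inputs are the group sizes and SDs from the birth-weight example in these notes (which the notes themselves analyze with the pooled method).

```python
import math

def welch_se_and_df(n1, sd1, n2, sd2):
    """Return (SE, df) for the unequal-variance (Welch) two-sample t-test."""
    v1 = sd1**2 / n1  # estimated variance of the first sample mean
    v2 = sd2**2 / n2  # estimated variance of the second sample mean
    se = math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation; usually not a whole number
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, df

se, df = welch_se_and_df(14, 0.4631, 15, 0.3584)
print(f"SE = {se:.4f}, Welch df = {df:.1f}")
```

The Welch df never exceeds n_1 + n_2 - 2, so the matching t^* is at least as large as in the pooled case and the CI is slightly more conservative.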
Checking Variances (Equal vs. Unequal SD)
- Practical check: compare the two SDs (or variances).
- Rule of thumb: ratio of variances = \frac{SD_1^2}{SD_2^2}, with the larger variance in the numerator.
  - If ratio < 2, assume equal variances.
  - If ratio ≥ 2, assume unequal variances.
- Statistical check (Method 3): Use software (e.g., GraphPad Prism) to test H_0: equal SDs vs H_a: unequal SDs.
  - If p-value ≥ 0.05, fail to reject H_0 (proceed assuming equal variances).
  - If p-value < 0.05, reject H_0 (assume unequal variances).
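The rule of thumb is easy to automate. In this sketch the function name variance_ratio_check is made up, the cutoff of 2 is the one given in these notes, and placing the larger variance in the numerator is an assumption made so that the ratio is always at least 1.

```python
def variance_ratio_check(sd1, sd2, cutoff=2.0):
    """Return (ratio, choice) using the larger variance over the smaller."""
    var_large = max(sd1, sd2) ** 2
    var_small = min(sd1, sd2) ** 2
    ratio = var_large / var_small
    choice = "equal variances (pooled)" if ratio < cutoff else "unequal variances (Welch)"
    return ratio, choice

# SDs from the birth-weight example in these notes
ratio, choice = variance_ratio_check(0.4631, 0.3584)
print(f"ratio = {ratio:.2f} -> {choice}")  # ratio = 1.67 -> equal variances (pooled)
```

This covers only the practical check; the formal test of equal SDs is left to software, as in the notes.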
Interpreting P-Values and Levels of Evidence
- p-value interpretation: probability, under H_0, of observing data as extreme or more extreme than what was observed.
- Levels of evidence (illustrative):
  - p = 0.05: weak evidence against H_0
  - p = 0.01: increasing evidence
  - p = 0.001: strong evidence
  - p = 0.0001: very strong evidence
- Examples: p = 0.36 gives insufficient evidence against H_0; p = 0.00014 gives strong evidence against H_0.
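To make the definition concrete, a two-sided p-value is the area in both tails of the t distribution beyond the observed |t|. The sketch below computes it with the Python standard library by numerically integrating the t density (Simpson's rule). It is an illustration under stated assumptions, not how any particular statistics package does it; the function names t_pdf and two_sided_p are made up, and the input t = -2.95 with 27 df is roughly what the birth-weight data in these notes give.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t_stat, df, n_intervals=2000):
    """P(|T| >= |t_stat|) under H_0; n_intervals must be even (Simpson's rule)."""
    a = abs(t_stat)
    if a == 0:
        return 1.0
    h = a / n_intervals
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, n_intervals):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3  # area under the density between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area_0_to_t)

p = two_sided_p(-2.95, 27)
print(0.005 < p < 0.01)  # True
```

The sign of t does not matter for a two-sided p-value, which is why the function works with |t|.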
Example: Birth Weights (Independent Two-Sample t-Test)
- Data: birth weights in two groups, heavy smokers (n_1 = 14) and non-smokers (n_2 = 15); means 3.1743 kg and 3.6267 kg; SDs 0.4631 kg and 0.3584 kg.
- Step 3 calculations:
  - Difference: \Delta = \bar{x}_1 - \bar{x}_2 = -0.4524\ \text{kg}
  - SE (pooled, equal variances): SE = 0.15317
  - df: 27
  - t^* (95% CI): t^* = 2.05
  - 95% CI: \Delta \pm t^* \cdot SE = -0.4524 \pm 2.05 \times 0.15317 \Rightarrow [-0.77, -0.14]\ \text{kg}
  - Two-sided p-value: between 0.005 and 0.01
- Interpretation: the CI does not include 0 and p < 0.05, so the difference is statistically significant. Mean birth weight is lower in the heavy-smoker group, by about 0.45 kg (95% CI 0.14 to 0.77 kg).
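The intermediate arithmetic behind the SE and the p-value range can be written out as follows. This is a worked check using the pooled formulas and the summary statistics given for this example; values are rounded, so the last digit may differ slightly from software output.

```latex
\begin{align*}
S_p^2 &= \frac{(14-1)(0.4631)^2 + (15-1)(0.3584)^2}{14 + 15 - 2} \approx 0.1699 \\
SE &= \sqrt{0.1699\left(\frac{1}{14} + \frac{1}{15}\right)} \approx 0.1532 \\
t &= \frac{\bar{x}_1 - \bar{x}_2}{SE} = \frac{-0.4524}{0.1532} \approx -2.95
\end{align*}
```

With 27 df, |t| = 2.95 falls between the table values 2.771 (two-sided p = 0.01) and 3.057 (two-sided p = 0.005), which is why the two-sided p-value lies between 0.005 and 0.01.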
Practical Output and Reporting
- Report: means (with SE or SD), df, t-statistic, p-value, and 95% CI for the mean difference.
- Key takeaway: the 95% CI for the difference shows the direction and plausible size of the effect; the p-value shows the strength of evidence against H_0. Report both.
- Distinguish between t-multiplier and t-statistic:
  - t-multiplier: the critical value used to form the 95% CI (depends on df).
  - t-statistic: computed from data to test the hypothesis.
- CI for difference (two independent means, equal variances): \Delta \pm t^* \cdot SE, \quad SE = \sqrt{\left(\frac{(n_1-1)SD_1^2 + (n_2-1)SD_2^2}{n_1+n_2-2}\right)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}
- df (equal variances): \text{df} = n_1 + n_2 - 2
- t-statistic (equal variances): t = \frac{\bar{x}_1 - \bar{x}_2}{SE}
- Pooled SD: S_p = \sqrt{\frac{(n_1-1)SD_1^2 + (n_2-1)SD_2^2}{n_1+n_2-2}}
- Unequal variances: SE = \sqrt{\frac{SD_1^2}{n_1} + \frac{SD_2^2}{n_2}}
- p-value interpretation: compare with 0.05 cutoff; report as two-sided unless a one-sided test was planned.
Next Steps and Reminders
- Always compute and report both the 95% CI and the p-value.
- Use the appropriate SE formula depending on equal or unequal variances.
- Check assumptions (normality, independence) before choosing the test.
- Use the CI to convey the precision and direction of the effect, not just the p-value.