Concise Notes on Hypothesis Testing

Introduction to Hypothesis Testing

  • Hypothesis testing is fundamental for statistical analyses and interpreting clinical studies.
  • Essential for study design and review processes.
  • Framework for comparing effects or treatments in a structured manner.

Key Statistical Concepts

  • Parameter: Descriptive measure from a population (e.g., population mean, median, standard deviation).
  • Statistic: Descriptive measure from a sample (e.g., sample mean, median, standard deviation).
  • Standard Error: Standard deviation of the sampling distribution of a statistic, most commonly the sample mean.
  • Statistical Inference: Making inferences about a population based on sample statistics.
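
As a minimal sketch of the difference between a statistic and its standard error, the snippet below computes a sample mean, a sample standard deviation, and the estimated standard error of the mean (the sample SD divided by the square root of n). The data values and the function name are hypothetical and serve only as an illustration.

```python
import math
import statistics

def mean_and_standard_error(sample):
    """Return (sample mean, sample SD, estimated standard error of the mean)."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)   # sample SD, divides by n - 1
    se = sd / math.sqrt(n)          # estimated SD of the sample mean
    return mean, sd, se

# Hypothetical sample, for illustration only
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, sd, se = mean_and_standard_error(sample)
```

The standard error shrinks as n grows, even though the sample SD does not.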

Statistical Estimation

  • Point Estimation: Determining a specific value for a population parameter.
  • Example: Baseline mean of 7.26 lesions per month in the beta-interferon study.
  • Interval Estimation: Quantifying uncertainty with an interval (e.g., 95% Confidence Interval).
  • Example: 95% CI of (3.83, 10.67) for baseline mean number of lesions in the beta-interferon study.
  • CIs indicate how precisely a quantity, such as a treatment effect, has been estimated.

Basic Concepts in Hypothesis Testing

  • Null Hypothesis (H0): Typically a statement of no effect or equality between groups. The negation of the research question.
    • Example: $H_0: \mu_1 = \mu_2$
  • Alternative Hypothesis (H1 or HA): States that the null hypothesis is not true.
    • Two-Sided Test: $H_A: \mu_1 \neq \mu_2$ (detects any difference).
    • One-Sided Test: $H_A: \mu_1 > \mu_2$ (detects difference in one direction only).
  • Test Statistic: A value calculated from sample data to compare with a known distribution under the null hypothesis.
    • General form: $\text{Test statistic} = \frac{\text{point estimate of } \mu - \text{target value of } \mu}{\text{known value or point estimate of the standard error}}$

Errors in Hypothesis Testing

  • Type I Error: Rejecting the null hypothesis when it is true.
    • Probability denoted by $\alpha$ (significance level).
  • Type II Error: Failing to reject the null hypothesis when the alternative hypothesis is true.
    • $\beta = P(\text{Type II error})$.
  • Power: Probability of rejecting the null hypothesis when the alternative hypothesis is true.
    • $\text{Power} = 1 - \beta = 1 - P(\text{Type II error})$
  • P-value: The probability of observing a test statistic as extreme or more extreme than observed if the null hypothesis is true.
    • If p-value < $\alpha$, reject the null hypothesis.
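
The p-value decision rule and the idea of power can be sketched for a test statistic that is standard normal under the null hypothesis. The function names are hypothetical, and the power calculation assumes a two-sided z-test whose true mean lies a given number of standard errors from the null value.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, SD 1

def two_sided_p_value(z):
    """P(|Z| >= |z|) when Z is standard normal under H0."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))

def reject_null(p_value, alpha=0.05):
    """Decision rule: reject H0 when the p-value is below alpha."""
    return p_value < alpha

def z_test_power(true_shift_in_se, alpha=0.05):
    """Power of a two-sided z-test when the true mean lies
    `true_shift_in_se` standard errors away from the null value."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    upper = 1.0 - STD_NORMAL.cdf(z_crit - true_shift_in_se)  # reject in upper tail
    lower = STD_NORMAL.cdf(-z_crit - true_shift_in_se)       # reject in lower tail
    return upper + lower
```

With a shift of zero the null hypothesis is true, so the "power" equals the Type I error rate alpha. Power rises toward 1 as the shift grows.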

One-Sample Hypothesis Tests

  • Used when comparing a statistic from one group to a known value.

Tests for Normal Continuous Data

  • Null and Alternative Hypotheses: $H_0: \mu_x = \mu_0$ vs. $H_A: \mu_x \neq \mu_0$
  • Z-test: Used when $\sigma_x$ is known.
    • Test statistic: $Z = \frac{\bar{x} - \mu_0}{\sigma_x / \sqrt{n}}$
  • T-test: Used when $\sigma_x$ is unknown.
    • Test statistic: $T = \frac{\bar{x} - \mu_0}{s_x / \sqrt{n}}$, where $s_x = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$
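
The two one-sample statistics can be computed directly from their definitions. This is a small sketch with hypothetical function names. It returns only the test statistic (and, for the t-test, the degrees of freedom n - 1); it does not compute a p-value.

```python
import math
import statistics

def one_sample_z(sample, mu0, sigma):
    """Z statistic when the population SD `sigma` is known."""
    n = len(sample)
    return (statistics.mean(sample) - mu0) / (sigma / math.sqrt(n))

def one_sample_t(sample, mu0):
    """T statistic and degrees of freedom when sigma is unknown."""
    n = len(sample)
    s_x = statistics.stdev(sample)  # sample SD with the n - 1 divisor
    t = (statistics.mean(sample) - mu0) / (s_x / math.sqrt(n))
    return t, n - 1

# Hypothetical data, for illustration only
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = one_sample_z(data, mu0=4.0, sigma=2.0)
t, df = one_sample_t(data, mu0=4.0)
```

Turning T into a p-value needs the t distribution with n - 1 degrees of freedom, which the Python standard library does not provide. A statistics package such as SciPy would be one option.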
Determining Statistical Significance
  • Critical Values: Cut points used to determine statistical significance.
    • Compare the observed test statistic to the critical values.
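
For a z-test, the critical values are quantiles of the standard normal distribution. The sketch below uses hypothetical function names and assumes the one-sided alternative points in the upper direction.

```python
from statistics import NormalDist

def z_critical_value(alpha=0.05, two_sided=True):
    """Cut point of the standard normal for significance level alpha."""
    tail = alpha / 2.0 if two_sided else alpha
    return NormalDist().inv_cdf(1.0 - tail)

def is_significant(z, alpha=0.05, two_sided=True):
    """Compare the observed z statistic with the critical value."""
    crit = z_critical_value(alpha, two_sided)
    return abs(z) > crit if two_sided else z > crit
```

At alpha = 0.05 the two-sided cut point is about 1.96 and the one-sided cut point is about 1.645. A statistic of 1.7 is therefore significant in the one-sided test but not in the two-sided one.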
Confidence Intervals
  • For a general $\alpha$, a $100(1-\alpha)\%$ CI for a population parameter is formed around the point estimate of interest.
    • If the variance is known: $\left[\bar{x} - z_{1-\alpha/2} \frac{\sigma_x}{\sqrt{n}},\ \bar{x} + z_{1-\alpha/2} \frac{\sigma_x}{\sqrt{n}}\right]$
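
A known-variance confidence interval for a mean can be sketched as follows, assuming the population standard deviation is supplied by the caller. The function name and the numbers in the usage line are hypothetical.

```python
import math
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, alpha=0.05):
    """100 * (1 - alpha)% CI for a mean when the population SD is known."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{1 - alpha/2}
    half_width = z * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical inputs, for illustration only
lower, upper = z_confidence_interval(x_bar=10.0, sigma=2.0, n=16)
```

The interval is symmetric around the point estimate. Its width grows with the confidence level and shrinks as n increases.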

Binary Data

  • Data with two possible outcomes (success/failure).
  • Test Statistic: $Z = \frac{\hat{p}_1 - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$
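
The one-sample proportion statistic can be sketched in a few lines. The function name and the counts in the usage line are hypothetical.

```python
import math

def one_sample_proportion_z(successes, n, p0):
    """Z statistic comparing an observed proportion with a null value p0."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1.0 - p0) / n)  # computed under H0
    return (p_hat - p0) / standard_error

# Hypothetical counts: 60 successes in 100 trials, null value 0.5
z = one_sample_proportion_z(60, 100, 0.5)
```

The standard error uses the null value p0 rather than the observed proportion, because the statistic is evaluated under the null hypothesis.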
Exact Tests
  • Useful for smaller sample sizes, when the normal approximation from the central limit theorem (CLT) is suspect.
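
As a sketch of an exact test for a single proportion, the code below sums binomial probabilities directly instead of relying on a normal approximation. It assumes one common two-sided convention, which counts every outcome that is no more likely under the null hypothesis than the observed count. The function names are hypothetical.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def exact_binomial_p_value(successes, n, p0):
    """Two-sided exact p-value: total null probability of all outcomes
    that are no more likely than the observed count."""
    observed = binomial_pmf(successes, n, p0)
    tolerance = 1e-12  # guards against floating-point ties
    total = sum(
        binomial_pmf(k, n, p0)
        for k in range(n + 1)
        if binomial_pmf(k, n, p0) <= observed * (1.0 + tolerance)
    )
    return min(1.0, total)

# Hypothetical counts: 9 successes in 10 trials, null value 0.5
p_value = exact_binomial_p_value(9, 10, 0.5)
```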
Confidence Intervals
  • The Clopper-Pearson interval is a classical exact approach to binomial CIs that avoids the normal approximation.
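
One way to sketch a Clopper-Pearson interval with only the standard library is to invert the binomial tail probabilities numerically. The lower limit is the p at which P(X >= k) equals alpha/2, and the upper limit is the p at which P(X <= k) equals alpha/2. The bisection helper and all function names are illustrative assumptions. A statistics library would typically obtain the same limits from beta-distribution quantiles.

```python
import math

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k + 1))

def _solve_increasing(f, target, iterations=100):
    """Bisection on [0, 1] for f(p) == target, with f increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def clopper_pearson_interval(successes, n, alpha=0.05):
    """Exact 100 * (1 - alpha)% CI for a binomial proportion."""
    k = successes
    # Lower limit: P(X >= k | p) = alpha / 2 (defined as 0 when k == 0)
    lower = 0.0 if k == 0 else _solve_increasing(
        lambda p: 1.0 - binomial_cdf(k - 1, n, p), alpha / 2.0)
    # Upper limit: P(X <= k | p) = alpha / 2 (defined as 1 when k == n)
    upper = 1.0 if k == n else _solve_increasing(
        lambda p: 1.0 - binomial_cdf(k, n, p), 1.0 - alpha / 2.0)
    return lower, upper
```

For example, `clopper_pearson_interval(5, 10)` gives an interval symmetric around 0.5. With zero successes the lower limit is exactly 0.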

Two-Sample Hypothesis Tests

Tests for Comparing the Means of Two Normal Populations

Paired Data
  • Suitable for data like the beta-interferon/MRI trial (measurements before and after treatment).
  • Test statistic: $T = \frac{\bar{d}}{s / \sqrt{n}}$, where $\bar{d}$ is the mean of the $n$ paired differences and $s$ is their sample standard deviation.
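
The paired statistic reduces to a one-sample t statistic on the within-pair differences. The sketch below uses made-up before and after values, not data from the trial, and the function name is hypothetical.

```python
import math
import statistics

def paired_t(before, after):
    """T statistic and degrees of freedom for paired measurements."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]  # after minus before
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s = statistics.stdev(diffs)  # SD of the differences, n - 1 divisor
    return d_bar / (s / math.sqrt(n)), n - 1

# Hypothetical paired measurements, for illustration only
before = [8.0, 6.0, 7.0, 9.0, 5.0]
after = [5.0, 4.0, 6.0, 6.0, 4.0]
t, df = paired_t(before, after)
```

Working on the differences removes between-subject variability, which is why ignoring the pairing usually costs power.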
Unpaired Data

  • Hypotheses: $H_0: \mu_1 = \mu_2$ vs. $H_A: \mu_1 \neq \mu_2$
  • When the common standard deviation $\sigma$ is known, $Z = \frac{\bar{x} - \bar{y}}{\sigma \sqrt{\frac{1}{n} + \frac{1}{m}}}$ has the standard normal distribution.
  • Otherwise, $\sigma$ is estimated from the data by the pooled standard deviation $s$, giving $T = \frac{\bar{x} - \bar{y}}{s \sqrt{\frac{1}{n} + \frac{1}{m}}}$, which has Student’s t distribution with $n+m-2$ df.
  • If equal variance in the two groups is not a reasonable assumption, Welch's test can be used instead.
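
The equal-variance statistic and Welch's alternative can be compared side by side. In this sketch the equal-variance version uses the pooled standard deviation, and the Welch version uses the Welch-Satterthwaite approximation for the degrees of freedom. The function names and data are hypothetical.

```python
import math
import statistics

def pooled_t(x, y):
    """Equal-variance two-sample T statistic with n + m - 2 df."""
    n, m = len(x), len(y)
    var_x, var_y = statistics.variance(x), statistics.variance(y)
    s_pooled = math.sqrt(((n - 1) * var_x + (m - 1) * var_y) / (n + m - 2))
    t = (statistics.mean(x) - statistics.mean(y)) / (
        s_pooled * math.sqrt(1.0 / n + 1.0 / m))
    return t, n + m - 2

def welch_t(x, y):
    """Welch's T statistic with Welch-Satterthwaite degrees of freedom."""
    n, m = len(x), len(y)
    vx = statistics.variance(x) / n  # squared standard error, group 1
    vy = statistics.variance(y) / m  # squared standard error, group 2
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (n - 1) + vy ** 2 / (m - 1))
    return t, df

# Hypothetical samples with visibly different spreads
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
t_pooled, df_pooled = pooled_t(x, y)
t_welch, df_welch = welch_t(x, y)
```

With equal group sizes the two statistics coincide. The difference then shows up only in the degrees of freedom, which Welch's method reduces when the variances differ.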

Tests for Comparing Two Population Proportions
  • The data should be binary
  • Test statistic: $Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_1(1-\hat{p}_1)/n + \hat{p}_2(1-\hat{p}_2)/m}}$, which has approximately the standard normal distribution.
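
The two-proportion statistic can be sketched directly from the formula, using the unpooled standard error shown in these notes. The function name and counts are hypothetical.

```python
import math

def two_proportion_z(successes_1, n, successes_2, m):
    """Z statistic for the difference between two sample proportions."""
    p1 = successes_1 / n
    p2 = successes_2 / m
    standard_error = math.sqrt(p1 * (1.0 - p1) / n + p2 * (1.0 - p2) / m)
    return (p1 - p2) / standard_error

# Hypothetical counts: 50/100 successes vs. 40/100 successes
z = two_proportion_z(50, 100, 40, 100)
```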

Common Mistakes in Hypothesis Testing

  • Ignoring pairing or dependence between observations.
  • Assuming equal variances without verification.
  • Using a t-test on highly skewed data, where a non-parametric test may be more appropriate than a parametric one.

Misstatements and Misconceptions

  • Misstatement: “Failing to reject the null hypothesis means that it is true.”
    • Failing to reject only means the data did not provide enough evidence against the null hypothesis.
  • Misstatement: “A small p-value means that the two sample means (x and y) are significantly different from each other.”
    • Both a statistically significant finding and a clinically significant finding are needed to interpret the data.

Comparing More Than Two Groups: One-Way Analysis of Variance

  • One-way ANOVA compares the means of more than two populations and is used to detect any difference among the treatments in those groups.
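
As a sketch of the one-way ANOVA calculation, the F statistic is the ratio of the between-group mean square to the within-group mean square. The function name and the three small groups are hypothetical.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F statistic, between-group df, within-group df)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum(
        sum((v - statistics.mean(g)) ** 2 for v in g) for g in groups)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical measurements for three treatment groups
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f, df_between, df_within = one_way_anova_f(groups)
```

A large F suggests that at least one group mean differs from the others. It does not say which one.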

Simple and Multiple Linear Regression

  • Hypothesis: $H_0: b_1 = 0$ vs. $H_A: b_1 \neq 0$
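
For simple linear regression, the slope test can be sketched as a least-squares fit followed by a t statistic with n - 2 degrees of freedom. The function name and data are hypothetical, and the sketch covers only the single-predictor case.

```python
import math
import statistics

def slope_t_statistic(x, y):
    """Return (slope estimate b1, T statistic for H0: b1 = 0, df)."""
    n = len(x)
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx               # least-squares slope
    b0 = y_bar - b1 * x_bar      # least-squares intercept
    residual_ss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(residual_ss / (n - 2) / sxx)  # standard error of b1
    return b1, b1 / se_b1, n - 2

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b1, t, df = slope_t_statistic(x, y)
```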

Multiple Comparisons

  • Problem: When many comparisons or hypothesis tests are performed, the chance of at least one false positive (Type I error) increases.
  • Solution: Choose a lower significance level to prevent false positive conclusions or to “control the false discovery rate.”
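
Two widely used adjustments illustrate these ideas. The notes do not name a specific method, so both are illustrative choices. The Bonferroni correction lowers the per-test significance level, and the Benjamini-Hochberg procedure is one standard way to control the false discovery rate. The function names and p-values are hypothetical.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject each H0 whose p-value is below alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]

def benjamini_hochberg_reject(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure at false discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            max_rank = rank  # largest rank meeting its threshold
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            reject[idx] = True
    return reject

# Hypothetical p-values from five tests
p_values = [0.2, 0.001, 0.045, 0.012, 0.025]
bonferroni = bonferroni_reject(p_values)
fdr = benjamini_hochberg_reject(p_values)
```

On these values Bonferroni rejects one hypothesis and Benjamini-Hochberg rejects three. Controlling the false discovery rate is a less strict criterion than controlling the chance of any false positive.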