Power and Sample Size Calculations

Introduction

  • Power and sample size calculations are crucial for research studies.
  • Statistical power is the probability that a study yields a statistically significant result when a true effect exists.
  • It's standard practice to report sample size calculations in research publications.

Basic Concepts

  • Collaboration between statisticians and researchers is essential for sample size calculation.
  • Key questions to consider:
    • Study groups or treatment arms.
    • Expected data distribution.
    • Anticipated means, medians, and ranges.
    • Variation in measurements.
    • Costs.
    • Maximum numbers of subjects and sites.
    • Consequences of Type I and Type II errors.
  • Type I error: Rejecting the null hypothesis when it is true; its probability is the significance level \alpha.
  • Type II error: Failing to reject the null hypothesis when it is false; its probability is \beta.
  • Statistical power (1 - \beta): Probability of rejecting the null hypothesis when the alternative hypothesis is true.
  • At least 80% power is generally considered desirable.
  • Sample size calculations can:
    • Calculate power for a fixed sample size.
    • Calculate the required sample size for a fixed power.

Notational Conventions

  • Greek letters represent population parameters (e.g., population mean \mu, variance \sigma^2).
  • Latin letters represent sample statistics (e.g., sample mean \bar{x}, variance s^2).

Review of the Normal and t-Distributions

  • Normal distribution (bell-shaped curve) is often used to model continuous data.
  • Standard Normal distribution has a mean of zero and a variance of one.
  • Z_c represents the 100c-th percentile of the standard Normal distribution (e.g., Z_{0.975} \approx 1.96).
  • t-distribution is used in place of the Normal when the population variance is unknown and must be estimated from the sample.
  • T_{f,c} represents the 100c-th percentile of the t-distribution with f degrees of freedom.
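As a quick numerical check of the percentile notation, the sketch below evaluates Z_c with Python's standard library. The helper name `z_quantile` is an illustrative choice, not something from the notes. Quantiles of the t-distribution, T_{f,c}, are not available in the standard library and would need a statistics package.

```python
from statistics import NormalDist

def z_quantile(c: float) -> float:
    """Return Z_c, the 100c-th percentile of the standard Normal distribution."""
    # NormalDist() defaults to mean 0 and standard deviation 1.
    return NormalDist().inv_cdf(c)

if __name__ == "__main__":
    for c in (0.80, 0.95, 0.975):
        print(f"Z_{c} = {z_quantile(c):.4f}")
```

The values for c = 0.80 and c = 0.975 are the ones that appear in sample size formulas for 80% power with a two-sided 5% significance level.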

Sample Size Calculations for Precision in Confidence Interval Construction

  • Calculations depend on the chosen confidence level 1 - \alpha (equivalently, the significance level \alpha) but not on power, because no hypothesis is being tested.

Confidence Intervals for Means of Continuous Data

  • Sample size required to construct a 100(1-\alpha)% confidence interval of total width w:
    • Known variance: n \geq 4Z_{1-\alpha/2}^2 \sigma^2/w^2
    • Unknown variance (using t-distribution): solve numerically for n in 2T_{n-1,1-\alpha/2} s_h/\sqrt{n} = w, where s_h is a hypothesized value of the sample standard deviation
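The known-variance formula can be evaluated directly. The following is a minimal sketch, assuming w is the total width of the interval and rounding up to the next whole subject. The function name `n_for_mean_ci` is hypothetical. The unknown-variance case is left out because it needs t quantiles and an iterative search.

```python
import math
from statistics import NormalDist

def n_for_mean_ci(sigma: float, width: float, alpha: float = 0.05) -> int:
    """Smallest integer n with n >= 4 * Z_{1-alpha/2}^2 * sigma^2 / width^2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil(4 * z**2 * sigma**2 / width**2)

if __name__ == "__main__":
    # Illustrative inputs: standard deviation 10, desired total width 5.
    print(n_for_mean_ci(sigma=10, width=5))
```

Halving the desired width quadruples the required sample size, since w enters the formula squared.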

Confidence Intervals for Binomial Proportions

  • Sample size required to construct a confidence interval of width w for an unknown binomial proportion p:
    • n \geq 4Z_{1-\alpha/2}^2 p_h(1-p_h)/w^2, where p_h is a hypothesized value of p
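A short sketch of the binomial confidence interval calculation follows. The function name `n_for_proportion_ci` is hypothetical, and the choice of p_h = 0.5 in the example is only an illustration: it maximizes p_h(1 - p_h) and so gives the most conservative sample size.

```python
import math
from statistics import NormalDist

def n_for_proportion_ci(p_h: float, width: float, alpha: float = 0.05) -> int:
    """Smallest integer n with n >= 4 * Z_{1-alpha/2}^2 * p_h * (1 - p_h) / width^2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil(4 * z**2 * p_h * (1 - p_h) / width**2)

if __name__ == "__main__":
    # 95% interval of total width 0.10 (plus or minus 5 percentage points).
    print(n_for_proportion_ci(p_h=0.5, width=0.10))
```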

Sample Size Calculations for Hypothesis Tests: One Sample of Data

  • Specific alternative hypothesis is required.

Calculations for Continuous Data Regarding a Single Population Mean

  • Hypotheses: H_0 : \mu = \mu_0 vs. H_1 : \mu = \mu_1
  • Required sample size: n = (Z_{1-\alpha/2} + Z_{1-\beta})^2 \sigma^2/d^2, where d = \mu_1 - \mu_0
  • For a one-sided test, replace Z_{1-\alpha/2} by Z_{1-\alpha}
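The sketch below evaluates the one-sample formula in both directions: sample size for a fixed power, and power for a fixed sample size. It assumes a z-test with known \sigma. The power function is the usual Normal approximation that ignores the negligible chance of rejecting in the wrong direction. Function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def n_one_sample_mean(sigma, d, alpha=0.05, power=0.80, two_sided=True):
    """n = (Z_{1-alpha/2} + Z_{1-beta})^2 * sigma^2 / d^2, rounded up."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2 if two_sided else 1 - alpha)
    z_beta = _STD_NORMAL.inv_cdf(power)  # power = 1 - beta
    return math.ceil((z_alpha + z_beta) ** 2 * sigma**2 / d**2)

def power_one_sample_mean(n, sigma, d, alpha=0.05, two_sided=True):
    """Approximate power for fixed n, obtained by inverting the formula above."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2 if two_sided else 1 - alpha)
    return _STD_NORMAL.cdf(abs(d) * math.sqrt(n) / sigma - z_alpha)

if __name__ == "__main__":
    n = n_one_sample_mean(sigma=10, d=5)
    print(n, round(power_one_sample_mean(n, sigma=10, d=5), 3))
```

Because n is rounded up, the achieved power at the returned n is slightly above the requested value.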

Calculations for Binary Data Regarding a Single Population Proportion

  • Hypotheses: H_0 : p = p_0 vs. H_1 : p = p_1
  • Required sample size: n = [Z_{1-\alpha/2} \sqrt{p_0(1-p_0)} + Z_{1-\beta} \sqrt{p_1(1-p_1)}]^2/d^2, where d = p_1 - p_0
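For a single proportion, the variance differs under the null and alternative hypotheses, which is why the two quantiles carry different square-root terms. A minimal sketch, with a hypothetical function name and illustrative inputs:

```python
import math
from statistics import NormalDist

def n_one_sample_proportion(p0, p1, alpha=0.05, power=0.80):
    """n = [Z_{1-alpha/2} sqrt(p0(1-p0)) + Z_{1-beta} sqrt(p1(1-p1))]^2 / (p1-p0)^2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(p0 * (1 - p0))
                 + z_beta * math.sqrt(p1 * (1 - p1))) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)

if __name__ == "__main__":
    # Detect a shift from 50% to 60% with 80% power, two-sided alpha = 0.05.
    print(n_one_sample_proportion(p0=0.5, p1=0.6))
```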

Two-Stage Designs for a Single Population Proportion

  • Sequentially enroll subjects in two stages, stopping early if evidence is not promising.

Sample Size Calculations for Hypothesis Tests: Paired Data

  • Pairing may result from before-and-after treatment measurements or matched characteristics.

Calculations for Paired Continuous Data

  • Hypotheses: H_0 : \mu_d = \mu_0 vs. H_1 : \mu_d = \mu_1
  • Required sample size: n = (Z_{1-\alpha/2} + Z_{1-\beta})^2 \sigma_d^2/d^2, where d = \mu_1 - \mu_0

Calculations for Paired Binary Data

  • Hypotheses: H_0 : p_{10} - p_{01} = 0 vs. H_1 : p_{10} - p_{01} = d
    • given p_{10} + p_{01} = q
  • Required sample size: n = [Z_{1-\alpha/2} \sqrt{q} + Z_{1-\beta} \sqrt{q - d^2}]^2/d^2
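The paired binary formula depends only on the total proportion of discordant pairs, q, and their difference, d. The sketch below computes the number of pairs. The function name and the input check are illustrative additions.

```python
import math
from statistics import NormalDist

def n_paired_binary(q, d, alpha=0.05, power=0.80):
    """Pairs needed: [Z_{1-alpha/2} sqrt(q) + Z_{1-beta} sqrt(q - d^2)]^2 / d^2.

    q = p10 + p01 (total discordant proportion), d = p10 - p01.
    """
    if not abs(d) <= q <= 1:
        raise ValueError("need |d| <= q <= 1 for valid discordant proportions")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(q) + z_beta * math.sqrt(q - d**2)) ** 2
    return math.ceil(numerator / d**2)

if __name__ == "__main__":
    # Illustrative inputs: 30% discordant pairs, split 20% vs 10%.
    print(n_paired_binary(q=0.3, d=0.1))
```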

Sample Size Calculations for Hypothesis Tests: Two Independent Samples

  • Estimating the difference of two population means or proportions.

Calculations for Continuous Data With Equal Variances and Equal Sample Sizes

  • Hypotheses: H_0 : \mu_Y - \mu_X = d_0 vs. H_1 : \mu_Y - \mu_X = d_1
  • Required sample size in each group: n_e = 2(Z_{1-\alpha/2} + Z_{1-\beta})^2 \sigma_c^2/d^2, where d = d_1 - d_0 and \sigma_c^2 is the common variance
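A minimal sketch of the equal-variance, equal-allocation calculation follows. The function name is hypothetical, and the result is the size of each group, so the total enrollment is twice the returned value.

```python
import math
from statistics import NormalDist

def n_per_group_equal(sigma_c, d, alpha=0.05, power=0.80):
    """Per-group n = 2 (Z_{1-alpha/2} + Z_{1-beta})^2 sigma_c^2 / d^2, rounded up."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma_c**2 / d**2)

if __name__ == "__main__":
    # Illustrative inputs: common standard deviation 10, difference of 5.
    n_e = n_per_group_equal(sigma_c=10, d=5)
    print(n_e, "per group,", 2 * n_e, "in total")
```

With the same \sigma and d, each group needs about twice the single-sample size, because the difference of two independent means has twice the variance of one mean.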

Calculations for Continuous Data With Unequal Variances or Unequal Sample Sizes

  • Unequal variances: n_e = (Z_{1-\alpha/2} + Z_{1-\beta})^2 (\sigma_X^2 + \sigma_Y^2)/d^2
  • Unequal sample sizes: n_X = (Z_{1-\alpha/2} + Z_{1-\beta})^2 (\sigma_X^2 + \sigma_Y^2/l)/d^2, where l = n_Y/n_X
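Both cases can be handled by one function, since the unequal-variance formula is the unequal-allocation formula with l = 1. This is a sketch under that observation. The function name, the returned tuple, and the rounding of n_Y are illustrative choices.

```python
import math
from statistics import NormalDist

def n_two_sample_general(sigma_x, sigma_y, d, ratio=1.0, alpha=0.05, power=0.80):
    """Return (n_X, n_Y) with n_Y = ratio * n_X, where ratio is l = n_Y / n_X.

    n_X = (Z_{1-alpha/2} + Z_{1-beta})^2 (sigma_X^2 + sigma_Y^2 / l) / d^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n_x = math.ceil((z_alpha + z_beta) ** 2
                    * (sigma_x**2 + sigma_y**2 / ratio) / d**2)
    n_y = math.ceil(ratio * n_x)  # round up so the allocation ratio is met
    return n_x, n_y

if __name__ == "__main__":
    print(n_two_sample_general(8, 12, d=5))             # unequal variances, l = 1
    print(n_two_sample_general(10, 10, d=5, ratio=2))   # 2:1 allocation
```

Unequal allocation lowers n_X but raises the total n_X + n_Y compared with a balanced design of the same power.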

Calculations for Two Independent Samples of Binary Data

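  • Hypotheses: H_0 : p_B - p_A = 0 vs. H_1 : p_B - p_A = d
  • Required sample size in each group: n_e = [Z_{1-\alpha/2} \sqrt{u_0} + Z_{1-\beta} \sqrt{u_1}]^2/d^2
    • u_0 = 2q(1-q)
    • u_1 = q_A(1-q_A) + q_B(1-q_B)
    • q_A and q_B are the proportions anticipated under H_1 (so q_B - q_A = d), and q = (q_A + q_B)/2 is their average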
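The sketch below evaluates the two-proportion formula. It assumes that q_A and q_B are the proportions anticipated under the alternative and that q is their simple average, which is the usual pooled form for equal group sizes. The notes do not define q explicitly, so treat that as an assumption. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def n_two_proportions(q_a, q_b, alpha=0.05, power=0.80):
    """Per-group n = [Z_{1-alpha/2} sqrt(u0) + Z_{1-beta} sqrt(u1)]^2 / d^2."""
    d = q_b - q_a
    q_bar = (q_a + q_b) / 2                    # assumed: average of the two proportions
    u0 = 2 * q_bar * (1 - q_bar)               # variance term under H0
    u1 = q_a * (1 - q_a) + q_b * (1 - q_b)     # variance term under H1
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(u0) + z_beta * math.sqrt(u1)) ** 2
    return math.ceil(numerator / d**2)

if __name__ == "__main__":
    # Illustrative inputs: 20% response in group A versus 40% in group B.
    print(n_two_proportions(q_a=0.2, q_b=0.4))
```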

Advanced Methods and Other Topics

  • Advanced methods and alternative statistics exist for more accurate sample size calculations.

Several Advanced Study Designs

  • Multiarm clinical trials: Bonferroni correction.
  • Group sequential trials: Interim analyses.
  • Survival analysis: Kaplan-Meier, Cox regression.
  • Group randomized trials (GRTs): Intraclass correlation (ICC).
  • Generalized linear models (GLMs): Complex calculations, simplified methods for approximation.
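Two of the adjustments named above have simple closed forms. The sketch below shows the Bonferroni-adjusted significance level, \alpha/k for k comparisons, and the design effect 1 + (m - 1) ICC for clusters of size m. These are standard textbook formulas that the notes do not spell out, and the function names are hypothetical.

```python
import math

def bonferroni_alpha(alpha: float, comparisons: int) -> float:
    """Per-comparison significance level under a Bonferroni correction."""
    return alpha / comparisons

def design_effect(cluster_size: int, icc: float) -> float:
    """Variance inflation for group randomization: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def inflate_for_clustering(n_individual: int, cluster_size: int, icc: float) -> int:
    """Subjects per arm after multiplying an individually randomized n by the design effect."""
    return math.ceil(n_individual * design_effect(cluster_size, icc))

if __name__ == "__main__":
    print(bonferroni_alpha(0.05, 3))            # three pairwise comparisons
    print(inflate_for_clustering(63, 20, 0.05)) # clusters of 20, ICC = 0.05
```

The adjusted \alpha would replace \alpha in whichever sample size formula applies. Even a small ICC can roughly double the required sample when clusters are large.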

Retention of Subjects

  • Plan for a 10% to 20% rate of retention loss.
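  • Inflate the calculated sample size by a factor of 1/(1-r), where r is the anticipated proportion of subjects lost, so that the expected number of retained subjects meets the target.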

Statistical Computing

  • Statistical software facilitates rapid calculations and numerical methods.
  • Software also supports power estimation by data simulation and graphing of power as a function of sample size.
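As one illustration of simulation-based power estimation, the sketch below repeatedly draws Normal samples under an assumed alternative and counts how often a two-sided z-test of H_0 : \mu = 0 rejects. It assumes a known \sigma. The function name, defaults, and seed are illustrative. The estimate carries Monte Carlo error that shrinks as `reps` grows.

```python
import math
import random
from statistics import NormalDist, fmean

def simulated_power(n, d, sigma, alpha=0.05, reps=2000, seed=1):
    """Estimate power of a two-sided z-test of H0: mu = 0 when the true mean is d."""
    rng = random.Random(seed)                      # fixed seed for reproducibility
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(d, sigma) for _ in range(n)]
        z_stat = fmean(sample) * math.sqrt(n) / sigma
        if abs(z_stat) > z_crit:
            rejections += 1
    return rejections / reps

if __name__ == "__main__":
    # Power curve over a few sample sizes for d = 5, sigma = 10.
    for n in (10, 20, 32, 50):
        print(n, simulated_power(n, d=5, sigma=10))
```

Setting d = 0 estimates the Type I error rate instead, which should come out near \alpha and is a useful sanity check on the simulation.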

Conclusion

  • Statistical power is key in sample size calculations.
  • Collaboration between statisticians and researchers is rewarding.