Power and Sample Size Calculations
Introduction
- Power and sample size calculations are a crucial part of planning a research study.
- Statistical power is the probability of detecting a statistically significant effect when a true effect exists.
- It's standard practice to report sample size calculations in research publications.
Basic Concepts
- Collaboration between statisticians and researchers is essential for sample size calculation.
- Key questions to consider:
  - Study groups or treatment arms.
  - Expected data distribution.
  - Anticipated means, medians, and ranges.
  - Variation in measurements.
  - Costs.
  - Maximum numbers of subjects and sites.
  - Consequences of Type I and Type II errors.
- Type I error: Rejecting the null hypothesis when it is true (probability \alpha).
- Type II error: Failing to reject the null hypothesis when it is false (probability \beta).
- Statistical power (1 - \beta): Probability of rejecting the null hypothesis when the alternative hypothesis is true.
- Power of at least 80% is generally considered desirable.
- Sample size calculations can:
  - Calculate power for a fixed sample size.
  - Calculate the required sample size for a fixed power.
Notational Conventions
- Greek letters represent population parameters (e.g., population mean \mu, variance \sigma^2).
- Latin letters represent sample statistics (e.g., sample mean \bar{x}, sample variance s^2).
Review of the Normal and t-Distributions
- Normal distribution (bell-shaped curve) is often used to model continuous data.
- Standard Normal distribution has a mean of zero and a variance of one.
- Z_c represents the (c \cdot 100)th percentile of the standard Normal distribution.
- The t-distribution is used when the population variance is unknown and must be estimated from the sample.
- T_{f,c} represents the (c \cdot 100)th percentile of the t-distribution with f degrees of freedom.
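As a quick numerical check of the percentile notation, the sketch below evaluates Z_c using only Python's standard library. The helper name `z_percentile` is illustrative and not part of the source.

```python
from statistics import NormalDist


def z_percentile(c: float) -> float:
    """Return Z_c, the (c * 100)th percentile of the standard Normal distribution."""
    return NormalDist(mu=0.0, sigma=1.0).inv_cdf(c)


print(round(z_percentile(0.975), 3))  # 1.96  (two-sided alpha = 0.05)
print(round(z_percentile(0.80), 3))   # 0.842 (80% power)
```

The standard library has no t-distribution quantile function; if SciPy is available, `scipy.stats.t.ppf(c, f)` is one way to obtain T_{f,c}.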
Sample Size Calculations for Precision in Confidence Interval Construction
- Calculations depend on the chosen significance level \alpha but not on power.
- Sample size required to construct a confidence interval of width w:
  - Known variance: n \geq 4Z_{1-\alpha/2}^2 \sigma^2/w^2
  - Unknown variance (using the t-distribution): solve for n in 2T_{n-1,1-\alpha/2} s_h/\sqrt{n} = w
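The known-variance formula can be evaluated directly. The following is a minimal sketch; the function name and the example values (sigma = 10, w = 5) are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_for_ci_width_known_variance(sigma: float, w: float, alpha: float = 0.05) -> int:
    """Smallest n with n >= 4 * Z_{1-alpha/2}^2 * sigma^2 / w^2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(4 * z**2 * sigma**2 / w**2)


# 95% confidence interval of total width 5 when sigma = 10
print(n_for_ci_width_known_variance(sigma=10, w=5))  # 62
```

The unknown-variance case has no closed form because n appears in the degrees of freedom; it is usually solved by increasing n until the width condition is met, which needs a t quantile function.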
Confidence Intervals for Binomial Proportions
- Sample size required to construct a confidence interval of width w for an unknown binomial proportion p:
  - n \geq 4Z_{1-\alpha/2}^2 p_h(1-p_h)/w^2
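A short sketch of the binomial formula follows, assuming p_h denotes a hypothesized (planning) value of the proportion. The function name is illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_for_proportion_ci(p_h: float, w: float, alpha: float = 0.05) -> int:
    """Smallest n with n >= 4 * Z_{1-alpha/2}^2 * p_h * (1 - p_h) / w^2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil(4 * z**2 * p_h * (1 - p_h) / w**2)


# 95% interval of total width 0.10 (plus or minus 5 percentage points)
print(n_for_proportion_ci(p_h=0.5, w=0.10))  # 385
print(n_for_proportion_ci(p_h=0.3, w=0.10))  # 323
```

Because p_h(1 - p_h) peaks at p_h = 0.5, planning with 0.5 gives the most conservative (largest) sample size.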
Sample Size Calculations for Hypothesis Tests: One Sample of Data
- A specific alternative hypothesis is required, because power is evaluated at that alternative.
Calculations for Continuous Data Regarding a Single Population Mean
- Hypotheses: H_0 : \mu = \mu_0 vs. H_1 : \mu = \mu_1
- Required sample size: n = (Z_{1-\alpha/2} + Z_{1-\beta})^2 \sigma^2/d^2 where d = \mu_1 - \mu_0
- For a one-sided test, replace Z_{1-\alpha/2} by Z_{1-\alpha}
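The single-mean formula, including the one-sided variant, might be coded as below. This is a sketch under the stated Normal approximation; the function name and example values are assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_one_sample_mean(mu0: float, mu1: float, sigma: float,
                      alpha: float = 0.05, power: float = 0.80,
                      two_sided: bool = True) -> int:
    """n = (Z_{1-alpha/2} + Z_{1-beta})^2 * sigma^2 / d^2, with d = mu1 - mu0."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(power)  # power = 1 - beta
    d = mu1 - mu0
    return ceil((z_alpha + z_beta) ** 2 * sigma**2 / d**2)


print(n_one_sample_mean(mu0=0, mu1=5, sigma=10))                   # 32
print(n_one_sample_mean(mu0=0, mu1=5, sigma=10, two_sided=False))  # 25
```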
Calculations for Binary Data Regarding a Single Population Proportion
- Hypotheses: H_0 : p = p_0 vs. H_1 : p = p_1
- Required sample size: n = [Z_{1-\alpha/2} \sqrt{p_0(1-p_0)} + Z_{1-\beta} \sqrt{p_1(1-p_1)}]^2/d^2 where d = p_1 - p_0
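For a single proportion, the null and alternative variances differ, so each Z term carries its own standard deviation. A minimal sketch, with an illustrative function name and example values:

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_one_sample_proportion(p0: float, p1: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """n = [Z_{1-alpha/2} sqrt(p0(1-p0)) + Z_{1-beta} sqrt(p1(1-p1))]^2 / d^2."""
    z = NormalDist().inv_cdf
    d = p1 - p0
    numerator = (z(1 - alpha / 2) * sqrt(p0 * (1 - p0))
                 + z(power) * sqrt(p1 * (1 - p1))) ** 2
    return ceil(numerator / d**2)


# Detect a shift from 50% to 60% with 80% power, two-sided alpha = 0.05
print(n_one_sample_proportion(p0=0.5, p1=0.6))  # 194
```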
Two-Stage Designs for a Single Population Proportion
- Sequentially enroll subjects in two stages, stopping early if evidence is not promising.
Sample Size Calculations for Hypothesis Tests: Paired Data
- Pairing may result from before-and-after treatment measurements or matched characteristics.
Calculations for Paired Continuous Data
- Hypotheses: H_0 : \mu_d = \mu_0 vs. H_1 : \mu_d = \mu_1
- Required sample size: n = (Z_{1-\alpha/2} + Z_{1-\beta})^2 \sigma_d^2/d^2 where d = \mu_1 - \mu_0
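The paired calculation has the same form as the one-sample case, applied to the within-pair differences. In the sketch below, `sigma_d` is the standard deviation of those differences and the result is a number of pairs; names and values are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_pairs_continuous(mu0: float, mu1: float, sigma_d: float,
                       alpha: float = 0.05, power: float = 0.80) -> int:
    """Number of pairs: (Z_{1-alpha/2} + Z_{1-beta})^2 * sigma_d^2 / d^2."""
    z = NormalDist().inv_cdf
    d = mu1 - mu0
    return ceil((z(1 - alpha / 2) + z(power)) ** 2 * sigma_d**2 / d**2)


# Mean before/after change of 2 units, SD of differences = 6
print(n_pairs_continuous(mu0=0, mu1=2, sigma_d=6))  # 71
```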
Calculations for Paired Binary Data
- Hypotheses: H_0 : p_{10} - p_{01} = 0 vs. H_1 : p_{10} - p_{01} = d
- Required sample size: n = [Z_{1-\alpha/2} \sqrt{q} + Z_{1-\beta} \sqrt{q - d^2}]^2/d^2, where q = p_{10} + p_{01} is the proportion of discordant pairs
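A sketch of the paired binary formula follows. It assumes q = p_{10} + p_{01} (the proportion of discordant pairs), which is what makes q - d^2 the variance of the paired difference; the function name and example values are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_pairs_binary(p10: float, p01: float,
                   alpha: float = 0.05, power: float = 0.80) -> int:
    """Number of pairs: [Z_{1-alpha/2} sqrt(q) + Z_{1-beta} sqrt(q - d^2)]^2 / d^2."""
    z = NormalDist().inv_cdf
    d = p10 - p01          # difference in discordant proportions under H1
    q = p10 + p01          # assumed: total proportion of discordant pairs
    numerator = (z(1 - alpha / 2) * sqrt(q) + z(power) * sqrt(q - d**2)) ** 2
    return ceil(numerator / d**2)


print(n_pairs_binary(p10=0.25, p01=0.10))  # 120
```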
Sample Size Calculations for Hypothesis Tests: Two Independent Samples
- Estimating the difference of two population means or proportions.
Calculations for Continuous Data With Equal Variances and Equal Sample Sizes
- Hypotheses: H_0 : \mu_Y - \mu_X = d_0 vs. H_1 : \mu_Y - \mu_X = d_1
- Required sample size in each group: n_e = 2(Z_{1-\alpha/2} + Z_{1-\beta})^2 \sigma_c^2/d^2 where d = d_1 - d_0
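The equal-variance, equal-size formula can be sketched as follows, reading \sigma_c as the common standard deviation. The function name and example values are assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_equal(sigma_c: float, d0: float, d1: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """n_e = 2 (Z_{1-alpha/2} + Z_{1-beta})^2 sigma_c^2 / d^2, with d = d1 - d0."""
    z = NormalDist().inv_cdf
    d = d1 - d0
    return ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 * sigma_c**2 / d**2)


# Difference of 5 units between groups, common SD = 10
print(n_per_group_equal(sigma_c=10, d0=0, d1=5))  # 63 per group
```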
Calculations for Continuous Data With Unequal Variances or Unequal Sample Sizes
- Unequal variances: n_e = (Z_{1-\alpha/2} + Z_{1-\beta})^2 (\sigma_X^2 + \sigma_Y^2)/d^2
- Unequal sample sizes: n_X = (Z_{1-\alpha/2} + Z_{1-\beta})^2 (\sigma_X^2 + \sigma_Y^2/l)/d^2 where l = n_Y/n_X
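Both cases can be handled by one function, since the unequal-variance formula is the unequal-size formula with l = 1. In this sketch the parameter `ratio` stands for l = n_Y/n_X; names and example values are illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_two_sample_unequal(sigma_x: float, sigma_y: float, d: float,
                         ratio: float = 1.0,
                         alpha: float = 0.05, power: float = 0.80) -> tuple[int, int]:
    """Return (n_X, n_Y) with n_Y = ratio * n_X, rounded up."""
    z = NormalDist().inv_cdf
    n_x = ceil((z(1 - alpha / 2) + z(power)) ** 2
               * (sigma_x**2 + sigma_y**2 / ratio) / d**2)
    n_y = ceil(ratio * n_x)
    return n_x, n_y


print(n_two_sample_unequal(sigma_x=8, sigma_y=12, d=5))             # (66, 66)
print(n_two_sample_unequal(sigma_x=8, sigma_y=12, d=5, ratio=2.0))  # (43, 86)
```

In the second call, allocating twice as many subjects to the higher-variance group lowers n_X but raises the total from 132 to 129 only slightly, which illustrates why allocation ratios are usually chosen for cost or feasibility reasons.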
Calculations for Two Independent Samples of Binary Data
- Hypotheses: H_0 : p_B - p_A = 0 vs. H_1 : p_B - p_A = d
- Required sample size in each group: n_e = [Z_{1-\alpha/2} \sqrt{u_0} + Z_{1-\beta} \sqrt{u_1}]^2/d^2
  - u_0 = 2q(1-q)
  - u_1 = q_A(1-q_A) + q_B(1-q_B)
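The two-proportion formula might be sketched as below. The source does not define q, q_A, or q_B, so this example assumes q_A and q_B are the hypothesized group proportions under H_1, q = (q_A + q_B)/2 is their average, and d = q_B - q_A. The function name is illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_binary(q_a: float, q_b: float,
                       alpha: float = 0.05, power: float = 0.80) -> int:
    """n_e = [Z_{1-alpha/2} sqrt(u0) + Z_{1-beta} sqrt(u1)]^2 / d^2."""
    z = NormalDist().inv_cdf
    d = q_b - q_a
    q = (q_a + q_b) / 2                      # assumed: average proportion under H0
    u0 = 2 * q * (1 - q)                     # variance term under H0
    u1 = q_a * (1 - q_a) + q_b * (1 - q_b)   # variance term under H1
    numerator = (z(1 - alpha / 2) * sqrt(u0) + z(power) * sqrt(u1)) ** 2
    return ceil(numerator / d**2)


# 20% versus 40% response, 80% power, two-sided alpha = 0.05
print(n_per_group_binary(q_a=0.2, q_b=0.4))  # 82 per group
```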
Advanced Methods and Other Topics
- Advanced methods and alternative statistics exist for more accurate sample size calculations.
Several Advanced Study Designs
- Multiarm clinical trials: Bonferroni correction.
- Group sequential trials: Interim analyses.
- Survival analysis: Kaplan-Meier, Cox regression.
- Group randomized trials (GRTs): Intraclass correlation (ICC).
- Generalized linear models (GLMs): Complex calculations, simplified methods for approximation.
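Two of these adjustments reduce to simple arithmetic on top of the two-sample formula. The sketch below uses common textbook approximations that are not spelled out in the source: a Bonferroni-adjusted significance level of alpha/k for k comparisons, and a design effect of 1 + (m - 1) * ICC for clusters of size m in a group randomized trial. All names and values are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(sigma_c: float, d: float,
                alpha: float = 0.05, power: float = 0.80) -> float:
    """Unrounded two-sample size: 2 (Z_{1-alpha/2} + Z_{1-beta})^2 sigma_c^2 / d^2."""
    z = NormalDist().inv_cdf
    return 2 * (z(1 - alpha / 2) + z(power)) ** 2 * sigma_c**2 / d**2


def bonferroni_alpha(alpha: float, k: int) -> float:
    """Per-comparison significance level for k comparisons (assumed adjustment)."""
    return alpha / k


def design_effect(m: int, icc: float) -> float:
    """Variance inflation for clusters of size m (assumed approximation)."""
    return 1 + (m - 1) * icc


# Three pairwise comparisons in a multiarm trial
print(ceil(n_per_group(10, 5, alpha=bonferroni_alpha(0.05, 3))))  # 84

# Group randomized trial: clusters of 20 subjects, ICC = 0.05
print(ceil(n_per_group(10, 5) * design_effect(20, 0.05)))         # 123
```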
Retention of Subjects
- Plan for a 10% to 20% rate of retention loss.
- Inflate the calculated sample size by a factor of 1/(1-r)^2
Statistical Computing
- Statistical software facilitates rapid calculations and numerical methods.
- Common uses include data simulation and graphing power curves.
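As one illustration of simulation-based power, the sketch below estimates the power of a two-sided one-sample z-test (known sigma, H_0: mean = 0) by Monte Carlo and compares it with the Normal-approximation value. The setup, function names, and parameter values are assumptions for illustration, and the printed table is a text stand-in for a power graph.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()


def analytic_power(n: int, d: float, sigma: float, alpha: float = 0.05) -> float:
    """Power of a two-sided one-sample z-test when the true mean is d."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(d) * sqrt(n) / sigma
    # Probability of falling in either rejection region under the alternative
    return STD_NORMAL.cdf(shift - z_alpha) + STD_NORMAL.cdf(-shift - z_alpha)


def simulated_power(n: int, d: float, sigma: float, alpha: float = 0.05,
                    reps: int = 2000, seed: int = 1) -> float:
    """Fraction of simulated studies that reject H0: mean = 0."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(d, sigma) for _ in range(n)]
        z_stat = fmean(sample) * sqrt(n) / sigma
        if abs(z_stat) > z_alpha:
            rejections += 1
    return rejections / reps


# Power as a function of sample size for d = 5, sigma = 10
for n in (10, 20, 32, 50):
    print(n, round(analytic_power(n, 5, 10), 3), round(simulated_power(n, 5, 10), 3))
```

Simulation becomes the practical route when no closed-form power formula exists for the planned analysis; the estimate's precision is controlled by `reps`.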
Conclusion
- Statistical power is key in sample size calculations.
- Collaboration between statisticians and researchers is rewarding.