Power and Sample Size Calculations

Introduction

Power and sample size calculations are crucial for research studies.
Statistical power is the probability of detecting a statistically significant effect when a true effect exists.
It's standard practice to report sample size calculations in research publications.

Collaboration between statisticians and researchers is essential for sample size calculation.
Key questions to consider:
- Study groups or treatment arms.
- Expected data distribution.
- Anticipated means, medians, and ranges.
- Variation in measurements.
- Costs.
- Maximum numbers of subjects and sites.
- Consequences of Type I and Type II errors.
Type I error: Rejecting the null hypothesis when it is true (probability \alpha).
Type II error: Failing to reject the null hypothesis when it is false (probability \beta).
Statistical power (1 - \beta): Probability of rejecting the null hypothesis when the alternative hypothesis is true.
At least 80% power is generally considered desirable.
Sample size calculations can:
- Calculate power for a fixed sample size.
- Calculate the required sample size for a fixed power.

Greek letters represent population parameters (e.g., population mean \mu, variance \sigma^2).
Latin letters represent sample statistics (e.g., sample mean \bar{x}, variance s^2).

The Normal distribution (bell-shaped curve) is often used to model continuous data.
The standard Normal distribution has a mean of zero and a variance of one.
Z_c represents the (c \cdot 100)th percentile of the standard Normal distribution.
The t-distribution is used when the population variance is unknown and must be estimated from the sample.
T_{f,c} represents the (c \cdot 100)th percentile of the t-distribution with f degrees of freedom.
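
As a rough illustration of the Z_c notation, the sketch below looks up standard Normal percentiles with Python's standard library. The helper name z_quantile is hypothetical and not part of the original notes. The standard library has no t-distribution, so T_{f,c} is not covered here; a statistics package (for example scipy.stats.t.ppf in SciPy) would be one way to obtain it.

```python
from statistics import NormalDist


def z_quantile(c: float) -> float:
    """Return Z_c, the (c * 100)th percentile of the standard Normal distribution."""
    # NormalDist(0, 1) is the standard Normal; inv_cdf is its quantile function.
    return NormalDist(mu=0.0, sigma=1.0).inv_cdf(c)


# Percentiles that appear repeatedly in sample size formulas.
z_975 = z_quantile(0.975)  # Z_{1-alpha/2} for alpha = 0.05, about 1.96
z_80 = z_quantile(0.80)    # Z_{1-beta} for 80% power, about 0.84
```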

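Sample size required to construct a confidence interval of width w for a mean:
- Known variance: n \geq 4 Z_{1-\alpha/2}^2 \sigma^2/w^2
- Unknown variance (using t-distribution): solve for n in 2 T_{n-1,1-\alpha/2} s_h/\sqrt{n} = w, where s_h is the anticipated sample standard deviation. Because n also sets the degrees of freedom, this equation is solved iteratively.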
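
The known-variance bound can be evaluated directly. The following is a minimal sketch under that assumption only; the function name and the planning values (sigma = 10, w = 5) are hypothetical, and the unknown-variance case is left out because it needs t-distribution percentiles.

```python
import math
from statistics import NormalDist


def n_for_mean_ci(sigma: float, width: float, alpha: float = 0.05) -> int:
    """Smallest integer n with n >= 4 * Z_{1-alpha/2}^2 * sigma^2 / w^2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Round up: a fractional subject cannot be enrolled.
    return math.ceil(4 * z**2 * sigma**2 / width**2)


# Hypothetical planning values: sigma = 10, total interval width w = 5.
n = n_for_mean_ci(sigma=10, width=5)  # 62
```

Halving the width roughly quadruples the required n, since w enters the formula squared.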

Sample size required to construct a confidence interval of width w for an unknown binomial proportion p:
- n \geq 4 Z_{1-\alpha/2}^2 p_h(1-p_h)/w^2, where p_h is the anticipated value of p
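
A short sketch of the proportion bound follows. The function name and example values are hypothetical. Using 0.5 for the anticipated proportion gives the largest possible n, which is a common conservative choice when little is known about p.

```python
import math
from statistics import NormalDist


def n_for_proportion_ci(p_anticipated: float, width: float, alpha: float = 0.05) -> int:
    """Smallest integer n with n >= 4 * Z_{1-alpha/2}^2 * p(1 - p) / w^2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    variance = p_anticipated * (1 - p_anticipated)
    return math.ceil(4 * z**2 * variance / width**2)


# Hypothetical example: total width 0.10 (plus or minus 5 percentage points).
n_conservative = n_for_proportion_ci(0.5, 0.10)  # 385
n_informed = n_for_proportion_ci(0.2, 0.10)      # 246
```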

Hypotheses: H_0 : \mu = \mu_0 vs. H_1 : \mu = \mu_1
Required sample size: n = (Z_{1-\alpha/2} + Z_{1-\beta})^2 \sigma^2/d^2 where d = \mu_1 - \mu_0
For a one-sided test, replace Z_{1-\alpha/2} by Z_{1-\alpha}
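
The one-sample formula can be used in both directions: to find n for a target power, or to find the approximate power for a fixed n. The sketch below shows both under the Normal approximation with known \sigma. The function names and planning values are hypothetical, and the power function ignores the negligible probability of rejecting in the wrong direction.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def n_one_sample_mean(sigma, d, alpha=0.05, power=0.80, two_sided=True):
    """n = (Z_{1-alpha/2} + Z_{1-beta})^2 * sigma^2 / d^2, rounded up."""
    tail = alpha / 2 if two_sided else alpha
    z_alpha = _STD_NORMAL.inv_cdf(1 - tail)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return math.ceil((z_alpha + z_beta) ** 2 * sigma**2 / d**2)


def power_one_sample_mean(n, sigma, d, alpha=0.05, two_sided=True):
    """Approximate power for fixed n: solve the same formula for 1 - beta."""
    tail = alpha / 2 if two_sided else alpha
    z_alpha = _STD_NORMAL.inv_cdf(1 - tail)
    return _STD_NORMAL.cdf(abs(d) * math.sqrt(n) / sigma - z_alpha)


# Hypothetical planning values: sigma = 10, difference to detect d = 5.
n_two_sided = n_one_sample_mean(sigma=10, d=5)                   # 32
n_one_sided = n_one_sample_mean(sigma=10, d=5, two_sided=False)  # 25
achieved = power_one_sample_mean(n_two_sided, sigma=10, d=5)     # slightly above 0.80
```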

Hypotheses: H_0 : p = p_0 vs. H_1 : p = p_1
Required sample size: n = [Z_{1-\alpha/2} \sqrt{p_0(1-p_0)} + Z_{1-\beta} \sqrt{p_1(1-p_1)}]^2/d^2 where d = p_1 - p_0
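
For proportions, the variance differs under the null and alternative hypotheses, which is why each percentile is weighted by its own standard deviation. A minimal sketch, with a hypothetical function name and example values:

```python
import math
from statistics import NormalDist


def n_one_sample_proportion(p0, p1, alpha=0.05, power=0.80):
    """n = [Z_{1-alpha/2} sqrt(p0(1-p0)) + Z_{1-beta} sqrt(p1(1-p1))]^2 / d^2."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    d = p1 - p0
    numerator = (z_alpha * math.sqrt(p0 * (1 - p0))
                 + z_beta * math.sqrt(p1 * (1 - p1))) ** 2
    return math.ceil(numerator / d**2)


# Hypothetical example: null proportion 0.5, alternative 0.6.
n = n_one_sample_proportion(0.5, 0.6)  # 194
```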

Two-stage designs enroll subjects sequentially in two stages, stopping early if the first-stage evidence is not promising.

Pairing may result from before-and-after treatment measurements or matched characteristics.

Hypotheses: H_0 : \mu_d = \mu_0 vs. H_1 : \mu_d = \mu_1
Required sample size: n = (Z_{1-\alpha/2} + Z_{1-\beta})^2 \sigma_d^2/d^2 where d = \mu_1 - \mu_0
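
In practice the standard deviation of the paired differences is often not known directly. One common route, which is an assumption added here and not stated in the notes, is to derive it from the two measurement standard deviations and their correlation \rho using Var(X_1 - X_2) = \sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2. The function names and values below are hypothetical.

```python
import math
from statistics import NormalDist


def sd_of_differences(sigma_1, sigma_2, rho):
    """Standard deviation of paired differences given the within-pair correlation rho."""
    return math.sqrt(sigma_1**2 + sigma_2**2 - 2 * rho * sigma_1 * sigma_2)


def n_pairs(sigma_d, d, alpha=0.05, power=0.80):
    """Number of pairs: (Z_{1-alpha/2} + Z_{1-beta})^2 * sigma_d^2 / d^2, rounded up."""
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    return math.ceil(z_sum**2 * sigma_d**2 / d**2)


# Hypothetical values: both measurements have SD 10; detect a mean difference of 5.
n_moderate = n_pairs(sd_of_differences(10, 10, rho=0.5), d=5)  # 32 pairs
n_strong = n_pairs(sd_of_differences(10, 10, rho=0.8), d=5)    # 13 pairs
```

Under these assumptions, stronger within-pair correlation shrinks \sigma_d and therefore the number of pairs needed.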

Hypotheses: H_0 : p_{10} - p_{01} = 0 vs. H_1 : p_{10} - p_{01} = d
- given p_{10} + p_{01} = q
Required sample size: n = [Z_{1-\alpha/2} \sqrt{q} + Z_{1-\beta} \sqrt{q - d^2}]^2/d^2
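
The paired-proportion formula depends only on the total discordant probability q and the difference d. A minimal sketch follows; the function name, the input check, and the example values are hypothetical additions.

```python
import math
from statistics import NormalDist


def n_paired_proportions(q, d, alpha=0.05, power=0.80):
    """Number of pairs: [Z_{1-alpha/2} sqrt(q) + Z_{1-beta} sqrt(q - d^2)]^2 / d^2."""
    # p10 and p01 are non-negative and sum to q, so |p10 - p01| cannot exceed q.
    if not 0 < abs(d) <= q <= 1:
        raise ValueError("require 0 < |d| <= q <= 1")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    numerator = (z_alpha * math.sqrt(q) + z_beta * math.sqrt(q - d**2)) ** 2
    return math.ceil(numerator / d**2)


# Hypothetical example: 30% of pairs discordant, difference of 0.10 to detect.
n = n_paired_proportions(q=0.3, d=0.1)  # 234 pairs
```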

Hypotheses: H_0 : \mu_Y - \mu_X = d_0 vs. H_1 : \mu_Y - \mu_X = d_1
Required sample size in each group: n_e = 2(Z_{1-\alpha/2} + Z_{1-\beta})^2 \sigma_c^2/d^2 where d = d_1 - d_0

Unequal variances: n_e = (Z_{1-\alpha/2} + Z_{1-\beta})^2 (\sigma_X^2 + \sigma_Y^2)/d^2
Unequal sample sizes: n_X = (Z_{1-\alpha/2} + Z_{1-\beta})^2 (\sigma_X^2 + \sigma_Y^2/l)/d^2 where l = n_Y/n_X
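
The three two-sample formulas for means are special cases of the last one: setting l = 1 gives the unequal-variance formula, and setting both standard deviations to a common value as well gives the equal-variance formula. The sketch below relies on that observation; the function name and planning values are hypothetical.

```python
import math
from statistics import NormalDist


def n_two_sample_means(sigma_x, sigma_y, d, ratio=1.0, alpha=0.05, power=0.80):
    """Return (n_X, n_Y), where ratio is l = n_Y / n_X.

    n_X = (Z_{1-alpha/2} + Z_{1-beta})^2 * (sigma_X^2 + sigma_Y^2 / l) / d^2
    """
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    n_x = math.ceil(z_sum**2 * (sigma_x**2 + sigma_y**2 / ratio) / d**2)
    n_y = math.ceil(ratio * n_x)
    return n_x, n_y


# Hypothetical planning values, detecting a difference d = 5.
equal_var = n_two_sample_means(10, 10, d=5)              # (63, 63)
unequal_var = n_two_sample_means(8, 12, d=5)             # (66, 66)
unequal_size = n_two_sample_means(8, 12, d=5, ratio=2)   # (43, 86)
```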

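Hypotheses: H_0 : p_B - p_A = 0 vs. H_1 : p_B - p_A = d
Required sample size in each group: n_e = [Z_{1-\alpha/2} \sqrt{u_0} + Z_{1-\beta} \sqrt{u_1}]^2/d^2
- u_0 = 2q(1-q)
- u_1 = q_A(1-q_A) + q_B(1-q_B)
Here q_A and q_B are the anticipated proportions in groups A and B, and q = (q_A + q_B)/2 is their average.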
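
A sketch of the two-proportion calculation follows. It assumes that q_A and q_B are the anticipated group proportions and that q is their average, which is the usual convention but is not spelled out in the notes. The function name and example values are hypothetical.

```python
import math
from statistics import NormalDist


def n_two_sample_proportions(q_a, q_b, alpha=0.05, power=0.80):
    """Per-group n: [Z_{1-alpha/2} sqrt(u_0) + Z_{1-beta} sqrt(u_1)]^2 / d^2."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    d = q_b - q_a
    q = (q_a + q_b) / 2                      # assumed: average proportion under H_0
    u_0 = 2 * q * (1 - q)                    # variance term under the null
    u_1 = q_a * (1 - q_a) + q_b * (1 - q_b)  # variance term under the alternative
    numerator = (z_alpha * math.sqrt(u_0) + z_beta * math.sqrt(u_1)) ** 2
    return math.ceil(numerator / d**2)


# Hypothetical example: 20% response in group A versus 30% in group B.
n_per_group = n_two_sample_proportions(0.2, 0.3)  # 294
```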

Advanced methods and alternative statistics exist for more accurate sample size calculations.

Multiarm clinical trials: Bonferroni correction.
Group sequential trials: Interim analyses.
Survival analysis: Kaplan-Meier, Cox regression.
Group randomized trials (GRTs): Intraclass correlation (ICC).
Generalized linear models (GLMs): Complex calculations, simplified methods for approximation.
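
Two of these adjustments are simple enough to sketch. The Bonferroni correction divides \alpha by the number of comparisons before it enters a sample size formula. For group randomized trials, a widely used approximation, added here as an assumption and not taken from the notes, inflates an individually randomized sample size by the design effect 1 + (m - 1) ICC, where m is a common cluster size. All function names and values below are hypothetical.

```python
import math


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under the Bonferroni correction."""
    return alpha / n_comparisons


def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc


def inflate_for_clustering(n_individual, cluster_size, icc):
    """Scale an individually randomized sample size by the design effect."""
    return math.ceil(n_individual * design_effect(cluster_size, icc))


# Hypothetical example: three comparisons at an overall alpha of 0.05.
alpha_each = bonferroni_alpha(0.05, 3)

# Hypothetical example: 63 per group, clusters of 20 subjects, ICC = 0.05.
n_clustered = inflate_for_clustering(63, cluster_size=20, icc=0.05)  # 123
```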