AP Statistics Reference Guide: Confidence Intervals and Significance Tests

One-Sample and Two-Sample Confidence Intervals for Proportions

  • One-Sample z-Interval for a Proportion
    * Statistic: The sample proportion, $\hat{p}$.
    * Parameter: The population proportion, $p$.
    * Conditions for Inference:
        * Randomness: The data must come from a random sample.
        * Independence (10% Rule): The sample size $n$ must be at most 10% of the population size ($n \le 0.10N$).
        * Large Counts: The numbers of successes and failures must both be at least 10: $n\hat{p} \ge 10$ and $n(1 - \hat{p}) \ge 10$.
    * Formula: $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
    * Calculator Command: 1-PropZInt
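The one-sample z-interval can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for 1-PropZInt: the function name and the counts (120 successes in 200 trials) are hypothetical, and the critical value $z^*$ comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for a one-sample z-interval for a proportion."""
    failures = n - successes
    # Large Counts: observed successes and failures must both be at least 10.
    if successes < 10 or failures < 10:
        raise ValueError("Large Counts condition not met")
    p_hat = successes / n
    # z* leaves (1 - confidence) / 2 in each tail of the standard normal curve.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical data: 120 successes in a random sample of 200.
lower, upper = one_prop_z_interval(successes=120, n=200)
print(f"95% CI for p: ({lower:.4f}, {upper:.4f})")
```

The Randomness and 10% conditions depend on how the data were collected, so they still have to be checked by hand.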

  • Two-Sample z-Interval for a Difference in Proportions
    * Statistic: The difference between two sample proportions, $\hat{p}_1 - \hat{p}_2$.
    * Parameter: The difference between two population proportions, $p_1 - p_2$.
    * Conditions for Inference:
        * Randomness/Independence: Requires independent random samples or a randomized experiment.
        * Independence (10% Rule): Each sample must be at most 10% of its population ($n_1 \le 0.10N_1$ and $n_2 \le 0.10N_2$).
        * Large Counts: Successes and failures in both groups must be at least 10: $n_1\hat{p}_1 \ge 10$, $n_1(1 - \hat{p}_1) \ge 10$, $n_2\hat{p}_2 \ge 10$, and $n_2(1 - \hat{p}_2) \ge 10$.
    * Formula: $(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$
    * Calculator Command: 2-PropZInt
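As a rough sketch of the two-sample interval, the code below applies the unpooled standard error from the formula. The function name and the counts (60 of 100 versus 45 of 100) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Return (lower, upper) for a z-interval for p1 - p2."""
    # Large Counts: successes and failures in both groups must be at least 10.
    if min(x1, n1 - x1, x2, n2 - x2) < 10:
        raise ValueError("Large Counts condition not met")
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # The interval uses each sample's own proportion (no pooling).
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


# Hypothetical data: 60 of 100 successes in group 1, 45 of 100 in group 2.
lower, upper = two_prop_z_interval(60, 100, 45, 100)
print(f"95% CI for p1 - p2: ({lower:.4f}, {upper:.4f})")
```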

Confidence Intervals for Means and Slope

  • One-Sample t-Interval for a Mean (Including Paired t-Interval)
    * Statistic: The sample mean, $\bar{x}$.
    * Parameter: The population mean, $\mu$.
    * Conditions for Inference:
        * Randomness: Data must come from a random sample or a randomized experiment.
        * Independence (10% Rule): Sample size must satisfy $n \le 0.10N$.
        * Normality/Large Sample: The population distribution must be approximately normal (either stated in the problem, or the sample data show no strong skew or outliers) OR the sample size must be at least 30 ($n \ge 30$).
    * Formula: $\bar{x} \pm t^* \frac{s}{\sqrt{n}}$
    * Degrees of Freedom ($df$): $df = n - 1$
    * Calculator Command: TInterval
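A small sketch of the t-interval arithmetic follows. Python's standard library has no t distribution, so this version assumes the critical value $t^*$ is looked up in a t-table and passed in. The data values and function name are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t_interval(data, t_star):
    """Return (lower, upper) for x-bar +/- t* * s / sqrt(n)."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    margin = t_star * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical sample of n = 10 measurements, so df = n - 1 = 9.
sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
T_STAR_DF9_95 = 2.262  # table value for 95% confidence with df = 9
lower, upper = one_sample_t_interval(sample, T_STAR_DF9_95)
print(f"95% CI for mu: ({lower:.3f}, {upper:.3f})")
```

For paired data, the same function applies to the list of within-pair differences.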

  • Two-Sample t-Interval for a Difference in Means
    * Statistic: The difference between two sample means, $\bar{x}_1 - \bar{x}_2$.
    * Parameter: The difference between two population means, $\mu_1 - \mu_2$.
    * Conditions for Inference:
        * Randomness/Independence: Independent random samples or a randomized experiment.
        * Independence (10% Rule): For both groups, $n_1 \le 0.10N_1$ and $n_2 \le 0.10N_2$.
        * Normality/Large Sample: For each group, the population distribution must be approximately normal (stated in the problem, or the sample data show no strong skew or outliers) OR the group's sample size must be at least 30 ($n_1 \ge 30$ and $n_2 \ge 30$).
    * Formula: $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
    * Degrees of Freedom ($df$): Conservatively, the smaller of $n_1 - 1$ and $n_2 - 1$; technology gives a more accurate value.
    * Calculator Command: 2-SampTInt
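The two options for degrees of freedom can be compared directly. The sketch below assumes that "technology" means the Welch-Satterthwaite approximation, which is the usual unpooled formula but is not stated in this guide. The summary statistics are hypothetical.

```python
def conservative_df(n1, n2):
    """Smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)


def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximation (assumed to match technology)."""
    v1 = s1 ** 2 / n1  # estimated variance of x-bar 1
    v2 = s2 ** 2 / n2  # estimated variance of x-bar 2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# Hypothetical summary statistics: s1 = 4 with n1 = 16, s2 = 6 with n2 = 25.
df_small = conservative_df(16, 25)
df_welch = welch_df(4, 16, 6, 25)
print(f"conservative df = {df_small}, Welch df = {df_welch:.2f}")
```

The conservative choice never exceeds the Welch value, so it gives a larger $t^*$ and a wider interval.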

  • t-Interval for a Slope
    * Statistic: The sample slope, $b$.
    * Parameter: The population slope, $\beta$.
    * Conditions for Inference:
        * Linearity: The relationship between $x$ and $y$ must be fairly linear.
        * Independence (10% Rule): Sample size must satisfy $n \le 0.10N$.
        * Normality: For each value of $x$, the distribution of $y$ must be approximately normal.
        * Equal Variance: For each value of $x$, the variable $y$ must have the same standard deviation.
        * Randomness: Data must come from a random sample or randomized experiment.
    * Formula: $b \pm t^* SE_b$
    * Degrees of Freedom ($df$): $df = n - 2$
    * Calculator Command: LinRegTInt

Significance Tests for Proportions

  • One-Sample z-Test for a Proportion
    * Null Hypothesis ($H_0$): $H_0: p = p_0$
    * Conditions for Inference:
        * Randomness: Random sample.
        * Independence (10% Rule): $n \le 0.10N$.
        * Large Counts: Based on the null value $p_0$: $np_0 \ge 10$ and $n(1 - p_0) \ge 10$.
    * Test Statistic Formula: $z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}$
    * Calculator Command: 1-PropZTest
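The test statistic is easy to check by hand. The sketch below computes $z$ and a two-sided P-value from the standard normal curve. The P-value step, the function name, and the data (120 successes in 200 trials against $p_0 = 0.5$) are illustrative additions, not part of the summary above.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_test(successes, n, p0):
    """Return (z, two-sided P-value) for H0: p = p0."""
    # Large Counts uses the null value p0, not p-hat.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("Large Counts condition not met")
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 120 successes in 200 trials, testing H0: p = 0.5.
z, p_value = one_prop_z_test(120, 200, 0.5)
print(f"z = {z:.3f}, two-sided P-value = {p_value:.4f}")
```

For a one-sided alternative, halve the two-sided P-value when the statistic falls on the side named by $H_a$.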

  • Two-Sample z-Test for a Difference in Proportions
    * Null Hypothesis ($H_0$): $H_0: p_1 - p_2 = 0$
    * Conditions for Inference:
        * Randomness/Independence: Independent random samples or a randomized experiment.
        * Independence (10% Rule): $n_1 \le 0.10N_1$ and $n_2 \le 0.10N_2$.
        * Large Counts: Based on the pooled proportion $\hat{p}_c = \frac{x_1 + x_2}{n_1 + n_2}$: $n_1\hat{p}_c \ge 10$, $n_1(1 - \hat{p}_c) \ge 10$, $n_2\hat{p}_c \ge 10$, and $n_2(1 - \hat{p}_c) \ge 10$.
    * Test Statistic Formula: $z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\frac{\hat{p}_c(1 - \hat{p}_c)}{n_1} + \frac{\hat{p}_c(1 - \hat{p}_c)}{n_2}}}$
    * Calculator Command: 2-PropZTest
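The pooled proportion is the main difference from the two-sample interval, so a short sketch may help. The function name and counts are hypothetical, and the two-sided P-value step is an addition for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2):
    """Return (z, two-sided P-value, pooled proportion) for H0: p1 - p2 = 0."""
    p_c = (x1 + x2) / (n1 + n2)  # pooled proportion under H0
    # Large Counts for the test is checked with the pooled proportion.
    if min(n1 * p_c, n1 * (1 - p_c), n2 * p_c, n2 * (1 - p_c)) < 10:
        raise ValueError("Large Counts condition not met")
    se = sqrt(p_c * (1 - p_c) / n1 + p_c * (1 - p_c) / n2)
    z = ((x1 / n1 - x2 / n2) - 0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_c


# Hypothetical data: 60 of 100 successes in group 1, 45 of 100 in group 2.
z, p_value, p_c = two_prop_z_test(60, 100, 45, 100)
print(f"pooled p-hat = {p_c:.3f}, z = {z:.3f}, P-value = {p_value:.4f}")
```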

Significance Tests for Means and Slope

  • One-Sample t-Test for a Mean (Including Paired t-Test)
    * Null Hypothesis ($H_0$): $H_0: \mu = \mu_0$
    * Conditions for Inference:
        * Randomness: Random sample or randomized experiment.
        * Independence (10% Rule): $n \le 0.10N$.
        * Normality/Large Sample: The population is approximately normal (stated in the problem, or the sample data show no strong skew or outliers) OR $n \ge 30$.
    * Test Statistic Formula: $t = \frac{\bar{x} - \mu_0}{\frac{s}{\sqrt{n}}}$
    * Degrees of Freedom ($df$): $df = n - 1$
    * Calculator Command: T-Test
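For paired data, the one-sample t-test is applied to the within-pair differences. The sketch below shows that reduction with made-up before/after scores. It stops at the test statistic and $df$, since the P-value needs a t-table or calculator.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t_statistic(data, mu_0):
    """Return (t, df) for H0: mu = mu_0."""
    n = len(data)
    t = (mean(data) - mu_0) / (stdev(data) / sqrt(n))
    return t, n - 1


# Hypothetical paired scores for the same 10 subjects.
before = [70, 68, 75, 80, 66, 72, 77, 69, 74, 71]
after = [72, 73, 76, 84, 69, 78, 79, 74, 78, 74]
# Paired procedure: analyze the differences as a single sample.
differences = [a - b for a, b in zip(after, before)]
t, df = one_sample_t_statistic(differences, mu_0=0)
print(f"mean difference = {mean(differences)}, t = {t:.2f}, df = {df}")
```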

  • Two-Sample t-Test for a Difference in Means
    * Null Hypothesis ($H_0$): $H_0: \mu_1 - \mu_2 = 0$
    * Conditions for Inference:
        * Randomness/Independence: Independent random samples or randomized experiment.
        * Independence (10% Rule): $n_1 \le 0.10N_1$ and $n_2 \le 0.10N_2$.
        * Normality/Large Sample: For each group, the population is approximately normal (stated in the problem, or the sample data show no strong skew or outliers) OR the group's sample size is at least 30.
    * Test Statistic Formula: $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$, where $\mu_1 - \mu_2 = 0$ under $H_0$.
    * Degrees of Freedom ($df$): Smaller of $n_1 - 1$ and $n_2 - 1$, or calculated by technology.
    * Calculator Command: 2-SampTTest

  • t-Test for a Slope
    * Null Hypothesis ($H_0$): $H_0: \beta = \beta_0$
    * Conditions for Inference:
        * Linearity: The relationship between $x$ and $y$ is linear.
        * Independence (10% Rule): $n \le 0.10N$.
        * Normality: The distribution of $y$ is approximately normal for each value of $x$.
        * Equal Variance: The standard deviation of $y$ is the same for all values of $x$.
        * Randomness: Random sample or randomized experiment.
    * Test Statistic Formula: $t = \frac{b - \beta_0}{SE_b}$
    * Degrees of Freedom ($df$): $df = n - 2$
    * Calculator Command: LinRegTTest
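The guide leaves $SE_b$ to the calculator. As a sketch of where it comes from, the code below uses the standard least-squares result $SE_b = s / \sqrt{\sum (x_i - \bar{x})^2}$, with $s$ the standard deviation of the residuals. That formula is not stated above and is brought in as an assumption. The data and function name are hypothetical.

```python
from math import sqrt


def slope_t_test(x, y, beta_0=0.0):
    """Return (b, SE_b, t, df) for H0: beta = beta_0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx            # least-squares slope
    a = y_bar - b * x_bar    # least-squares intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = sqrt(sse / (n - 2))  # standard deviation of the residuals
    se_b = s / sqrt(sxx)
    t = (b - beta_0) / se_b
    return b, se_b, t, n - 2


# Hypothetical data with a strong linear trend.
x_values = [1, 2, 3, 4, 5]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1]
b, se_b, t, df = slope_t_test(x_values, y_values)
print(f"b = {b:.3f}, SE_b = {se_b:.4f}, t = {t:.2f}, df = {df}")
```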

Chi-Square Tests

  • Chi-Square ($\chi^2$) Test for Goodness-of-Fit
    * Hypotheses:
        * Null Hypothesis ($H_0$): The claimed distribution of the categorical variable is correct.
        * Alternative Hypothesis ($H_a$): The claimed distribution of the categorical variable is incorrect.
    * Conditions for Inference:
        * Randomness: Data come from a random sample or randomized experiment.
        * Independence (10% Rule): $n \le 0.10N$.
        * Expected Counts: All expected counts must be at least 5.
    * Formula: $\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$
    * Degrees of Freedom ($df$): $df = \text{number of categories} - 1$
    * Calculator Command: χ²GOF-Test
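A short sketch of the goodness-of-fit calculation: expected counts are the sample size times each claimed proportion, and the statistic sums the scaled squared gaps. The observed counts and the claim of five equally likely categories are invented for illustration.

```python
def chi_square_gof(observed, claimed_proportions):
    """Return (chi2, df, expected) for a goodness-of-fit test."""
    n = sum(observed)
    expected = [n * p for p in claimed_proportions]
    # Expected Counts condition: every expected count is at least 5.
    if min(expected) < 5:
        raise ValueError("Expected Counts condition not met")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return chi2, df, expected


# Hypothetical data: 100 observations, H0 claims five equally likely categories.
observed_counts = [18, 22, 20, 25, 15]
chi2, df, expected = chi_square_gof(observed_counts, [0.2] * 5)
print(f"chi-square = {chi2:.2f}, df = {df}")
```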

  • Chi-Square ($\chi^2$) Test for Homogeneity
    * Hypotheses:
        * Null Hypothesis ($H_0$): There is no difference in the distribution of the categorical variable across populations or treatments.
        * Alternative Hypothesis ($H_a$): There is a difference in the distribution of the categorical variable across populations or treatments.
    * Conditions for Inference:
        * Randomness: Random samples from each population or a randomized experiment.
        * Independence (10% Rule): $n \le 0.10N$.
        * Expected Counts: All expected counts must be at least 5.
    * Formula: $\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$
    * Degrees of Freedom ($df$): $df = (\text{number of rows} - 1) \times (\text{number of columns} - 1)$
    * Calculator Command: χ²-Test

  • Chi-Square ($\chi^2$) Test for Independence
    * Hypotheses:
        * Null Hypothesis ($H_0$): There is no association between two categorical variables in a given population (i.e., the variables are independent).
        * Alternative Hypothesis ($H_a$): Two categorical variables in a population are associated (i.e., the variables are dependent).
    * Conditions for Inference:
        * Randomness: Data come from a random sample or randomized experiment.
        * Independence (10% Rule): $n \le 0.10N$.
        * Expected Counts: All expected counts must be at least 5.
    * Formula: $\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$
    * Degrees of Freedom ($df$): $df = (\text{number of rows} - 1) \times (\text{number of columns} - 1)$
    * Calculator Command: χ²-Test (the same command used for the homogeneity test)
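The homogeneity and independence tests share the same arithmetic on a two-way table. The sketch below assumes the standard expected-count rule, (row total × column total) / table total, which this guide does not spell out. The 2 × 3 table of counts is hypothetical.

```python
def chi_square_two_way(table):
    """Return (chi2, df, expected) for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Assumed rule: expected = row total * column total / table total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    # Expected Counts condition: every expected count is at least 5.
    if min(min(row) for row in expected) < 5:
        raise ValueError("Expected Counts condition not met")
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected


# Hypothetical 2 x 3 table: rows are groups, columns are response categories.
counts = [[20, 30, 50],
          [30, 30, 40]]
chi2, df, expected = chi_square_two_way(counts)
print(f"chi-square = {chi2:.4f}, df = {df}")
```

Which test applies depends on the design: separate samples or treatments point to homogeneity, while one sample classified on two variables points to independence.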