Comprehensive AP Statistics Study Guide: Significance Tests, Confidence Intervals, and Calculator Operations

Significance Tests for Proportions and Means

In AP Statistics, significance tests are structured procedures for evaluating claims about population parameters based on sample data. For proportions, there are two primary tests. The One-sample z-test for a proportion uses the null hypothesis $H_0: p = p_0$ . The alternative hypothesis $H_a$ can state that the true proportion is greater than, less than, or different from the null value. Conditions for inference include a random sample, the sample size being less than or equal to $10\%$ of the population ( $n \le 10\% N$ ), and the Large Counts condition ( $n p_0 \ge 10$ and $n(1 - p_0) \ge 10$ ). The test statistic is calculated as $z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}$ . This test is performed on the calculator using the 1-PropZTest function.
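The calculation behind 1-PropZTest can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the calculator's own implementation; the function name `one_prop_z_test` and its `alternative` argument are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(x, n, p0, alternative="greater"):
    """Return (z, p_value) for H0: p = p0 (hypothetical helper)."""
    # Large Counts for a test uses the null value p0, not p-hat.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("Large Counts condition not met")
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf  # standard normal CDF
    if alternative == "greater":
        p_value = 1 - cdf(z)
    elif alternative == "less":
        p_value = cdf(z)
    else:  # two-sided alternative
        p_value = 2 * (1 - cdf(abs(z)))
    return z, p_value

# 54 successes in a sample of 100, testing H0: p = 0.5 against Ha: p > 0.5
z, p_value = one_prop_z_test(x=54, n=100, p0=0.5, alternative="greater")
print(round(z, 2), round(p_value, 4))  # 0.8 0.2119
```

The Random and 10% conditions depend on how the data were collected, so they cannot be checked in code and still have to be verified from the problem statement.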

The Two-sample z-test for a difference in proportions evaluates the null hypothesis $H_0: p_1 - p_2 = 0$ . It requires independent random samples or a randomized experiment. Conditions specify that samples must be less than $10\%$ of their respective populations ( $n_1 \le 10\% N_1$ and $n_2 \le 10\% N_2$ ). The Large Counts condition for this test uses a pooled (combined) proportion, $\hat{p}_c = \frac{X_1 + X_2}{n_1 + n_2}$ . The requirements are $n_1 \hat{p}_c \ge 10$ , $n_1(1 - \hat{p}_c) \ge 10$ , $n_2 \hat{p}_c \ge 10$ , and $n_2(1 - \hat{p}_c) \ge 10$ . The test statistic is $z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\frac{\hat{p}_c(1 - \hat{p}_c)}{n_1} + \frac{\hat{p}_c(1 - \hat{p}_c)}{n_2}}}$ and the calculator function is 2-PropZTest.
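As a rough sketch of what 2-PropZTest computes, the pooled proportion and test statistic above translate directly into Python. The function name and the sample counts below are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2):
    """Return (z, two_sided_p_value) for H0: p1 - p2 = 0 (hypothetical helper)."""
    p1, p2 = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)  # pooled (combined) proportion
    # Large Counts for the test is checked with the pooled proportion.
    counts = [n1 * p_c, n1 * (1 - p_c), n2 * p_c, n2 * (1 - p_c)]
    if min(counts) < 10:
        raise ValueError("Large Counts condition not met")
    se = sqrt(p_c * (1 - p_c) / n1 + p_c * (1 - p_c) / n2)
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 45 of 100 successes in group 1, 30 of 100 in group 2
z, p_value = two_prop_z_test(45, 100, 30, 100)
print(round(z, 2), round(p_value, 4))
```

For a one-sided alternative, halve the two-sided P-value when the observed difference is in the direction of $H_a$ .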

For means, the One-sample t-test for a mean (also used as the paired t-test, applied to the differences) tests $H_0: \mu = \mu_0$ . It requires a random sample or randomized experiment and, when sampling without replacement, $n \le 10\% N$ . The Normality condition is satisfied if the population is normal, the sample size is large enough for the Central Limit Theorem to apply ( $n \ge 30$ ), or, for smaller samples, the sample data show no strong skew or outliers. The test statistic is $t = \frac{\bar{x} - \mu_0}{\frac{s}{\sqrt{n}}}$ with degrees of freedom $df = n - 1$ . The calculator function is T-Test.

The Two-sample t-test for a difference in means tests $H_0: \mu_1 - \mu_2 = 0$ . Conditions involve independent random samples (or a randomized experiment), the $10\%$ rule for both populations, and the Normality condition for both groups ( $n_1, n_2 \ge 30$ or normal distributions). The statistic is $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$ , where the hypothesized difference $\mu_1 - \mu_2$ equals $0$ under $H_0$ . Degrees of freedom are taken as the smaller of $n_1 - 1$ and $n_2 - 1$ or determined via technology. The calculator function is 2-SampTTest.
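The two-sample statistic and both choices of degrees of freedom can be sketched as follows. The technology value is assumed here to be the Welch-Satterthwaite approximation, which is what unpooled two-sample t procedures normally use. The function name is hypothetical, and the inputs are the Machine A and Machine B summary statistics from the practice problems in this guide.

```python
from math import sqrt

def two_sample_t(xbar1, s1, n1, xbar2, s2, n2):
    """Return (t, conservative_df, welch_df) for H0: mu1 - mu2 = 0."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2  # squared standard errors
    t = (xbar1 - xbar2) / sqrt(v1 + v2)
    df_conservative = min(n1, n2) - 1
    # Welch-Satterthwaite approximation (assumed technology df)
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df_conservative, df_welch

t, df_cons, df_welch = two_sample_t(1.35, 0.10, 10, 1.42, 0.08, 10)
print(round(t, 2), df_cons, round(df_welch, 1))  # -1.73 9 17.2
```

The conservative choice never exceeds the Welch value, so it gives a somewhat larger P-value and a wider interval.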

Inference for Slope

The t-test for a slope is used to determine if there is a linear relationship between two quantitative variables. The null hypothesis is typically $H_0: \beta = \beta_0$ (often $\beta = 0$ ). Five conditions must be met: 1) the relationship between $x$ and $y$ is fairly linear; 2) the sample size is less than $10\%$ of the population ( $n \le 10\% N$ ); 3) for each $x$ , the distribution of $y$ is normal; 4) for each $x$ , $y$ has approximately the same standard deviation; and 5) the data come from a random sample or randomized experiment. The test statistic is $t = \frac{b - \beta_0}{SE_b}$ where $b$ is the sample slope and $SE_b$ is the standard error of the slope. The degrees of freedom are $df = n - 2$ . The calculator command is LinRegTTest.
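To show where $b$ and $SE_b$ come from, here is a minimal sketch that computes them from raw data with the usual least-squares formulas, $SE_b = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$ with $s = \sqrt{\frac{SSE}{n - 2}}$ . The function name and the five data points are hypothetical.

```python
from math import sqrt

def slope_t_statistic(xs, ys, beta0=0.0):
    """Return (b, se_b, t, df) for the least-squares slope (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx                 # sample slope
    a = y_bar - b * x_bar         # sample intercept
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(sse / (n - 2))       # standard deviation of the residuals
    se_b = s / sqrt(sxx)          # standard error of the slope
    t = (b - beta0) / se_b
    return b, se_b, t, n - 2

# Hypothetical data set with five (x, y) pairs
b, se_b, t, df = slope_t_statistic([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(b, 2), round(se_b, 4), round(t, 2), df)
```

On the exam, $b$ and $SE_b$ are usually read straight from computer output, so this calculation mainly helps in interpreting that output.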

Confidence Intervals for Proportions, Means, and Slope

Confidence intervals estimate the true value of a population parameter. For a single proportion, the One-sample z-interval uses the formula $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$ , where the Large Counts condition relies on observed counts ( $n\hat{p} \ge 10$ and $n(1 - \hat{p}) \ge 10$ ). For the difference in proportions, the formula is $(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$ .

For means, a One-sample t-interval is calculated as $\bar{x} \pm t^* \left(\frac{s}{\sqrt{n}}\right)$ with $df = n - 1$ . A Two-sample t-interval for the difference in means is $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ . The t-interval for a slope is given by $b \pm t^* SE_b$ with $df = n - 2$ . Each interval requires specific conditions such as randomness, independence ( $10\%$ rule), and normality/large sample size as previously detailed for significance tests. Calculator functions include 1-PropZInt, 2-PropZInt, TInterval, 2-SampTInt, and LinRegTInt.
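The z-intervals for proportions can be reproduced with the standard library, as in the sketch below. The helper names are hypothetical. The t-intervals are left out here because Python's standard library has no built-in t quantile to supply $t^*$ .

```python
from math import sqrt
from statistics import NormalDist

def z_star(c_level):
    """Critical value z* that leaves (1 - c_level) / 2 in each tail."""
    return NormalDist().inv_cdf((1 + c_level) / 2)

def one_prop_z_interval(x, n, c_level=0.95):
    p_hat = x / n
    margin = z_star(c_level) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def two_prop_z_interval(x1, n1, x2, n2, c_level=0.95):
    p1, p2 = x1 / n1, x2 / n2
    # Intervals use the separate sample proportions, not a pooled value.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z_star(c_level) * se
    return (p1 - p2) - margin, (p1 - p2) + margin

# Hypothetical example: 54 successes in 100 trials, 95% confidence
low, high = one_prop_z_interval(54, 100, 0.95)
print(round(low, 4), round(high, 4))
```

The two-sample interval is for $p_1 - p_2$ in the order the samples are entered, which determines the sign of the endpoints.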

Chi-Square Tests

Chi-Square tests are applied to categorical data. All three Chi-Square tests share the same basic formula for the test statistic: $\chi^2 = \sum \frac{(O - E)^2}{E}$ , where $O$ is the observed count and $E$ is the expected count. In all cases, all expected counts must be at least $5$ .

The Chi-Square test for Goodness-of-Fit (GOF) evaluates if a categorical variable's claimed distribution is correct. $H_0$ states the claimed distribution is correct, while $H_a$ states it is incorrect. The degrees of freedom are $df = \text{number of categories} - 1$ . The calculator command is $\chi^2$GOF-Test.

The Chi-Square test for Homogeneity determines if the distribution of a categorical variable is the same across multiple populations or treatments. $H_0$ states there is no difference in the distribution across populations. The degrees of freedom are calculated as $df = (r - 1)(c - 1)$ where $r$ is the number of rows and $c$ is the number of columns. The calculator command is $\chi^2$-Test.

The Chi-Square test for Independence checks for an association between two categorical variables within a single population. $H_0$ states there is no association or the variables are independent. The degrees of freedom are also $df = (r - 1)(c - 1)$ , and the calculator command is $\chi^2$-Test. Observed counts for these tests (Homogeneity and Independence) must be entered into a matrix, typically Matrix A, on the calculator.
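For a two-way table, each expected count is (row total)(column total)/(table total). The sketch below computes the expected counts, $\chi^2$ , and $df$ the same way for Homogeneity and Independence. The function name and the 2-by-2 table are hypothetical.

```python
def chi_square_statistic(observed):
    """Return (chi_sq, df, expected) for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi_sq = 0.0
    expected = []
    for i, row in enumerate(observed):
        exp_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total  # expected count
            exp_row.append(e)
            chi_sq += (o - e) ** 2 / e
        expected.append(exp_row)
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_sq, df, expected

# Hypothetical 2 x 2 table of observed counts
chi_sq, df, expected = chi_square_statistic([[30, 20], [20, 30]])
print(chi_sq, df)         # 4.0 1
print(min(min(row) for row in expected))  # smallest expected count: 25.0
```

Checking the smallest expected count is a quick way to confirm the expected-counts condition before trusting the statistic.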

Identifying the Correct Significance Test

To choose the appropriate significance test, first look for indicators like the phrase "Do the data provide convincing statistical evidence." If the problem provides computer regression output (for example, from Minitab) or asks about a linear relationship between two quantitative variables, a t-test for slope (LinRegTTest) is required. If the data are presented in a table of frequencies, a Chi-Square test is indicated: use Goodness-of-Fit for one variable across one population, Homogeneity for one variable across multiple populations, and Independence for two variables in one population.

If the data are not in a frequency table, determine whether they involve proportions (percentages or counts of successes) or means (averages). For proportions, check the number of samples: one sample requires a 1-prop z-test, while two samples require a 2-prop z-test. For means, check the number of samples: one sample or a paired design requires a 1-mean t-test (paired t-test on the differences), while two separate groups require a 2-mean t-test.
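The decision steps above can be summarized as a small lookup function. This is only a study aid: the function name, argument names, and category labels are hypothetical, and real problems still require reading the context carefully.

```python
def choose_test(data_type, groups=1, variables=1, paired=False):
    """Map a problem description to a test, following the steps above.

    data_type: "regression", "counts", "proportion", or "mean" (hypothetical labels)
    groups:    number of populations, samples, or treatment groups
    variables: number of categorical variables (used for count tables only)
    """
    if data_type == "regression":
        return "t-test for slope (LinRegTTest)"
    if data_type == "counts":
        if variables == 2:
            return "Chi-Square test for Independence"
        if groups == 1:
            return "Chi-Square test for Goodness-of-Fit"
        return "Chi-Square test for Homogeneity"
    if data_type == "proportion":
        return "1-PropZTest" if groups == 1 else "2-PropZTest"
    if data_type == "mean":
        # Paired data reduce to one sample of differences.
        return "T-Test" if groups == 1 or paired else "2-SampTTest"
    raise ValueError("unknown data_type")

print(choose_test("mean", groups=2, paired=True))   # T-Test
print(choose_test("counts", groups=3))              # Chi-Square test for Homogeneity
```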

Calculator Functions and Usage

For One Variable Data, use 1-Var Stats to find the mean, standard deviation, and five-number summary. For Two Variable Data, LinReg(a+bx) provides the least squares regression line, correlation ( $r$ ), and coefficient of determination ( $r^2$ ). Use DiagnosticOn to ensure $r$ and $r^2$ are displayed.

Probability calculations include:

binompdf(n, p, x): Probability of exactly $x$ successes in $n$ trials with success probability $p$ ( $P(X = x)$ ).
binomcdf(n, p, x): Probability of at most $x$ successes ( $P(X \le x)$ ).
normalcdf(lower, upper, mean, SD): Find the area/probability for an interval in a normal distribution.
invNorm(area left, mean, SD): Find a boundary value for a given area in a normal distribution.
tcdf(lower, upper, df): Find the area for an interval in a t-distribution.
invT(area left, df): Find a boundary value for a t-distribution.
$\chi^2$cdf(lower, upper, df): Find the area for an interval in a chi-square distribution.
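The binomial and normal commands listed above have direct counterparts in Python's standard library. The sketch below mirrors the calculator's argument order; the Python function names are chosen for this example and are not calculator syntax.

```python
from math import comb
from statistics import NormalDist

def binompdf(n, p, x):
    """P(X = x) for a binomial with n trials and success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomcdf(n, p, x):
    """P(X <= x): sum of the exact probabilities from 0 through x."""
    return sum(binompdf(n, p, k) for k in range(x + 1))

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area between lower and upper under a normal curve."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

def inv_norm(area_left, mean=0.0, sd=1.0):
    """Boundary value with the given area to its left."""
    return NormalDist(mean, sd).inv_cdf(area_left)

print(round(binompdf(100, 0.03, 3), 4))    # 0.2275
print(round(inv_norm(0.95, 180, 7), 2))    # 191.51
```

A probability of the form $P(X \ge k)$ is computed as 1 - binomcdf(n, p, k - 1), since binomcdf includes its upper bound.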

Confidence interval and significance test commands require specific inputs such as successes ( $x$ ), sample size ( $n$ ), means ( $\bar{x}$ ), standard deviations ( $s_x$ ), and confidence levels (C-Level). Note that for two-sample t-procedures, the "Pooled" option should generally be set to "No."

Statistical Practice Problems

Proportion Significance Test: A random sample of $100$ students found $54$ support a schedule change. The principal tests if more than $50\%$ support it ( $H_0: p = 0.5$ , $H_a: p > 0.5$ ). Function: 1-PropZTest. Inputs: $p_0 = 0.5$ , $x = 54$ , $n = 100$ , prop $> p_0$ . P-value: $0.2119$ .
Difference in Means Confidence Interval: Samples of burgers in U.S. ( $\bar{x}_1 = 4.53$ , $s_1 = 0.24$ , $n_1 = 15$ ) and Japan ( $\bar{x}_2 = 4.01$ , $s_2 = 0.38$ , $n_2 = 10$ ). Function: 2-SampTInt. Inputs: $\bar{x}_1 = 4.53$ , $s_1 = 0.24$ , $n_1 = 15$ , $\bar{x}_2 = 4.01$ , $s_2 = 0.38$ , $n_2 = 10$ , C-Level: $0.99$ , Pooled: No. Result: approximately $(0.117, 0.923)$ , using $df \approx 13.8$ .
Normal Boundary Calculation: Heights of Croatian males follow $X \sim N(180, 7^2)$ . The tallest $5\%$ corresponds to area left = $0.95$ . Function: invNorm. Inputs: area $= 0.95$ , $\mu = 180$ , $\sigma = 7$ . Answer: $191.51$ cm.
Binomial Probability: The probability of a natural blackjack is $4.5\%$ . Find $P(X \ge 3)$ in $20$ rounds. Since $P(X \ge 3) = 1 - P(X \le 2)$ , the function is 1 - binomcdf(n, p, 2). Inputs: 1 - binomcdf(20, 0.045, 2). Answer: $0.0586$ .
Mean Confidence Interval: Sample of $40$ professors, $\bar{x} = 5.4$ , $s = 1.6$ . Function: TInterval. Inputs: $\bar{x} = 5.4$ , $s = 1.6$ , $n = 40$ , C-Level: $0.95$ . Result: $(4.8883, 5.9117)$ .
Binomial Exact Successes: $3\%$ Siberian Huskies have heterochromia. $P(X = 3)$ in sample of $100$ . Function: binompdf. Inputs: $n = 100$ , $p = 0.03$ , $X = 3$ . Answer: $0.2275$ .
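Difference in Proportions Confidence Interval: 2012 sample: $10/10{,}000$ . 2016 sample: $100/20{,}000$ . Function: 2-PropZInt. Inputs: $x_1 = 10$ , $n_1 = 10{,}000$ , $x_2 = 100$ , $n_2 = 20{,}000$ , C-Level: $0.95$ . Result: $(-0.0052, -0.0028)$ . (Note: with 2012 entered as sample 1, this interval estimates $p_{2012} - p_{2016}$ ; the change $p_{2016} - p_{2012}$ is the same interval with the signs reversed, $(0.0028, 0.0052)$ .)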
P-value from t-statistic: Test for mean salary $> \$45,327$ with $n = 10$ and $t = 2.51$ . Function: tcdf(lower, upper, df). Inputs: tcdf(2.51, 999, 9). Answer: $0.0167$ .
Chi-Square Test Statistic: Vehicles (Car, SUV, Truck) vs. (Owned, Leased). Function: $\chi^2$-Test. Matrix A (2 rows for ownership, 3 columns for vehicle type): $[[29, 20, 11], [21, 10, 4]]$ . Test statistic (calculated $\chi^2$ ): $1.40$ , with $df = 2$ .
Normal Distribution Probability: GRE Verbal $X \sim N(150, 8.5^2)$ . $P(145 \le X \le 160)$ . Function: normalcdf. Inputs: lower $= 145$ , upper $= 160$ , $\mu = 150$ , $\sigma = 8.5$ . Answer: $0.6021$ .
Significance Test for Difference in Means: Machine A ( $\bar{x}_1 = 1.35$ , $s_1 = 0.10$ , $n_1 = 10$ ), Machine B ( $\bar{x}_2 = 1.42$ , $s_2 = 0.08$ , $n_2 = 10$ ). Testing for a difference ( $H_a: \mu_1 \ne \mu_2$ ). Function: 2-SampTTest. Inputs: Stats, $\bar{x}_1 = 1.35$ , $s_1 = 0.10$ , $n_1 = 10$ , $\bar{x}_2 = 1.42$ , $s_2 = 0.08$ , $n_2 = 10$ , $\mu_1 \ne \mu_2$ , Pooled: No. P-value: approximately $0.102$ ( $t \approx -1.73$ , $df \approx 17.17$ ).
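The t-based answers above can be checked without a calculator. Python's standard library has no t-distribution, so the sketch below integrates the t density numerically with Simpson's rule and inverts it by bisection. This approximates what tcdf and invT report and is not how the calculator computes them; all function names are hypothetical.

```python
from math import exp, lgamma, log, pi

def t_pdf(t, df):
    """Density of the t-distribution (df may be non-integer)."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + t * t / df))

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def inv_t(area_left, df):
    """Boundary t with the given area to its left (area_left >= 0.5), by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_upper_tail(mid, df) < area_left:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# One-sided P-value for t = 2.51 with df = 9, as in tcdf(2.51, 999, 9)
print(round(t_upper_tail(2.51, 9), 4))
# Critical value t* for a 95% interval with df = 39
print(round(inv_t(0.975, 39), 4))
```

For a two-sided test, double the upper-tail area at $|t|$ . For an interval, multiply $t^*$ by the standard error to get the margin of error.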