Chapter 8: Confidence Intervals

Introduction

  • Inferential statistics: We use sample data to make generalizations about an unknown population

  • Sample data: help us to make an estimate of a population parameter.

  • Point estimate: a single number computed from a sample and used to estimate a population parameter

    • x̄ is a point estimate for μ

    • p′ is a point estimate for p

    • s is a point estimate for σ

  • Confidence interval: an interval estimate for an unknown population parameter. This depends on:

  • Confidence interval form: (point estimate – margin of error, point estimate + margin of error)

  • Empirical rule: Around 68% of values are within 1 standard deviation of the mean. Around 95% of values are within 2 standard deviations of the mean.

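  • Margin of error: the largest amount by which the point estimate is expected to differ from the true population value at the chosen confidence level (expressed in percentage points when the estimate is a proportion)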
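The empirical rule can be checked numerically. The sketch below uses the standard normal distribution from Python's standard library; the helper name `area_within` is an illustrative choice, not something from this guide.

```python
from statistics import NormalDist

def area_within(k):
    """Area under the standard normal curve within k standard deviations of the mean."""
    std_normal = NormalDist(mu=0, sigma=1)
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The two areas come out close to 0.68 and 0.95, which is why a margin of error of about two standard errors is commonly paired with 95% confidence.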

8.1 A Single Population Mean using the Normal Distribution

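  • Confidence level (CL): the percentage of intervals, built by the same method from repeated samples, that are expected to contain the true population parameter. It is often described as the probability that the calculated confidence interval estimate will contain the true population parameter.

  • Alpha level (α): the probability that the interval does not contain the unknown population parameter, so α = 1 − CL.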

  • Standard error of the mean: σ/√n

  • X̄ is normally distributed, that is, X̄ ~ N(μ_X, σ/√n)

  • Calculating the Confidence Interval

    • Calculate the sample mean x̄ from the sample data. Remember, in this section, we already know the population standard deviation σ.

    • Find the z-score that corresponds to the confidence level.

    • Calculate the error-bound EBM.

    • Construct the confidence interval.

    • Write a sentence that interprets the estimate in the context of the situation in the problem. (Explain what the confidence interval means, in the words of the problem.)

  • Finding the z-score for the Stated Confidence Level

    • Each of the tails contains an area equal to 𝛼/2.

    • The z-score that has an area of α/2 to its right is denoted by z_{α/2}.

  • Calculating the Error Bound: EBM = (z_{α/2})(σ/√n)

  • Confidence level interpretation: "We estimate with ___% confidence that the true population mean (include the context of the problem) is between ___ and ___ (include appropriate units)."
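The steps for a confidence interval with known σ can be sketched in Python as follows. This is a minimal illustration, not a prescribed method: the function name `z_interval`, the sample values, and σ = 3 are all made up for the example.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_interval(data, sigma, confidence):
    """Confidence interval for a population mean when sigma is known."""
    n = len(data)
    x_bar = mean(data)                       # step 1: sample mean
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # step 2: z-score with area alpha/2 to its right
    ebm = z * sigma / sqrt(n)                # step 3: error bound for the mean
    return x_bar - ebm, x_bar + ebm, ebm     # step 4: the interval

# Hypothetical sample of 9 measurements with an assumed known sigma of 3.
sample = [68, 72, 65, 70, 75, 69, 71, 66, 73]
low, high, ebm = z_interval(sample, sigma=3, confidence=0.90)
print(f"EBM = {ebm:.3f}, interval = ({low:.2f}, {high:.2f})")
```

Step 5, the interpretation sentence, would then read: "We estimate with 90% confidence that the true population mean is between the lower and upper values printed above," stated in the units of the problem.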

Effect of Changing the Confidence Level

  • Increasing the confidence level increases the error bound, making the confidence interval wider.

  • Decreasing the confidence level decreases the error bound, making the confidence interval narrower.

Effect of Changing the Sample Size

  • Increasing the sample size causes the error bound to decrease, making the confidence interval narrower.

  • Decreasing the sample size causes the error bound to increase, making the confidence interval wider.
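Both effects, of the confidence level and of the sample size, can be seen by evaluating the error bound for a few settings. The sketch below assumes a known σ of 3, a value chosen only for illustration, and a hypothetical helper named `error_bound`.

```python
from math import sqrt
from statistics import NormalDist

def error_bound(sigma, n, confidence):
    """EBM = z * sigma / sqrt(n) for a known population sigma."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma / sqrt(n)

SIGMA = 3  # hypothetical known population standard deviation

# A higher confidence level gives a larger error bound (wider interval).
for confidence in (0.90, 0.95, 0.99):
    print(f"CL={confidence:.0%}, n=36: EBM={error_bound(SIGMA, 36, confidence):.3f}")

# A larger sample gives a smaller error bound (narrower interval).
for n in (36, 100, 400):
    print(f"CL=95%, n={n}: EBM={error_bound(SIGMA, n, 0.95):.3f}")
```

Because n sits under a square root, quadrupling the sample size only halves the error bound.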

Finding the Error Bound

  • From the upper value for the interval, subtract the sample mean,

  • OR, from the upper value for the interval, subtract the lower value. Then divide the difference by two.

Finding the Sample Mean

  • Subtract the error bound from the upper value of the confidence interval,

  • OR, average the upper and lower endpoints of the confidence interval.
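Working backward from a reported interval takes only two lines of arithmetic. The sketch below uses hypothetical endpoints (67.18, 68.82) and an illustrative function name.

```python
def from_interval(lower, upper):
    """Recover the sample mean and error bound from a confidence interval."""
    ebm = (upper - lower) / 2    # half the width of the interval
    x_bar = (upper + lower) / 2  # midpoint of the interval
    return x_bar, ebm

# Hypothetical reported interval.
x_bar, ebm = from_interval(67.18, 68.82)
print(f"sample mean = {x_bar:.2f}, error bound = {ebm:.2f}")
```

The other routes listed above agree with this one: the upper value minus the sample mean gives the same error bound, and the upper value minus the error bound gives the same sample mean.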

8.2 A Single Population Mean using the Student t Distribution

  • Student's t-distribution: a type of probability distribution that is similar to the normal distribution with its bell shape but has heavier tails

  • Standard deviation: a number that is equal to the square root of the variance and measures how far data values are from their mean; notation: s for sample standard deviation and σ for population standard deviation

  • Normal distribution: continuous random variable (RV) with pdf f(x) = (1 / (σ√(2π))) · e^(−(x − μ)² / (2σ²)), where μ is the mean of the distribution and σ is the standard deviation; notation: X ~ N(μ, σ).

  • Degrees of freedom: the number of values in a sample that are free to vary

  • df = n - 1: the degrees of freedom for a Student’s t-distribution where n represents the size of the sample

  • The invT command requires two inputs: invT(area to the left, degrees of freedom). The output is the t-score that corresponds to the specified area.

Properties of the Student's t-Distribution

  • The graph for the Student's t-distribution is similar to the standard normal curve.

  • The mean for the Student's t-distribution is zero and the distribution is symmetric about zero.

  • The Student's t-distribution has more probability in its tails than the standard normal distribution because the spread of the t-distribution is greater than the spread of the standard normal.

  • The exact shape of the Student's t-distribution depends on the degrees of freedom. As the degrees of freedom increase, the graph becomes more like the graph of the standard normal distribution.

  • The underlying population of individual observations is assumed to be normally distributed with an unknown population mean μ and unknown population standard deviation σ.

The notation for the Student's t-distribution (using T as the random variable) is:

  • T ~ t_df where df = n – 1.

  • For example, if we have a sample of size n = 20 items, then we calculate the degrees of freedom as df = n - 1 = 20 - 1 = 19 and we write the distribution as T ~ t_19.

If the population standard deviation is not known, the error bound for a population mean is:

  • EBM = (t_{α/2})(s/√n)

  • t_{α/2} is the t-score with area to the right equal to α/2,

  • use df = n – 1 degrees of freedom, and

  • s = sample standard deviation.
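A t-based interval follows the same pattern as the z-based one, with s in place of σ and a t-score in place of the z-score. Python's standard library has no inverse t function, so the sketch below takes the critical value as an input. The value 2.093 is the usual table entry for df = 19 with area 0.025 in the right tail, which is what invT(0.975, 19) returns; the summary statistics are hypothetical.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for a mean when the population sigma is unknown.

    t_crit is the t-score with area alpha/2 to its right for df = n - 1,
    looked up in a t-table or computed with invT(1 - alpha/2, n - 1).
    """
    ebm = t_crit * s / sqrt(n)  # EBM = t * s / sqrt(n)
    return x_bar - ebm, x_bar + ebm, ebm

# Hypothetical summary statistics: n = 20, so df = 19.
# For 95% confidence, alpha/2 = 0.025 and the table value is about 2.093.
low, high, ebm = t_interval(x_bar=8.2, s=1.5, n=20, t_crit=2.093)
print(f"EBM = {ebm:.3f}, interval = ({low:.2f}, {high:.2f})")
```

With the same data, the z-score for 95% confidence (about 1.96) would give a narrower interval; the larger t-score reflects the extra uncertainty from estimating σ with s.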

To calculate the confidence interval directly:

  • Press STAT.

  • Arrow over to TESTS.

  • Arrow down to 8:TInterval and press ENTER (or just press 8).

8.3 A Population Proportion

  • To construct a confidence interval for a single unknown population proportion, p, we need a point estimate for p and the margin of error, E.

    • The sample proportion, p̂ (written p′ elsewhere in this chapter), is the best point estimate of the population proportion, p.

    • The margin of error, E, is the critical value times the standard deviation of the sample proportion: E = z∗ · √(p(1 − p)/n)

      • z∗ represents the critical value from the standard normal distribution for the confidence level desired.

  • Z-score formula: If P′ ~ N(p, √(pq/n)), then the z-score formula is z = (p′ − p) / √(pq/n), where q = 1 − p

P′ follows a normal distribution for proportions:

  • X/n = P′ ~ N(np/n, √(npq)/n), which simplifies to P′ ~ N(p, √(pq/n))

  • The confidence interval has the form (p′ − EBP, p′ + EBP). EBP is the error bound for the proportion.

  • p′ = 𝑥/𝑛

  • p′ = the estimated proportion of successes (p′ is a point estimate for p, the true proportion.)

  • x = the number of successes

  • n = the size of the sample
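Putting these definitions together, a confidence interval for a proportion can be sketched as below. Since p is unknown, the sketch substitutes p′ and q′ = 1 − p′ in the standard deviation; the counts (421 successes out of 500) and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, confidence):
    """Confidence interval for a population proportion from x successes in n trials."""
    p_prime = x / n                          # point estimate of p
    q_prime = 1 - p_prime
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value
    ebp = z * sqrt(p_prime * q_prime / n)    # error bound for the proportion
    return p_prime - ebp, p_prime + ebp, ebp

# Hypothetical survey: 421 successes in a sample of 500.
low, high, ebp = proportion_interval(x=421, n=500, confidence=0.95)
print(f"p' = {421 / 500:.3f}, EBP = {ebp:.3f}, interval = ({low:.3f}, {high:.3f})")
```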

Calculating the Sample Size n

  • If researchers desire a specific margin of error, then they can use the error-bound formula to calculate the required sample size.

  • The error-bound formula for a population proportion is

    • EBP = (z_{α/2})(√(p′q′/n))

    • Solving for n gives you an equation for the sample size.

    • n = (z_{α/2})² (p′q′) / EBP²
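The sample-size formula can be evaluated directly. In the sketch below the result is rounded up to the next whole number, since a fractional observation is not possible and rounding down would miss the target margin. The inputs are assumptions for illustration: p′ = 0.5 is the common conservative choice when no prior estimate exists, because it maximizes p′q′.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(p_prime, ebp, confidence):
    """Smallest n for which the error bound for a proportion is at most ebp."""
    q_prime = 1 - p_prime
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n = z ** 2 * p_prime * q_prime / ebp ** 2  # n = z^2 * p'q' / EBP^2
    return ceil(n)                             # always round up

# Hypothetical target: 95% confidence, margin of error of 3 percentage points.
n_needed = required_sample_size(p_prime=0.5, ebp=0.03, confidence=0.95)
print(f"required sample size: {n_needed}")
```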

Examples
