Chapter 8: Confidence Intervals

Inferential statistics: using sample data to make generalizations about an unknown population.
Sample data: data that help us estimate a population parameter.
Point estimate: a single number computed from a sample and used to estimate a population parameter
- x̄ is a point estimate for μ
- p′ is a point estimate for p
- s is a point estimate for σ
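Confidence interval: an interval estimate for an unknown population parameter. This depends on the desired confidence level, what is known about the distribution (for example, a known standard deviation), and the sample and its size.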
Confidence interval form: (point estimate – margin of error, point estimate + margin of error)
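Empirical rule: for a bell-shaped (approximately normal) distribution, around 68% of values are within 1 standard deviation of the mean and around 95% of values are within 2 standard deviations of the mean.
Margin of error: the largest amount by which the point estimate is expected to differ from the true population value at the stated confidence level (for a proportion, often reported in percentage points).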
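The empirical rule can be checked numerically against the standard normal curve. This is a minimal sketch using Python's standard-library statistics.NormalDist; the helper name area_within is made up for illustration.

```python
from statistics import NormalDist

def area_within(k):
    """Area under the standard normal curve within k standard deviations of the mean."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

print(round(area_within(1), 4))  # roughly 0.68
print(round(area_within(2), 4))  # roughly 0.95
```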

Confidence level (CL): the probability that the calculated confidence interval estimate will contain the true population parameter; CL = 1 − α.
Alpha level (α): the probability that the interval does not contain the unknown population parameter; α = 1 − CL.
Standard error of the mean: σ/√n
X̄ is normally distributed; that is, X̄ ~ N(μ_X, σ/√n)
Calculating the Confidence Interval
- Calculate the sample mean x̄ from the sample data. Remember, in this section, we already know the population standard deviation σ.
- Find the z-score that corresponds to the confidence level.
- Calculate the error bound for the mean, EBM.
- Construct the confidence interval.
- Write a sentence that interprets the estimate in the context of the situation in the problem. (Explain what the confidence interval means, in the words of the problem.)
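The steps above can be sketched in code. This is an illustrative example only, assuming a known population standard deviation; the function name z_confidence_interval and the sample values are made up, and the z-score comes from Python's standard-library statistics.NormalDist.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_confidence_interval(sample, sigma, confidence_level):
    """Confidence interval for a population mean when sigma is known."""
    x_bar = mean(sample)                       # step 1: sample mean
    alpha = 1 - confidence_level
    z = NormalDist().inv_cdf(1 - alpha / 2)    # step 2: z-score for the confidence level
    ebm = z * sigma / sqrt(len(sample))        # step 3: error bound for the mean
    return x_bar - ebm, x_bar + ebm            # step 4: the interval

# Hypothetical data: nine measurements, with sigma assumed known and equal to 3.
sample = [67, 69, 70, 68, 71, 69, 70, 68, 69]
low, high = z_confidence_interval(sample, sigma=3, confidence_level=0.95)
print(round(low, 2), round(high, 2))  # about 67.04 and 70.96
```

Step 5, the interpretation, would then read: "We estimate with 95% confidence that the true population mean is between 67.04 and 70.96 (in the units of the measurements)."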
Finding the z-score for the Stated Confidence Level
- Each of the tails contains an area equal to 𝛼/2.
- The z-score that has an area of α/2 to its right is denoted by z_{α/2}.
Calculating the error bound: EBM = z_{α/2} · (σ/√n)
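As a quick illustration, the z-score for a stated confidence level can be looked up with an inverse normal function. The sketch below assumes Python's standard-library statistics.NormalDist; the helper name z_alpha_over_2 is hypothetical.

```python
from statistics import NormalDist

def z_alpha_over_2(confidence_level):
    """z-score with area alpha/2 to its right under the standard normal curve."""
    alpha = 1 - confidence_level
    # Area alpha/2 to the right is the same as area 1 - alpha/2 to the left.
    return NormalDist().inv_cdf(1 - alpha / 2)

for cl in (0.90, 0.95, 0.99):
    print(cl, round(z_alpha_over_2(cl), 3))  # approximately 1.645, 1.96, 2.576
```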
Confidence interval interpretation: "We estimate with ___% confidence that the true population mean (include the context of the problem) is between ___ and ___ (include appropriate units)."

Increasing the confidence level increases the error bound, making the confidence interval wider.
Decreasing the confidence level decreases the error bound, making the confidence interval narrower.

Increasing the sample size causes the error bound to decrease, making the confidence interval narrower.
Decreasing the sample size causes the error bound to increase, making the confidence interval wider.
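These effects can be seen by computing the error bound for a few settings. A small sketch, assuming a known standard deviation of 3 (an arbitrary value chosen for illustration) and a hypothetical helper named error_bound:

```python
from math import sqrt
from statistics import NormalDist

def error_bound(sigma, n, confidence_level):
    """EBM = z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return z * sigma / sqrt(n)

sigma = 3  # assumed known population standard deviation

# Higher confidence level -> larger error bound -> wider interval.
for cl in (0.90, 0.95, 0.99):
    print("CL", cl, "EBM", round(error_bound(sigma, 36, cl), 3))

# Larger sample size -> smaller error bound -> narrower interval.
for n in (25, 100, 400):
    print("n", n, "EBM", round(error_bound(sigma, n, 0.95), 3))
```

Because n sits under a square root, quadrupling the sample size only halves the error bound.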

Finding the error bound from a known confidence interval: subtract the sample mean from the upper value of the interval,
OR subtract the lower value from the upper value, then divide the difference by two.
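Working backward from an interval is simple arithmetic. The sketch below uses made-up endpoints and hypothetical function names; it also relies on the fact that the interval is symmetric, so the sample mean is the midpoint.

```python
def error_bound_from_interval(lower, upper):
    """Half the width of the interval."""
    return (upper - lower) / 2

def sample_mean_from_interval(lower, upper):
    """Midpoint of the interval, since the interval is symmetric about the sample mean."""
    return (lower + upper) / 2

lower, upper = 67.04, 70.96  # hypothetical confidence interval
print(round(error_bound_from_interval(lower, upper), 2))   # 1.96
print(round(sample_mean_from_interval(lower, upper), 2))   # 69.0
```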

Student's t-distribution: a type of probability distribution that is similar to the normal distribution with its bell shape but has heavier tails
Standard deviation: a number that is equal to the square root of the variance and measures how far data values are from their mean; notation: s for sample standard deviation and σ for population standard deviation
Normal distribution: continuous random variable (RV) with pdf f(x) = (1 / (σ√(2π))) · e^(−(x − μ)² / (2σ²)), where μ is the mean of the distribution and σ is the standard deviation; notation: X ~ N(μ, σ).
Degrees of freedom: the number of objects in a sample that are free to vary
df = n - 1: the degrees of freedom for a Student’s t-distribution where n represents the size of the sample
The invT command requires two inputs: invT(area to the left, degrees of freedom). The output is the t-score that corresponds to the area we specified.
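Outside a calculator, the same lookup can be approximated in code. This is a rough sketch in standard-library Python, assuming numerical integration of the t density (Simpson's rule) followed by a bisection search; t_pdf, t_cdf, and inv_t are hypothetical helper names, not calculator or library functions. Statistical libraries provide ready-made equivalents (for example, SciPy's scipy.stats.t.ppf).

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(t, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    coeff = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """Area to the left of t: Simpson's rule on [0, |t|], then symmetry about zero."""
    if t == 0:
        return 0.5
    h = abs(t) / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t > 0 else 0.5 - area

def inv_t(area_left, df):
    """t-score with the given area to its left (assumes the answer lies in [-100, 100])."""
    lo, hi = -100.0, 100.0
    for _ in range(60):  # bisection: halve the search range each pass
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < area_left:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# 95% confidence with n = 20 leaves area 0.975 to the left, with df = 19.
print(round(inv_t(0.975, 19), 3))  # about 2.093
```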

The graph for the Student's t-distribution is similar to the standard normal curve.
The mean for the Student's t-distribution is zero and the distribution is symmetric about zero.
The Student's t-distribution has more probability in its tails than the standard normal distribution because the spread of the t-distribution is greater than the spread of the standard normal.
The exact shape of the Student's t-distribution depends on the degrees of freedom. As the degrees of freedom increase, the graph becomes more like the graph of the standard normal distribution.
The underlying population of individual observations is assumed to be normally distributed with an unknown population mean μ and unknown population standard deviation σ.
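The tail behavior described above can be illustrated by comparing densities far from the center. A minimal sketch, assuming the standard formula for the t density and a hypothetical helper named t_pdf; the comparison point t = 3 and the degrees of freedom are arbitrary choices.

```python
from math import exp, lgamma, pi, sqrt
from statistics import NormalDist

def t_pdf(t, df):
    """Density of the Student's t-distribution with df degrees of freedom."""
    coeff = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)

normal_height = NormalDist().pdf(3)  # standard normal density at z = 3

# The t density in the tail is higher than the normal density,
# and it moves toward the normal value as df grows.
for df in (2, 10, 100, 1000):
    print("df", df, "t density at 3:", round(t_pdf(3, df), 5))
print("standard normal density at 3:", round(normal_height, 5))
```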

T ~ t_df where df = n – 1.
For example, if we have a sample of size n = 20 items, then we calculate the degrees of freedom as df = n - 1 = 20 - 1 = 19 and we write the distribution as T ~ t_19.

To construct a confidence interval for a single unknown population proportion, p, we need a point estimate for p and the margin of error, E.
- The sample proportion, p̂, is the best point estimate of the population proportion, p.
- The margin of error, E, is the critical value times the standard deviation of the sample proportion: E = z∗ · √(p(1 − p)/n), with p̂ used in place of the unknown p when computing it.
- z∗ represents the critical value from the standard normal distribution for the confidence level desired.
Z-score formula: if P′ ~ N(p, √(pq/n)), then the z-score formula is z = (p′ − p) / √(pq/n), where q = 1 − p.
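A short numeric illustration of this z-score, with made-up values (p = 0.5, n = 100, observed p′ = 0.56) and a hypothetical function name:

```python
from math import sqrt

def proportion_z_score(p_prime, p, n):
    """z = (p' - p) / sqrt(p * q / n), with q = 1 - p."""
    q = 1 - p
    return (p_prime - p) / sqrt(p * q / n)

# Standard deviation of P' is sqrt(0.5 * 0.5 / 100) = 0.05,
# so an observed 0.56 sits about 1.2 standard deviations above p.
print(round(proportion_z_score(0.56, 0.5, 100), 2))  # 1.2
```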

X/n = P′ ~ N(np/n, √(npq)/n), which simplifies to P′ ~ N(p, √(pq/n)).
The confidence interval has the form (p′ – EBP, p′ + EBP). EBP is the error bound for the proportion.
p′ = 𝑥/𝑛
p′ = the estimated proportion of successes (p′ is a point estimate for p, the true proportion.)
x = the number of successes
n = the size of the sample
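Putting these definitions together, a confidence interval for a proportion might be computed as follows. This is a sketch under the normal approximation, using the error bound EBP = z_{α/2} · √(p′q′/n) with q′ = 1 − p′; the function name and the counts (421 successes out of 500) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(x, n, confidence_level):
    """Interval (p' - EBP, p' + EBP) for a population proportion."""
    p_prime = x / n                  # estimated proportion of successes
    q_prime = 1 - p_prime
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    ebp = z * sqrt(p_prime * q_prime / n)   # error bound for the proportion
    return p_prime - ebp, p_prime + ebp

low, high = proportion_confidence_interval(x=421, n=500, confidence_level=0.95)
print(round(low, 3), round(high, 3))  # about 0.81 and 0.874
```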

If researchers desire a specific margin of error, then they can use the error-bound formula to calculate the required sample size.
The error-bound formula for a population proportion is
- EBP = z_{α/2} · √(p′q′/n), where q′ = 1 − p′
- Solving for n gives you an equation for the sample size.
- n = (z_{α/2})² · p′q′ / EBP²
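The sample-size formula can be applied directly. In the sketch below, the function name is hypothetical, the result is rounded up so the error bound is not exceeded, and p′ defaults to 0.5, which is an assumption: it is the conservative choice when no prior estimate is available, because p′q′ is largest at 0.5.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(ebp, confidence_level, p_prime=0.5):
    """n = z^2 * p'q' / EBP^2, rounded up to the next whole number."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    q_prime = 1 - p_prime
    return ceil(z ** 2 * p_prime * q_prime / ebp ** 2)

# Within 3 percentage points at 90% and at 95% confidence.
print(required_sample_size(0.03, 0.90))  # 752
print(required_sample_size(0.03, 0.95))  # 1068
```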