Confidence Intervals

Topic 6: Confidence Intervals

Learning Outcomes

Verify conditions for using sampling distributions for inference.
Compute critical values (z or t) using Excel.
Compute confidence intervals for averages.
Compute confidence intervals for proportions.
Interpret confidence intervals clearly.
Understand the impact of sample size ( $n$ ) on certainty, confidence level, and precision (margin of error).
(In class) Invert margin of error calculation to find required sample size.

Review of Topic 5

Developed sampling distributions of the sample mean or sample proportion.
Assumed knowledge of the population parameter.
Calculated probability statements about sample statistics.
Example: Knowing the population average and standard deviation, and with sample size $n \geq 30$ , the sampling distribution of the sample mean is approximately normal, which allows probability statements about the sample mean to be calculated.

Topic 6: Estimating Unknown Population Parameters

More realistic scenario: Population average is unknown and needs to be estimated.
Inference about a parameter is made using a statistic and knowledge of the sampling distribution.
Example: Weight of cheeseburgers from various Burger Kings in Switzerland.
- Sample of 30 cheeseburgers.
- Average weight: 47 grams ( $\bar{x} = 47$ grams).
- Question: What can be inferred about the weight of burgers in general?

Point Estimate

Estimates an unknown population parameter using a single value.
In the cheeseburger example, the point estimate of the average weight is 47 grams ( $\hat{\mu} = 47$ grams).
The sample mean is a continuous random variable and can vary.
The sampling distribution of the sample mean is approximately normal because the sample size is at least 30.
Assume population standard deviation $\sigma = 5.5$ grams.
Standard error: $\frac{5.5}{\sqrt{30}} \approx 1$ gram.
With another sample, might get a sample mean of 49 grams or 46 grams.
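The standard error above can be checked with a few lines of Python. This is only a sketch using the values stated in the example ( $\sigma = 5.5$ , $n = 30$ ); the variable names are illustrative.

```python
import math

sigma = 5.5   # assumed population standard deviation, in grams
n = 30        # sample size

# Standard error of the sample mean: sigma / sqrt(n)
standard_error = sigma / math.sqrt(n)
print(round(standard_error, 3))  # 1.004
```

A standard error of about 1 gram is why sample means such as 46 or 49 grams are plausible results from other samples.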

Disadvantages of Point Estimate

Almost certainly wrong: The probability that a continuous random variable equals a specific value is zero.
Does not indicate proximity: Doesn't show how close the estimate is to the true parameter. Values like 47, 46, or 49 grams don't indicate closeness to the true mean.
Doesn't reflect sample size: Increasing the sample size should yield more accurate results, but the point estimate does not reflect this effect.

Interval Estimate

Estimates an unknown parameter using an interval.
Interval is defined by lower and upper limits, calculated by adding and subtracting a margin of error from the point estimate.
Example: If $\bar{x} = 47$ grams, the interval is $47 \pm \text{some value}$ .
The "some value" that is added/subtracted is the margin of error.
The margin of error is expressed as the number of standard errors from the point estimate.
If standard error = 1 gram, adding/subtracting 2 grams corresponds to adding/subtracting 2 standard errors.

Determining the Number of Standard Errors

Aim for a certain level of confidence, expressed as a percentage (e.g., 95%).
An interval with 100% certainty would be infinite (from negative infinity to infinity), which is not useful.
For 95% certainty:
- In a normal distribution, about 95% of values lie within 2 standard deviations of the mean (more precisely, within 1.96).
- Margin of error corresponds to two standard errors.
- Interval limits: point estimate $\pm$ two standard errors.
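The "about 2 standard deviations" rule can be verified numerically. The sketch below uses Python's standard library; the helper name `central_probability` is an illustrative assumption, not part of the course material.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def central_probability(k):
    """Probability that a standard normal value lies within k standard deviations of 0."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

print(round(central_probability(2.0), 4))   # 0.9545
print(round(central_probability(1.96), 4))  # 0.95
```

Two standard errors therefore cover slightly more than 95%, while 1.96 standard errors cover 95% almost exactly.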

Formula for Confidence Interval (Sigma Known)

Assume we can use the normal distribution to compute the interval estimate of $\mu$ , and sigma is known.
Interval limits are calculated as:
- Upper Limit: Sample average + z standard errors ( $\bar{x} + z \cdot \text{SE}$ , where $\text{SE} = \frac{\sigma}{\sqrt{n}}$ ).
- Lower Limit: Sample average - z standard errors ( $\bar{x} - z \cdot \text{SE}$ , where $\text{SE} = \frac{\sigma}{\sqrt{n}}$ ).
$z$ depends on the degree of confidence.
Excel function: CONFIDENCE.NORM can be used to find the margin of error.
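The limits above can be written as a small function. This is a minimal sketch assuming sigma is known and the normal distribution applies; the function name `z_confidence_interval` and its parameters are hypothetical, chosen for illustration.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper, margin) for a mean when sigma is known."""
    # Two-sided critical value: leave (1 - confidence) / 2 in each tail
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sigma / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin, margin

# Cheeseburger example: x-bar = 47 g, sigma = 5.5 g (assumed), n = 30
lower, upper, margin = z_confidence_interval(47, 5.5, 30)
print(f"{lower:.2f} to {upper:.2f} grams (margin {margin:.2f})")
```

For the cheeseburger example this gives an interval of roughly 45 to 49 grams, in line with the "two standard errors" reasoning.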

Interpretation

Each sample average yields a different interval estimate.
While the sample mean alone doesn't tell us where mu is, the interval around the sample mean captures mu with the stated level of confidence.
With 95% confidence, about 95 out of 100 intervals built this way from repeated samples are expected to contain mu between their lower and upper limits.
This method provides a good estimate 95% of the time, much better than a point estimator (0% probability of being exactly correct).
We will primarily use interval estimates going forward.
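The "95 out of 100 intervals" interpretation can be illustrated with a simulation. The sketch below assumes a hypothetical normal population with a true mean of 47 grams and $\sigma = 5.5$ grams (in practice the true mean is unknown), draws many samples of 30, and counts how often the interval contains the true mean.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the run is repeatable

mu, sigma, n = 47.0, 5.5, 30      # hypothetical "true" population values
z = NormalDist().inv_cdf(0.975)   # about 1.96 for 95% confidence
margin = z * sigma / math.sqrt(n)

trials = 10_000
hits = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = fmean(sample)
    # Does this sample's interval contain the true mean?
    if x_bar - margin <= mu <= x_bar + margin:
        hits += 1

coverage = hits / trials
print(f"Share of intervals containing mu: {coverage:.3f}")
```

The share should come out close to 0.95. Each individual interval either contains mu or it does not; the 95% describes the method over many samples.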

Excel Example

Using the "Confidence Interval" Excel file; example 1a.
Data collected on amounts spent by 64 customers for lunch at a restaurant in Geneva.
Variables: Amount spent and satisfaction.
Focusing on the "amount spent" variable.
Sample size: 64 ( $n = 64$ ).
Average amount spent: 73.88 Swiss francs.
Goal: Build an interval estimate from this point estimate.
Population standard deviation: $\sigma = 14$ Swiss francs (from past studies).
Confidence level: 95%.
Task: Find the margin of error.

Conditions for Validity

Population standard deviation sigma is known (use normal distribution).
Sample size is $n = 64 > 30$ (CLT applies; sampling distribution of the sample mean is approximately normal).

Sample Description

Sample size ( $n$ ): 64
Population standard deviation ( $\sigma$ ): 14 CHF
Sample mean ( $\bar{x}$ ): 73.9 CHF (point estimate of the amount spent by customers in general)

Interval Construction

Determine the margin of error to add to and subtract from 73.9 in order to build the interval.

Degree of Confidence

Set at 95%.
Most commonly used confidence level.

Significance Level

Complement of the confidence level.
$SL = 1 - \text{confidence level}$ .
If confidence level is 95%, the significance level is 5% (risk of being wrong).

Z Value

Use the standard normal distribution: the confidence interval is centered on the point estimate, so the critical values are symmetric.
Find z such that z and negative z are equidistant from the mean, and the probability between them is 95%.
Calculate the upper critical value (positive z).
NORM.S.INV(probability)
Probability = 95% (center) + 2.5% (left tail) = 97.5%.
$z = \text{NORM.S.INV}(0.975) \approx 1.96$
To build our interval, the margin of error will include 1.96 standard errors.
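The same critical value can be obtained outside Excel. As a sketch, Python's `statistics.NormalDist().inv_cdf` plays the role of NORM.S.INV here; the variable names are illustrative.

```python
from statistics import NormalDist

confidence = 0.95
tail = (1 - confidence) / 2                   # 2.5% in each tail
z = NormalDist().inv_cdf(confidence + tail)   # cumulative probability 0.975
print(round(z, 2))  # 1.96
```

Changing `confidence` to 0.90 or 0.99 gives the critical values for other confidence levels.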

Standard Error

Standard Error ( $SE$ ) is the population standard deviation divided by the square root of the sample size.
$SE = \frac{\sigma}{\sqrt{n}} = \frac{14}{\sqrt{64}} = 1.75$

Margin of Error

Margin of Error ( $EM$ ) is the standard error multiplied by the value of $z$ .
$EM = 1.75 \cdot 1.96 = 3.43$ CHF

Precision

Expresses the margin of error as a proportion of the point estimate.
Precision = $EM / \bar{x} = 3.4 / 73.9 \approx 4.6\%$
The margin of error is 3.4 CHF, or 4.6% of 73.9 CHF.
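The three steps (standard error, margin of error, precision) can be reproduced as follows. This is a sketch using the example's figures, with $z$ rounded to 1.96 and the sample mean taken as 73.88 CHF.

```python
import math

n = 64
sigma = 14.0     # CHF, population standard deviation from past studies
x_bar = 73.88    # CHF, sample mean
z = 1.96         # rounded critical value for 95% confidence

standard_error = sigma / math.sqrt(n)   # 14 / 8 = 1.75
margin = z * standard_error             # about 3.43 CHF
precision = margin / x_bar              # margin as a share of the point estimate
print(standard_error, round(margin, 2), f"{precision:.1%}")  # 1.75 3.43 4.6%
```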

Excel Function

Use CONFIDENCE.NORM to find the margin of error directly.
CONFIDENCE.NORM(alpha, standard_deviation, sample_size)
Alpha (significance level) = 5% (1 - confidence level).
Standard deviation = sigma.
Sample size = n.
CONFIDENCE.NORM(0.05, 14, 64) = 3.43.
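For readers working outside Excel, the calculation can be mirrored in Python. The function `confidence_norm` below is a hypothetical helper that follows the same argument order as the Excel call above; it is a sketch of the formula $z \cdot \sigma / \sqrt{n}$ , not Excel's actual implementation.

```python
import math
from statistics import NormalDist

def confidence_norm(alpha, standard_dev, size):
    """Margin of error for a mean with known sigma, given significance level alpha."""
    # Two-sided critical value: alpha / 2 in each tail
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * standard_dev / math.sqrt(size)

print(round(confidence_norm(0.05, 14, 64), 2))  # 3.43
```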

Confidence Interval Estimate

Express the findings.
The average spending is 73.9 CHF with a margin of error of 3.4 CHF at a 95% confidence level.
The average spending is 73.9 CHF with a precision of 4.6% at a 95% confidence level.

Calculating Limits

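Lower Limit = point estimate - EM = $73.88 - 3.43 \approx 70.4$ CHF
Upper Limit = point estimate + EM = $73.88 + 3.43 \approx 77.3$ CHF
Use the less-rounded values here: subtracting the rounded figures ( $73.9 - 3.4$ ) would give 70.5 rather than 70.4.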

Conclusion

With a 95% confidence level, we can say that the average spending lies between 70.4 CHF and 77.3 CHF.
Always include the point estimate, margin of error, and confidence level.
Or include the lower and upper limits with the confidence level.

Future Topics

What happens if $\sigma$ (sigma) is unknown?
Introduce the t-distribution.
Discuss proportions as well.