Confidence Intervals

Topic 6: Confidence Intervals

Learning Outcomes

  • Verify conditions for using sampling distributions for inference.

  • Compute critical values (z or t) using Excel.

  • Compute confidence intervals for averages.

  • Compute confidence intervals for proportions.

  • Interpret confidence intervals clearly.

  • Understand the impact of sample size (n) on certainty, confidence level, and precision (margin of error).

  • (In class) Invert margin of error calculation to find required sample size.

Review of Topic 5

  • Developed sampling distributions of the sample mean or sample proportion.

  • Assumed knowledge of the population parameter.

  • Calculated probability statements about sample statistics.

  • Example: With the population average and standard deviation known and a sample size n \geq 30, the sampling distribution of the sample mean is approximately normal (Central Limit Theorem), which allows probability statements about the sample mean.

Topic 6: Estimating Unknown Population Parameters

  • More realistic scenario: Population average is unknown and needs to be estimated.

  • Inference about a parameter is made using a statistic and knowledge of the sampling distribution.

  • Example: Weight of cheeseburgers from various Burger Kings in Switzerland.

    • Sample of 30 cheeseburgers.

    • Average weight: 47 grams (\bar{x} = 47 grams).

    • Question: What can be inferred about the weight of burgers in general?

Point Estimate

  • Estimates an unknown population parameter using a single value.

  • In the cheeseburger example, the point estimate of the average weight is 47 grams (\hat{\mu} = 47 grams).

  • The sample mean is a continuous random variable and can vary.

  • The sampling distribution of the sample mean is approximately normal because the sample size is at least 30 (Central Limit Theorem).

  • Assume population standard deviation \sigma = 5.5 grams.

  • Standard error: \frac{5.5}{\sqrt{30}} \approx 1 gram.

  • With another sample, might get a sample mean of 49 grams or 46 grams.
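The standard error quoted above can be checked in a few lines of Python. This is a minimal sketch using the cheeseburger values; the variable names are illustrative and not part of the course material.

```python
import math

sigma = 5.5   # assumed population standard deviation (grams)
n = 30        # sample size

# Standard error of the sample mean: sigma / sqrt(n)
standard_error = sigma / math.sqrt(n)
print(f"Standard error: {standard_error:.2f} grams")
```

A standard error of about 1 gram is why sample means of 46 or 49 grams are plausible results from another sample of 30 burgers.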

Disadvantages of Point Estimate
  1. Almost certainly wrong: The probability that a continuous random variable equals a specific value is zero.

  2. Does not indicate proximity: Doesn't show how close the estimate is to the true parameter. Values like 47, 46, or 49 grams don't indicate closeness to the true mean.

  3. Doesn't reflect sample size: Increasing the sample size should yield more accurate results, but the point estimate does not reflect this effect.
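To illustrate the third point: a point estimate is a single number whatever the sample size, while the standard error shrinks as n grows. The sketch below uses the cheeseburger \sigma = 5.5 grams; the sample sizes 120 and 480 are hypothetical values chosen only for illustration.

```python
import math

sigma = 5.5  # population standard deviation from the cheeseburger example (grams)

# Hypothetical sample sizes: each one is four times the previous
sample_sizes = [30, 120, 480]
standard_errors = [sigma / math.sqrt(n) for n in sample_sizes]

for n, se in zip(sample_sizes, standard_errors):
    print(f"n = {n:>3}: standard error = {se:.3f} grams")
```

Quadrupling the sample size halves the standard error, because n enters the formula through its square root.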

Interval Estimate

  • Estimates an unknown parameter using an interval.

  • Interval is defined by lower and upper limits, calculated by adding and subtracting a margin of error from the point estimate.

  • Example: If \bar{x} = 47 grams, the interval is 47 \pm \text{some value}.

  • The "some value" that is added/subtracted is the margin of error.

  • The margin of error is expressed as the number of standard errors from the point estimate.

  • If standard error = 1 gram, adding/subtracting 2 grams corresponds to adding/subtracting 2 standard errors.

Determining the Number of Standard Errors
  • Aim for a chosen level of confidence, expressed as a percentage (e.g., 95%).

  • An interval with 100% certainty would be infinite (from negative infinity to infinity), which is not useful.

  • For 95% certainty:

    • In a normal distribution, about 95% of values lie within roughly 2 standard deviations of the mean (between -2 and +2; more precisely, 1.96).

    • Margin of error corresponds to approximately two standard errors.

    • Interval limits: point estimate \pm two standard errors.
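The "two standard errors" rule can be checked against the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `central_probability` is an illustrative assumption.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def central_probability(k):
    """Probability that a standard normal value lies between -k and +k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

within_2 = central_probability(2.0)      # the "two standard errors" rule of thumb
within_196 = central_probability(1.96)   # the value usually quoted for 95%

print(f"Within +/-2.00: {within_2:.4f}")
print(f"Within +/-1.96: {within_196:.4f}")
```

Two standard errors capture slightly more than 95%, which is why the exact multiplier for 95% confidence is a little below 2.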

Formula for Confidence Interval (Sigma Known)

  • Assume we can use the normal distribution to compute the interval estimate of \mu, and sigma is known.

  • Interval limits are calculated as:

    • Upper Limit: Sample average + z standard errors (\bar{x} + z \cdot \text{SE}, where \text{SE} = \frac{\sigma}{\sqrt{n}}).

    • Lower Limit: Sample average - z standard errors (\bar{x} - z \cdot \text{SE}, where \text{SE} = \frac{\sigma}{\sqrt{n}}).

  • z depends on the degree of confidence.

  • Excel function: CONFIDENCE.NORM can be used to find the margin of error (z \cdot \text{SE}).
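The formula can be written as a small Python function. This is a sketch under the stated assumptions (normal sampling distribution, \sigma known); the function name `z_confidence_interval` is hypothetical, and it is applied here to the cheeseburger values (\bar{x} = 47, \sigma = 5.5, n = 30).

```python
import math
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Return (lower, upper) limits for mu when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for the chosen confidence
    standard_error = sigma / math.sqrt(n)
    margin = z * standard_error
    return x_bar - margin, x_bar + margin

lower, upper = z_confidence_interval(x_bar=47, sigma=5.5, n=30)
print(f"95% interval: {lower:.2f} g to {upper:.2f} g")
```

The interval comes out at roughly 47 \pm 2 grams, in line with the "two standard errors" reasoning, since the standard error is about 1 gram.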

Interpretation
  • Each sample average yields a different interval estimate.

  • While the sample mean alone doesn't tell us where mu is, the interval built around the sample mean captures mu with the chosen level of confidence.

  • With 95% confidence, about 95 out of 100 intervals built from repeated samples will contain mu between their lower and upper limits.

  • This method provides a good estimate 95% of the time, much better than a point estimator (0% probability of being exactly correct).

  • We will primarily use interval estimates going forward.
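The repeated-sampling interpretation can be illustrated with a small simulation. This is only a sketch: the "true" mean of 47 grams, the normally distributed population, the number of repetitions and the seed are all assumptions made for the simulation, not facts from the example.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(6)  # fixed seed so the run is repeatable

TRUE_MU = 47.0      # hypothetical "true" mean, known only because we simulate
SIGMA = 5.5         # population standard deviation (grams)
N = 30              # size of each sample
REPETITIONS = 2000  # number of simulated samples

z = NormalDist().inv_cdf(0.975)
margin = z * SIGMA / math.sqrt(N)

hits = 0
for _ in range(REPETITIONS):
    sample = [random.gauss(TRUE_MU, SIGMA) for _ in range(N)]
    x_bar = fmean(sample)
    # Does this sample's interval contain the true mean?
    if x_bar - margin <= TRUE_MU <= x_bar + margin:
        hits += 1

coverage = hits / REPETITIONS
print(f"Share of intervals containing mu: {coverage:.3f}")
```

Each simulated sample produces a different interval, and the share of intervals containing the true mean should land close to 0.95.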

Excel Example

  • Using the "Confidence Interval" Excel file; example 1a.

  • Data collected on amounts spent by 64 customers for lunch at a restaurant in Geneva.

  • Variables: Amount spent and satisfaction.

  • Focusing on the "amount spent" variable.

  • Sample size: 64 (n = 64).

  • Average amount spent: 73.88 Swiss francs.

  • Goal: Build an interval estimate from this point estimate.

  • Population standard deviation: \sigma = 14 Swiss francs (from past studies).

  • Confidence level: 95%.

  • Task: Find the margin of error.

Conditions for Validity
  1. Population standard deviation sigma is known (use normal distribution).

  2. Sample size is n = 64 \geq 30 (CLT applies; the sampling distribution of the sample mean is approximately normal).

Sample Description
  • Sample size (n): 64

  • Population standard deviation (\sigma): 14 CHF

  • Sample mean (\bar{x}): 73.9 CHF (point estimate of the amount spent by customers in general)

Interval Construction
  • Build an interval around 73.9 by determining the margin of error to add to and subtract from it.

Degree of Confidence
  • Set at 95%.

  • Most commonly used confidence level.

Significance Level
  • Complement of the confidence level.

  • SL = \alpha = 1 - \text{confidence level}.

  • If confidence level is 95%, the significance level is 5% (risk of being wrong).

Z Value
  • Use the symmetry of the standard normal distribution: the confidence interval is centered on the point estimate.

  • Find z such that z and negative z are equidistant from the mean, and the probability within the interval is 95%.

  • Calculate the upper critical value (the positive z).

  • Excel function: NORM.S.INV(probability) returns the z value with the given cumulative probability to its left.

  • Probability = 95% (center) + 2.5% (left tail) = 97.5%.

  • z = NORM.S.INV(0.975) = 1.96

  • To build our interval, the margin of error will include 1.96 standard errors.
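The same critical value can be obtained outside Excel. The sketch below uses `statistics.NormalDist().inv_cdf`, which plays the role of NORM.S.INV here; the helper name `z_critical` and the extra confidence levels (90%, 99%) are illustrative additions.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value: the z with `confidence` probability between -z and +z."""
    alpha = 1 - confidence
    # Cumulative probability = central area + left tail
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

Higher confidence requires a larger z, and therefore a wider interval for the same standard error.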

Standard Error
  • Standard Error (SE) is the population standard deviation divided by the square root of the sample size.

  • SE = \frac{\sigma}{\sqrt{n}} = \frac{14}{\sqrt{64}} = 1.75

Margin of Error
  • Margin of Error (EM) is the z value multiplied by the standard error: EM = z \cdot \text{SE}.

  • EM = 1.75 \cdot 1.96 = 3.43 CHF

Precision
  • Expresses the margin of error as a proportion of the point estimate.

  • Precision = \frac{EM}{\bar{x}} = \frac{3.4}{73.9} \approx 4.6%

  • The margin of error is 3.4 CHF, or 4.6% of 73.9 CHF.
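The chain of calculations for the restaurant example (z, standard error, margin of error, precision) can be reproduced step by step. This is a minimal sketch using the figures given above; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 64             # sample size
sigma = 14.0       # population standard deviation (CHF)
x_bar = 73.88      # sample mean (CHF)
confidence = 0.95  # confidence level

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
standard_error = sigma / math.sqrt(n)               # SE = sigma / sqrt(n)
margin_of_error = z * standard_error                # EM = z * SE
precision = margin_of_error / x_bar                 # EM relative to the point estimate

print(f"SE = {standard_error:.2f} CHF")
print(f"EM = {margin_of_error:.2f} CHF")
print(f"Precision = {precision:.1%}")
```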

Excel Function
  • Use CONFIDENCE.NORM to find the margin of error directly.

  • CONFIDENCE.NORM(alpha, standard_deviation, sample_size)

  • Alpha (significance level) = 5% (1 - confidence level).

  • Standard deviation = sigma.

  • Sample size = n.

  • CONFIDENCE.NORM(0.05, 14, 64) = 3.43.
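For readers working outside Excel, the same quantity can be computed with a short Python function. The function `confidence_norm` below is a hypothetical stand-in that follows the argument order described above; it is a sketch, not Excel's implementation.

```python
import math
from statistics import NormalDist

def confidence_norm(alpha, standard_dev, size):
    """Margin of error for a mean with known sigma (same argument order as the Excel function)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * standard_dev / math.sqrt(size)

margin = confidence_norm(0.05, 14, 64)
print(f"Margin of error: {margin:.2f} CHF")
```

The result matches the step-by-step calculation: 1.96 standard errors of 1.75 CHF each.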

Confidence Interval Estimate
  • Express the findings.

  • The average spending is 73.9 CHF with a margin of error of 3.4 CHF at a 95% confidence level.

  • The average spending is 73.9 CHF with a precision of 4.6% at a 95% confidence level.

Calculating Limits
  • Lower Limit = point estimate - EM = 73.88 - 3.43 = 70.45 CHF

  • Upper Limit = point estimate + EM = 73.88 + 3.43 = 77.31 CHF

Conclusion
  • With a 95% confidence level, we can say that the population average spending lies between 70.45 CHF and 77.31 CHF.

  • Always include the point estimate, margin of error, and confidence level.

  • Or include the lower and upper limits with the confidence level.

Future Topics

  • What happens if \sigma (sigma) is unknown?

  • Introduce the t-distribution.

  • Discuss proportions as well.