Statistics - Confidence Intervals

Confidence Interval Overview

A confidence interval gives a range of plausible values for an unknown population parameter, constructed so that we have a high degree of confidence that the range includes the true value.
It extends a point estimate to an interval estimate of the parameter, typically the population mean ($\mu$).

Confidence Interval for the Population Mean

Large Sample Size (n >= 30)

Central Limit Theorem: For large sample sizes, the sampling distribution of the sample mean ($\bar{x}$) is approximately normal.
The formula for a 95% confidence interval for the mean is: $\bar{x} \pm Z_{0.025} \frac{s}{\sqrt{n}}$, where:
- $Z_{0.025}$ is the critical value from the z-distribution (1.96 for 95% confidence).
- $s$ is the sample standard deviation.

Example Calculation

Sample of 80 shops:
- Mean cost of repair ($\bar{x}$) = \$472.36
- Standard deviation ($s$) = \$62.35
Confidence interval:
$472.36 \pm 1.96 \frac{62.35}{\sqrt{80}} \rightarrow [458.70, 486.02]$
Interpretation: We are 95% confident that the true mean cost is within this interval.
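
The calculation above can be reproduced with a short sketch. The function name `z_interval` and its parameters are illustrative choices, not part of the notes; the critical value is taken from Python's standard-library `statistics.NormalDist` rather than a table.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean, s, n, confidence=0.95):
    """Large-sample confidence interval for a population mean.

    Assumes n is large enough (n >= 30) for the normal approximation.
    """
    alpha = 1 - confidence
    # Two-sided critical value, about 1.96 when confidence = 0.95
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * s / sqrt(n)
    return mean - margin, mean + margin


# Repair-cost example: 80 shops, mean 472.36, standard deviation 62.35
low, high = z_interval(472.36, 62.35, 80)
print(f"[{low:.2f}, {high:.2f}]")  # [458.70, 486.02]
```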

Understanding Confidence

Misinterpretation: It is incorrect to say there's a 95% probability that the true mean falls within the interval. The parameter ($\mu$) is fixed, not random.
Randomness applies to the sampling process that generates different intervals.
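
One way to see that the randomness lives in the sampling process is a small simulation: fix a true mean, draw many samples, and count how often the computed interval covers it. This is only a sketch; the population values (`mu=50`, `sigma=10`), the sample size, the number of trials, and the seed are all arbitrary assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev


def coverage(mu=50.0, sigma=10.0, n=40, trials=2000, seed=1):
    """Fraction of simulated 95% intervals that contain the fixed mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value
    hits = 0
    for _ in range(trials):
        # A fresh random sample gives a fresh random interval each time
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        margin = z * stdev(sample) / sqrt(n)
        if abs(fmean(sample) - mu) <= margin:
            hits += 1
    return hits / trials


print(coverage())  # expected to be close to 0.95
```

The true mean never changes between trials; only the intervals do, and roughly 95% of them capture it.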

Calculating Confidence Levels

If many independent samples were drawn and a 95% interval computed from each, about 95% of those intervals would contain the true mean ($\mu$).
Critical values for other confidence levels:
- 90% confidence: Critical value = 1.645
- 99% confidence: Critical value = 2.575
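
These critical values can be checked against the standard normal quantile function. The helper name `z_critical` is an illustrative assumption. Note that the computed 99% value rounds to 2.576, while tables often list it as 2.575.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value of the standard normal distribution."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: {z_critical(level):.3f}")
```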

Precision in Confidence Intervals

Factors affecting interval width:
1. Sample Standard Deviation (s): Larger values widen the interval.
2. Confidence Level: Higher confidence levels widen the interval (the critical value $Z_{\alpha/2}$ increases).
3. Sample Size (n): Larger sample sizes narrow the interval. Width is inversely proportional to $\sqrt{n}$.
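
The three effects can be illustrated numerically. The sketch below assumes the large-sample formula, so the full width is $2 Z s / \sqrt{n}$; the function name `interval_width` is hypothetical, and the baseline reuses the repair-cost figures ($s = 62.35$, $n = 80$).

```python
from math import sqrt


def interval_width(s, n, z=1.96):
    """Full width of a large-sample confidence interval for the mean."""
    return 2 * z * s / sqrt(n)


base = interval_width(s=62.35, n=80)

# Doubling s doubles the width
ratio_s = interval_width(s=2 * 62.35, n=80) / base
# Raising confidence from 95% to 99% (z = 2.575) widens the interval
ratio_z = interval_width(s=62.35, n=80, z=2.575) / base
# Quadrupling n halves the width, since width scales as 1/sqrt(n)
ratio_n = interval_width(s=62.35, n=4 * 80) / base

print(ratio_s, ratio_z, ratio_n)
```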

Sample Size Determination

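To achieve a specified maximum error ($W$, the half-width of the interval) in a 95% confidence interval, solve $W = Z_{0.025} \frac{s}{\sqrt{n}}$ for $n$:
$n = \left(\frac{Z_{0.025}\, s}{W}\right)^2$
Round the result up to the next whole number.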
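
A minimal sketch of the sample-size calculation follows. The function name is hypothetical, and the target error of 10 is an assumed value for illustration (the notes do not specify one); the standard deviation is borrowed from the repair-cost example.

```python
from math import ceil


def required_sample_size(s, max_error, z=1.96):
    """Smallest n whose large-sample interval has half-width <= max_error."""
    # Round up: a fractional sample is impossible, and rounding down
    # would leave the interval slightly too wide.
    return ceil((z * s / max_error) ** 2)


# With s = 62.35 and an assumed target half-width of 10
n_needed = required_sample_size(62.35, 10)
print(n_needed)  # 150
```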

Smaller Sample Size (n < 30)

If the sample size is small and the population distribution is approximately normal, use the t-distribution instead of the normal distribution:
- The t-distribution is similar in shape but has heavier tails, reflecting the extra uncertainty from estimating the standard deviation with $s$.
The confidence interval formula becomes:
$\bar{x} \pm t_{n-1} \frac{s}{\sqrt{n}}$
where $t_{n-1}$ is the critical t-value with $n-1$ degrees of freedom.

Example with t-Distribution

Sample mean ($\bar{x}$) = 61,492, sample standard deviation (s) = 3,035, sample size (n) = 10:
Degrees of freedom (df) = 9.
Critical value for 95% confidence ($t_{0.025}$, 9 df) = 2.262.
Confidence interval calculation:
$61{,}492 \pm 2.262 \frac{3{,}035}{\sqrt{10}} \rightarrow [59{,}321.04, 63{,}662.96]$
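
The same arithmetic as a sketch in code. Python's standard library has no t-distribution quantile function, so the critical value is passed in from a table (2.262 for 9 degrees of freedom, as above); with SciPy installed, `scipy.stats.t.ppf(0.975, 9)` would be one way to obtain it. The function name `t_interval` is an illustrative assumption.

```python
from math import sqrt


def t_interval(mean, s, n, t_crit):
    """Small-sample confidence interval for the mean.

    t_crit is the critical t-value for n - 1 degrees of freedom,
    supplied by the caller (for example, from a t-table).
    """
    margin = t_crit * s / sqrt(n)
    return mean - margin, mean + margin


low_t, high_t = t_interval(61492, 3035, 10, t_crit=2.262)
print(f"[{low_t:,.2f}, {high_t:,.2f}]")  # [59,321.04, 63,662.96]
```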

One-Sided Confidence Intervals

For one-sided bounds, replace $Z_{0.025}$ with $Z_{\alpha}$ (for example, $Z_{0.05} = 1.645$ for a 95% bound):
- Lower confidence bound: $\bar{x} - Z_{\alpha} \frac{s}{\sqrt{n}}$
- Upper confidence bound: $\bar{x} + Z_{\alpha} \frac{s}{\sqrt{n}}$
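
A short sketch of the one-sided bounds, assuming a large sample. The function names are hypothetical, and the inputs reuse the repair-cost figures (mean 472.36, standard deviation 62.35, 80 shops) purely for illustration.

```python
from math import sqrt
from statistics import NormalDist


def lower_bound(mean, s, n, confidence=0.95):
    """One-sided lower confidence bound for the mean."""
    # All of alpha sits in one tail, so use Z_alpha rather than Z_(alpha/2)
    z = NormalDist().inv_cdf(confidence)
    return mean - z * s / sqrt(n)


def upper_bound(mean, s, n, confidence=0.95):
    """One-sided upper confidence bound for the mean."""
    z = NormalDist().inv_cdf(confidence)
    return mean + z * s / sqrt(n)


lb = lower_bound(472.36, 62.35, 80)
ub = upper_bound(472.36, 62.35, 80)
print(f"mean > {lb:.2f} with 95% confidence")
print(f"mean < {ub:.2f} with 95% confidence")
```

Each bound is a separate 95% statement; together they do not form a 95% two-sided interval.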

Confidence Interval for Population Proportion

Large Samples

Proportions can also be estimated using a similar formula:
- 95% confidence interval for proportion ($p$), where $\hat{p} = x/n$ is the sample proportion of $x$ successes in $n$ trials:
  $\hat{p} \pm 1.96 \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
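
As a sketch, the large-sample proportion interval can be coded directly from the formula. The function name and the counts used here (120 successes in 400 trials) are hypothetical values chosen for illustration; they do not come from the notes.

```python
from math import sqrt


def proportion_interval(x, n, z=1.96):
    """Large-sample confidence interval for a population proportion."""
    p_hat = x / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical data: 120 successes out of 400 trials
p_low, p_high = proportion_interval(120, 400)
print(f"[{p_low:.3f}, {p_high:.3f}]")
```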

Adjusted Confidence Interval

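If the sample size $n$ is small, use:
- Adjusted sample size: $\tilde{n} = n + 4$ (add 2 successes and 2 failures).
- Adjusted sample proportion: $\tilde{p} = (x+2)/\tilde{n}$
The interval is then $\tilde{p} \pm Z_{\alpha/2} \sqrt{\frac{\tilde{p}(1-\tilde{p})}{\tilde{n}}}$.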

Example for Proportion

For a sample in which 10 of 125 tiles are cracked:

Adjusted sample size = 125 + 4 = 129, adjusted proportion = (10 + 2)/129 ≈ 0.093
99% confidence interval:
$0.093 \pm 2.575 \sqrt{\frac{0.093(1-0.093)}{129}} \rightarrow [0.027, 0.159]$
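
The adjusted interval can be sketched in code by working directly from the counts (10 cracked tiles out of 125) rather than from a rounded proportion. The function name is an illustrative assumption, and the 99% critical value 2.575 is the table value used in these notes.

```python
from math import sqrt


def adjusted_proportion_interval(x, n, z):
    """Adjusted interval: add 2 successes and 2 failures before computing."""
    n_adj = n + 4
    p_adj = (x + 2) / n_adj
    margin = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj, p_adj - margin, p_adj + margin


p_adj, adj_low, adj_high = adjusted_proportion_interval(10, 125, z=2.575)
print(f"adjusted proportion = {p_adj:.3f}")
print(f"99% interval = [{adj_low:.3f}, {adj_high:.3f}]")
```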

Summary

Confidence intervals are central to drawing inferences about population parameters.
The appropriate method depends on the sample size and the shape of the population distribution.
Report intervals with the correct interpretation: the confidence level describes the procedure, not the probability that a particular interval contains the parameter.