Statistics - Confidence Intervals
Confidence Interval Overview
- A confidence interval gives a range of plausible values for an unknown population parameter, constructed so that we can state a high level of confidence that it contains the true value.
- It extends a point estimate to an interval estimate of a parameter, typically the population mean ($\mu$).
Confidence Interval for the Population Mean
Large Sample Size (n >= 30)
- Central Limit Theorem: For large sample sizes, the sampling distribution of the sample mean ($\bar{x}$) is approximately normal.
- The formula for a 95% confidence interval for the mean is:
$\bar{x} \pm Z_{0.025} \frac{s}{\sqrt{n}}$, where:
- $Z_{0.025}$ is the critical value from the z-distribution (1.96 for 95% confidence).
- $s$ is the sample standard deviation.
Example Calculation
- Sample of 80 shops:
- Mean cost of repair ($\bar{x}$) = $472.36
- Standard deviation ($s$) = $62.35
- Confidence interval:
$472.36 \pm 1.96 \frac{62.35}{\sqrt{80}} \rightarrow [458.70, 486.02]$
- Interpretation: We are 95% confident that the true mean repair cost lies within this interval.
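The calculation above can be reproduced with a short Python sketch. The function name `z_interval` is a hypothetical helper chosen for illustration, and the critical value comes from the standard library's `statistics.NormalDist` rather than from a printed table, so it is very slightly more precise than 1.96.

```python
import math
from statistics import NormalDist

def z_interval(mean, sd, n, confidence=0.95):
    """Two-sided large-sample confidence interval for a population mean."""
    alpha = 1 - confidence
    # Critical value Z_{alpha/2}; about 1.96 when confidence = 0.95.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Repair-cost example: 80 shops, mean 472.36, standard deviation 62.35.
low, high = z_interval(472.36, 62.35, 80)
print(f"[{low:.2f}, {high:.2f}]")  # prints [458.70, 486.02]
```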
Understanding Confidence
- Misinterpretation: It is incorrect to say there is a 95% probability that the true mean falls within a particular computed interval. The parameter ($\mu$) is fixed, not random.
- The randomness lies in the sampling process: each new sample produces a different interval.
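A small simulation can make this concrete. The sketch below is only an illustration under assumed values: the population mean of 50, standard deviation of 10, sample size of 80, and the number of trials are all hypothetical choices, and the population is assumed normal. The true mean never changes; only the intervals do.

```python
import math
import random
from statistics import mean, stdev

def coverage(true_mean=50.0, true_sd=10.0, n=80, trials=2000, z=1.96, seed=0):
    """Fraction of simulated 95% intervals that contain the fixed true mean."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        xbar = mean(sample)
        margin = z * stdev(sample) / math.sqrt(n)
        # The parameter is fixed; the interval endpoints vary from sample to sample.
        if xbar - margin <= true_mean <= xbar + margin:
            hits += 1
    return hits / trials

print(coverage())  # expected to be close to 0.95
```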
Calculating Confidence Levels
- If a 95% interval is computed from each of many independent samples, about 95% of those intervals will contain the true mean ($\mu$).
- For other confidence levels:
- 90% confidence: Critical value = 1.645
- 99% confidence: Critical value = 2.575
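These critical values can also be computed instead of looked up. The following sketch uses the standard library's `NormalDist.inv_cdf`; `z_critical` is a hypothetical helper name. The exact 99% value is about 2.5758, which tables round to either 2.575 or 2.576.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value Z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: {z_critical(level):.3f}")
```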
Precision in Confidence Intervals
- Factors affecting interval width:
- Sample Standard Deviation (s): Larger values widen the interval.
- Confidence Level: Higher confidence levels widen the interval (the critical value $Z_{\alpha/2}$ increases).
- Sample Size (n): Larger sample sizes narrow the interval. Width is inversely proportional to $\sqrt{n}$.
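As a quick check of the square-root relationship, the sketch below compares the width of a 95% interval at two sample sizes. It reuses the standard deviation from the repair-cost example; the larger sample size of 320 is a hypothetical value picked to be four times 80.

```python
import math

def ci_width(sd, n, z=1.96):
    """Full width of a two-sided z-interval: twice the margin of error."""
    return 2 * z * sd / math.sqrt(n)

w80 = ci_width(62.35, 80)
w320 = ci_width(62.35, 320)  # four times the sample size
print(w80 / w320)  # quadrupling n halves the width, so the ratio is about 2
```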
Sample Size Determination
- To achieve a specified maximum error ($W$, the half-width of the interval) in a 95% confidence interval, choose:
$n = \left( \frac{Z_{0.025}\, s}{W} \right)^2$
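A minimal sketch of this calculation follows. The target error of 10 is a hypothetical value chosen for illustration, the standard deviation is borrowed from the repair-cost example as a planning estimate, and the result is rounded up because a sample size must be a whole number.

```python
import math

def required_sample_size(sd, max_error, z=1.96):
    """Smallest n whose 95% margin of error does not exceed max_error."""
    return math.ceil((z * sd / max_error) ** 2)

# Hypothetical target: estimate the mean repair cost to within 10.
n = required_sample_size(62.35, 10.0)
print(n)  # prints 150
```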
Smaller Sample Size (n < 30)
- If the sample size is small and the population distribution is normal:
- Use the t-distribution instead of the normal distribution.
- The t-distribution is similar in shape but has heavier tails, which accounts for the extra uncertainty from estimating the standard deviation with $s$.
- The confidence interval formula becomes:
$\bar{x} \pm t_{n-1} \frac{s}{\sqrt{n}}$
- Where $t_{n-1}$ is the critical t-value with $(n-1)$ degrees of freedom.
Example with t-Distribution
- Sample mean ($\bar{x}$) = 61,492, sample standard deviation ($s$) = 3,035, sample size ($n$) = 10:
- Degrees of freedom (df) = 9.
- Critical value for 95% confidence ($t_{0.025}$, 9 df) = 2.262.
- Confidence interval calculation:
$61,492 \pm 2.262 \frac{3,035}{\sqrt{10}} \rightarrow [59,321.04, 63,662.96]$
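The same arithmetic in Python, as a sketch. Python's standard library has no t-distribution, so the critical value 2.262 is passed in as read from a t-table; with SciPy installed, `scipy.stats.t.ppf(0.975, df=9)` would be one way to compute it. `t_interval` is a hypothetical helper name.

```python
import math

def t_interval(mean, sd, n, t_crit):
    """Two-sided small-sample interval; t_crit is taken from a t-table."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Example above: n = 10, so df = 9 and the 95% critical value is 2.262.
low, high = t_interval(61_492, 3_035, 10, t_crit=2.262)
print(f"[{low:,.2f}, {high:,.2f}]")  # prints [59,321.04, 63,662.96]
```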
One-Sided Confidence Intervals
- For one-sided bounds, replace $Z_{0.025}$ with $Z_{\alpha}$ (for example, $Z_{0.05} = 1.645$ for 95% confidence) in:
- Lower confidence bound: $\bar{x} - Z_{\alpha} \frac{s}{\sqrt{n}}$
- Upper confidence bound: $\bar{x} + Z_{\alpha} \frac{s}{\sqrt{n}}$
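To illustrate, the sketch below computes both one-sided 95% bounds for the repair-cost sample (mean 472.36, standard deviation 62.35, n = 80). The helper name `one_sided_bounds` is hypothetical. Each bound is meant to be reported on its own; taken together they do not form a 95% two-sided interval.

```python
import math
from statistics import NormalDist

def one_sided_bounds(mean, sd, n, confidence=0.95):
    """Return (lower bound, upper bound), each a separate one-sided bound."""
    # All of alpha goes into one tail, so use Z_alpha (about 1.645 for 95%).
    z_alpha = NormalDist().inv_cdf(confidence)
    margin = z_alpha * sd / math.sqrt(n)
    return mean - margin, mean + margin

lower, upper = one_sided_bounds(472.36, 62.35, 80)
print(f"lower bound: {lower:.2f}")
print(f"upper bound: {upper:.2f}")
```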
Confidence Interval for Population Proportion
Large Samples
- Proportions can also be estimated using a similar formula:
- 95% confidence interval for proportion ($p$):
$\hat{p} \pm 1.96 \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
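A short sketch of the large-sample proportion interval follows. The counts (120 successes in a sample of 400) are hypothetical values used only to exercise the formula, and `proportion_interval` is an illustrative helper name.

```python
import math

def proportion_interval(successes, n, z=1.96):
    """Large-sample 95% confidence interval for a population proportion."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical data: 120 successes out of 400 trials, so p_hat = 0.30.
low, high = proportion_interval(120, 400)
print(f"[{low:.3f}, {high:.3f}]")
```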
Adjusted Confidence Interval
- If sample size n is small, use:
- Adjusted sample size: $n + 4$ (add 2 successes and 2 failures).
- Adjusted sample proportion: $\tilde{p} = \frac{x+2}{n+4}$, where $x$ is the number of successes.
Example for Proportion
For a sample of 10 cracked tiles out of 125:
- Adjusted sample size = 129, adjusted proportion = (10 + 2)/129 ≈ 0.093
- 99% confidence interval:
$0.093 \pm 2.575 \sqrt{\frac{0.093(1-0.093)}{129}}$
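The adjusted interval can be worked through in Python as a sketch. Here the adjusted proportion is computed directly from the counts (10 cracked out of 125), which gives (10 + 2)/129, about 0.093. The helper name `adjusted_proportion_interval` is hypothetical, and the critical value 2.575 is the tabled 99% value used in these notes.

```python
import math

def adjusted_proportion_interval(successes, n, z):
    """Interval using the 'add 2 successes and 2 failures' adjustment."""
    n_adj = n + 4
    p_adj = (successes + 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj, p_adj - margin, p_adj + margin

# 10 cracked tiles out of 125, 99% confidence (critical value 2.575).
p_adj, low, high = adjusted_proportion_interval(10, 125, z=2.575)
print(f"adjusted proportion: {p_adj:.3f}")
print(f"99% interval: [{low:.3f}, {high:.3f}]")
```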
Summary
- Understanding confidence intervals is crucial for making inferences about population parameters.
- Different approaches apply based on sample size and distribution characteristics.
- Avoid the common misinterpretation that a computed interval contains the parameter with 95% probability; the confidence level describes the procedure, not any single interval.