Confidence Interval Estimation in Statistics
Basic Statistics for Business Research Methods
Confidence Interval Estimation: One Population
Copyright © 2013 Pearson Education
Goals of the Session
After completing this session, you should be able to:
Distinguish between a point estimate and a confidence interval estimate
Construct and interpret a confidence interval estimate for a single population mean using both the Z and t distributions
Form and interpret a confidence interval estimate for a single population proportion
Create confidence interval estimates for the variance of a normal population
Determine the required sample size to estimate a mean or proportion within a specified margin of error
Contents of the Chapter
Confidence Intervals for the Population Mean, μ
when Population Variance σ² is Known
when Population Variance σ² is Unknown
Confidence Intervals for the Population Proportion, P (large samples)
Confidence interval estimates for the variance of a normal population
Finite population corrections
Sample-size determination
Properties of Point Estimators
An estimator of a population parameter is:
a random variable that depends on sample information
whose value provides an approximation to this unknown parameter
A specific value of that random variable is called an estimate.
Point and Interval Estimates
A point estimate is a single number
A confidence interval provides additional information about variability
Lower Confidence Limit < Point Estimate < Upper Confidence Limit
Width of confidence interval: the distance from the lower limit to the upper limit
Point Estimates
We can estimate a Population Parameter with a Sample Statistic (a Point Estimate)
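Mean: population parameter μ, estimated by the sample mean $\bar{x}$
Proportion: population parameter P, estimated by the sample proportion $\hat{p}$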
Unbiasedness
A point estimator $\hat{\theta}$ is said to be an unbiased estimator of the parameter θ if its expected value is equal to that parameter: $E(\hat{\theta}) = \theta$.
Examples:
The sample mean is an unbiased estimator of μ
The sample variance s² is an unbiased estimator of σ²
The sample proportion is an unbiased estimator of P
Bias
Let $\hat{\theta}$ be an estimator of θ.
The bias in $\hat{\theta}$ is defined as the difference between its mean and θ: $\text{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta$.
The bias of an unbiased estimator is 0.
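The idea of bias can be checked numerically. The sketch below is only an illustration: the normal population, the sample size of 5, the seed, and the function name `average_variance_estimates` are all assumptions made for the example. It averages two variance estimators over many simulated samples. The average of s² (divisor n - 1) should land close to the true variance, while the divisor-n version should land near (n - 1)/n of it.

```python
import random
import statistics

def average_variance_estimates(n=5, sigma=2.0, reps=20000, seed=1):
    """Average two variance estimators over many simulated normal samples.

    All defaults are illustrative assumptions, not values from the slides.
    """
    rng = random.Random(seed)
    total_s2 = 0.0        # running sum of s^2 (divisor n - 1)
    total_div_n = 0.0     # running sum of the divisor-n estimator
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        total_s2 += statistics.variance(sample)      # unbiased for sigma^2
        total_div_n += statistics.pvariance(sample)  # biased downward
    return total_s2 / reps, total_div_n / reps

avg_s2, avg_div_n = average_variance_estimates()
print(f"true variance             : {2.0 ** 2:.2f}")
print(f"average of s^2 (n - 1)    : {avg_s2:.2f}")
print(f"average with divisor n    : {avg_div_n:.2f}")
print(f"estimated bias, divisor n : {avg_div_n - 2.0 ** 2:.2f}")
```

With n = 5 the divisor-n estimator has expected value (4/5)σ² = 3.2, so its bias is about -0.8, while the bias of s² is 0.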
Most Efficient Estimator
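Suppose there are several unbiased estimators of θ:
The most efficient estimator or the minimum variance unbiased estimator of θ is the unbiased estimator with the smallest variance.
Let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two unbiased estimators of θ, based on the same number of sample observations.
$\hat{\theta}_1$ is said to be more efficient than $\hat{\theta}_2$ if: $\text{Var}(\hat{\theta}_1) < \text{Var}(\hat{\theta}_2)$
The relative efficiency of $\hat{\theta}_1$ with respect to $\hat{\theta}_2$ is the ratio of their variances: $\frac{\text{Var}(\hat{\theta}_2)}{\text{Var}(\hat{\theta}_1)}$, so a value above 1 means $\hat{\theta}_1$ is the more efficient estimator.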
Confidence Interval Estimation
How much uncertainty is associated with a point estimate of a population parameter?
An interval estimate provides more information about a population characteristic than does a point estimate
Such interval estimates are called confidence interval estimates.
Confidence Interval Estimate
An interval gives a range of values:
Takes into consideration variation in sample statistics from sample to sample
Based on observations from one sample
Gives information about closeness to unknown population parameters
Stated in terms of level of confidence
Can never be 100% confident
Confidence Interval and Confidence Level
If P(a < θ < b) = 1 - α, then the interval from a to b is called a 100(1 - α)% confidence interval of θ.
The quantity 100(1 - α)% is called the confidence level of the interval.
α is between 0 and 1
In repeated samples of the population, the true value of the parameter θ would be contained in 100(1 - α)% of intervals calculated this way.
The confidence interval calculated in this manner is written as a < θ < b with 100(1 - α)% confidence.
Estimation Process
When the population mean μ is unknown:
Population (mean μ is unknown) → Random sample (mean $\bar{x}$ = 50) → Confidence statement
Confidence statement: I am 95% confident that μ is between 40 and 60.
Confidence Level (1 - α)
Suppose the confidence level = 95%
Also written as: (1 - α) = 0.95
A relative frequency interpretation:
From repeated samples, 95% of all the confidence intervals that can be constructed from samples of size n will contain the unknown true parameter.
A specific interval either will contain or will not contain the true parameter.
No probability is involved in a specific interval.
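The relative frequency interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions: a normal population with μ = 50 and a known σ = 10, samples of size 25, and intervals of the form point estimate ± z·σ/√n. The function name `coverage` and every default value are hypothetical choices for the example.

```python
import random
from statistics import NormalDist, fmean

def coverage(mu=50.0, sigma=10.0, n=25, level=0.95, reps=5000, seed=7):
    """Share of simulated intervals (sigma known) that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z value for alpha/2 per tail
    margin = z * sigma / n ** 0.5                  # same margin for every sample
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        # Each specific interval either contains mu or it does not.
        if xbar - margin < mu < xbar + margin:
            hits += 1
    return hits / reps

rate = coverage()
print(f"share of intervals containing mu: {rate:.3f}")
```

The printed share should be close to 0.95. Any single interval in the loop is simply a hit or a miss, which is why no probability attaches to a specific interval.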
General Formula
The general formula for all confidence intervals is:
Formula: $\text{Point Estimate} \pm \text{Margin of Error}$
The value of the margin of error depends on the desired level of confidence.
Confidence Intervals
Population Mean: σ² Known or σ² Unknown
Population Proportion
Population Variance (From normally distributed populations)
Confidence Interval Estimation for the Mean (σ² Known)
Assumptions
Population variance σ² is known
Population is normally distributed
If the population is not normal, use a large sample.
Confidence Interval Estimate:
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
where $z_{\alpha/2}$ is the normal distribution value for a probability of $\alpha/2$ in each tail.
Confidence Limits
The confidence interval is defined by its two endpoints:
Upper confidence limit: $UCL = \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
Lower confidence limit: $LCL = \bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
Margin of Error
The confidence interval can also be represented as:
Formula: $\bar{x} \pm ME$, where $ME = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
The interval width, w, is equal to twice the margin of error:
Formula: $w = 2(ME)$
Reducing the Margin of Error
The margin of error can be reduced if:
The population standard deviation can be reduced (σ ↓)
The sample size is increased (n ↑)
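The confidence level is decreased, (1 - α) ↓
Finding $z_{\alpha/2}$
Consider a 95% confidence interval: 1 - α = 0.95, so α = 0.05 and α/2 = 0.025 in each tail.
Z units: z = -1.96 (lower confidence limit), z = 0 (point estimate), z = 1.96 (upper confidence limit)
X units: Lower Confidence Limit < Point Estimate < Upper Confidence Limit
Find $z_{0.025} = 1.96$ from the standard normal distribution table; the limits lie 1.96 standard errors below and above the point estimate.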
Common Levels of Confidence
Commonly used confidence levels are:
90%, 95%, 98%, and 99%
Confidence Level, 100(1 - α)% | $z_{\alpha/2}$ value |
|---|---|
80% | 1.28 |
90% | 1.645 |
95% | 1.96 |
98% | 2.33 |
99% | 2.58 |
99.8% | 3.09 |
99.9% | 3.29 |
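The z values in a table like this can be reproduced rather than looked up. A minimal sketch using Python's standard library follows; the helper name `z_value` is an assumption made for the example.

```python
from statistics import NormalDist

def z_value(confidence_level):
    """Standard normal value that leaves alpha/2 in the upper tail."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: z = {z_value(level):.3f}")
```

For example, `z_value(0.95)` is approximately 1.96, matching the table entry for 95%.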
Intervals and Level of Confidence
Confidence intervals extend from $\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ to $\bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$.
The percentage of intervals constructed that will contain μ is 100(1 - α)%; 100(α)% do not.
Example:
A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms.
Population standard deviation is 0.35 ohms.
Problem: Determine a 95% confidence interval for the true mean resistance of the population.
Solution:
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96 \left(\frac{0.35}{\sqrt{11}}\right) = 2.20 \pm 0.2068$
$1.9932 < \mu < 2.4068$
Interpretation
We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms.
Although the true mean may or may not be within this interval, 95% of intervals formed in this manner will contain the true mean.
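The circuit example can be checked with a few lines of code. This is a sketch assuming the σ-known interval (sample mean ± z·σ/√n); the function name `z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, level=0.95):
    """Confidence interval for a mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sigma / sqrt(n)   # margin of error
    return xbar - margin, xbar + margin

# Circuit example: n = 11, sample mean 2.20 ohms, sigma = 0.35 ohms
lower, upper = z_interval(2.20, 0.35, 11)
print(f"{lower:.4f} < mu < {upper:.4f}")   # prints: 1.9932 < mu < 2.4068
```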
Population Mean: σ² Known or σ² Unknown
Population Proportion
Population Variance
Confidence Interval Estimation for the Mean (σ² Unknown)
Student’s t Distribution
Consider a random sample of n observations with mean $\bar{x}$ and standard deviation s from a normally distributed population with mean μ.
The variable $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$ follows the Student’s t distribution with (n - 1) degrees of freedom.
Characteristics of the Student’s t Distribution
The t is a family of distributions.
The t value depends on degrees of freedom (d.f.).
The number of observations that are free to vary after sample mean has been calculated: d.f. = n - 1
Shape of the t Distribution
t distributions are bell-shaped and symmetric but have ‘fatter’ tails than the normal distribution.
Note: t → Z as n increases.
Student’s t Table
Upper Tail Area
df | .10 | .05 | .025 |
|---|---|---|---|
1 | 3.078 | 6.314 | 12.706 |
2 | 1.886 | 2.920 | 4.303 |
3 | 1.638 | 2.353 | 3.182 |
t Distribution Values
With comparison to the Z value:
Confidence Level | t (10 d.f.) | t (20 d.f.) | t (30 d.f.) | Z |
|---|---|---|---|---|
.80 | 1.372 | 1.325 | 1.310 | 1.282 |
.90 | 1.812 | 1.725 | 1.697 | 1.645 |
.95 | 2.228 | 2.086 | 2.042 | 1.960 |
.99 | 3.169 | 2.845 | 2.750 | 2.576 |
Using the t Distribution
If the population standard deviation σ is unknown, we can substitute the sample standard deviation, s.
This introduces extra uncertainty because s varies from sample to sample, so we use the t distribution instead of the normal distribution.
Confidence Interval Estimation for the Mean (σ² Unknown)
Assumptions
Population standard deviation is unknown
Population is normally distributed
If population is not normal, use a large sample
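Use Student’s t distribution.
Confidence Interval Estimate:
$\bar{x} \pm t_{n-1, \alpha/2} \frac{s}{\sqrt{n}}$
where $t_{n-1, \alpha/2}$ is the value of the t distribution with n - 1 d.f. that leaves an area of α/2 in each tail.
Margin of Error
The confidence interval can also be expressed as $\bar{x} \pm ME$, where:
$ME = t_{n-1, \alpha/2} \frac{s}{\sqrt{n}}$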
Example
A random sample of n = 25 has $\bar{x}$ = 50 and s = 8.
Problem: Form a 95% confidence interval for μ.
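One way to work this example is sketched below. Python's standard library has no t quantile function, so the critical value for 24 degrees of freedom and 0.025 in the upper tail (2.0639) is typed in from a t table. That constant and the function name `t_interval` are assumptions of this sketch, not something the slides provide.

```python
from math import sqrt

def t_interval(xbar, s, n, t_value):
    """Confidence interval for a mean when sigma is unknown.

    t_value must be the t critical value for n - 1 degrees of freedom.
    """
    margin = t_value * s / sqrt(n)   # margin of error
    return xbar - margin, xbar + margin

T_24_0025 = 2.0639   # from a t table: d.f. = 24, upper-tail area 0.025

lower, upper = t_interval(50, 8, 25, T_24_0025)
print(f"{lower:.3f} < mu < {upper:.3f}")   # prints: 46.698 < mu < 53.302
```

The margin of error here is 2.0639 × 8/√25 = 3.302, slightly wider than the 1.96 × 8/√25 = 3.136 that the normal value would give.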
Population Proportion Confidence Intervals
For Population Proportion
An interval estimate for the population proportion (P) can be calculated by adding an allowance for uncertainty to the sample proportion ($\hat{p}$).
Confidence Intervals for the Population Proportion
The distribution of the sample proportion is approximately normal if the sample size is large, with standard deviation $\sigma_{\hat{p}} = \sqrt{\frac{P(1-P)}{n}}$. Because P is unknown, this is estimated from the sample by $\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$.
The confidence interval for the population proportion is given by:
Formula: $\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Example
A random sample of 100 people shows that 25 are left-handed.
Problem: Form a 95% confidence interval for the true proportion of left-handers.
Interpretation
We are 95% confident that the true proportion of left-handers in the population is between 16.51% and 33.49%.
Although the interval may or may not contain the true proportion, 95% of intervals formed from samples of size 100 in this manner will contain the true proportion.
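The left-hander interval can be reproduced with a short script. This is a sketch of the large-sample interval (sample proportion ± z times its estimated standard deviation); the function name `proportion_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, level=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)   # margin of error
    return p_hat - margin, p_hat + margin

# 25 left-handers in a random sample of 100 people
lower, upper = proportion_interval(25, 100)
print(f"{lower:.4f} < P < {upper:.4f}")   # prints: 0.1651 < P < 0.3349
```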
Confidence Interval Estimation for the Variance
Population Variance
Confidence intervals for the population variance are based on the sample variance, s²
Assumed: the population is normally distributed
Goal: Form a confidence interval for the population variance, σ².
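Chi-Square Distribution
The random variable $\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma^2}$ follows a chi-square distribution with (n - 1) degrees of freedom.
The confidence interval for the population variance uses the chi-square values that cut off an area of α/2 in each tail: $\frac{(n-1)s^2}{\chi^2_{n-1, \alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{n-1, 1-\alpha/2}}$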
Example of Speed Testing
Example:
Testing the speed of computer processors.
Given: Sample size = 17, Sample mean = 3004, Sample std dev = 74.
Determine the 95% confidence interval for σ².
Finding the Chi-Square Values
n = 17 so the chi-square distribution has (n – 1) = 16 degrees of freedom.
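α = 0.05, so use the chi-square values with area 0.025 in each tail: $\chi^2_{16, 0.025} = 28.845$ and $\chi^2_{16, 0.975} = 6.908$.
Calculating the Confidence Limits
$\frac{(17-1)(74)^2}{28.845} < \sigma^2 < \frac{(17-1)(74)^2}{6.908}$, which gives $3037 < \sigma^2 < 12683$.
Converting to standard deviation, we are 95% confident that the population standard deviation of CPU speed is between 55.1 and 112.6 MHz.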
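The processor example can be verified as follows. The standard library has no chi-square quantile function, so the two critical values for 16 degrees of freedom (28.845 and 6.908) are entered from a chi-square table. Those constants and the function name `variance_interval` are assumptions of this sketch.

```python
from math import sqrt

def variance_interval(s, n, chi2_upper, chi2_lower):
    """Confidence interval for the variance of a normal population.

    chi2_upper cuts off alpha/2 in the upper tail, chi2_lower in the lower tail.
    """
    sum_sq = (n - 1) * s ** 2
    return sum_sq / chi2_upper, sum_sq / chi2_lower

CHI2_16_0025 = 28.845   # table value: 16 d.f., upper-tail area 0.025
CHI2_16_0975 = 6.908    # table value: 16 d.f., upper-tail area 0.975

var_lower, var_upper = variance_interval(74, 17, CHI2_16_0025, CHI2_16_0975)
print(f"variance: {var_lower:.0f} to {var_upper:.0f}")
print(f"std dev : {sqrt(var_lower):.1f} to {sqrt(var_upper):.1f}")
```

Note that the larger chi-square value produces the lower limit, because it sits in the denominator.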
Finite Population Corrections
Estimating Population Mean
If the sample size is more than 5% of the population size (sampling without replacement), then a finite population correction factor must be applied when calculating standard error.
Finite Population Correction Factor
Apply the finite population correction factor when estimating the population variance.
Estimating Population Total
A simple random sample of size n from a population of size N.
Point estimate for total = $N\bar{x}$.
Confidence Interval for Population Total
Example
A firm has a population of 1000 accounts and wishes to estimate the total balance.
Sample of 80 accounts with average balance of $87.60 and standard deviation of $22.30.
Find the 95% confidence interval estimate of the total balance.
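A possible way to work the accounts example is sketched below. It rests on assumptions that the surviving slide text does not spell out: the variance of the sample mean is estimated as (s²/n)·(N - n)/(N - 1), the interval is N·x̄ ± t·N·(estimated standard error of the mean), and the t value for 79 degrees of freedom (1.9905) is taken from a table. The function name `total_interval` is hypothetical.

```python
from math import sqrt

def total_interval(N, n, xbar, s, t_value):
    """Confidence interval for a population total, sampling without replacement.

    Assumes the finite population correction factor (N - n) / (N - 1).
    """
    var_mean = (s ** 2 / n) * (N - n) / (N - 1)  # corrected variance of the mean
    se_total = N * sqrt(var_mean)                # standard error of N * xbar
    point = N * xbar                             # point estimate of the total
    margin = t_value * se_total
    return point - margin, point + margin

T_79_0025 = 1.9905   # from a t table: d.f. = 79, upper-tail area 0.025

lower, upper = total_interval(1000, 80, 87.60, 22.30, T_79_0025)
print(f"{lower:.2f} < total < {upper:.2f}")
```

Under these assumptions the total balance is estimated at $87,600, with limits of roughly $82,837 and $92,362.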
Required Sample Size
Sample Size Determination: Large Populations
To determine the required sample size for the mean, you must know:
Desired level of confidence (1 - α), which determines the $z_{\alpha/2}$ value
Acceptable margin of error (sampling error), ME
Population standard deviation, σ
Sample Size Example
If σ = 45, what sample size is needed to estimate the mean within ±5 with 90% confidence?
$n = \frac{z_{\alpha/2}^2 \sigma^2}{ME^2} = \frac{(1.645)^2 (45)^2}{5^2} = 219.19$
Always round up, so the required sample size is n = 220.
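The same calculation in code, as a minimal sketch assuming the large-population formula n = z²σ²/ME² rounded up to the next whole number. The function name `sample_size_for_mean` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin_of_error, level):
    """Smallest n that keeps the margin of error within the target (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ceil((z * sigma / margin_of_error) ** 2)   # always round up

n_required = sample_size_for_mean(sigma=45, margin_of_error=5, level=0.90)
print(n_required)   # prints: 220
```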
Sample Size Determination: Population Proportion
How to Calculate
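P is unknown before sampling, but P(1 - P) cannot exceed 0.25 (its value when P = 0.5), so substituting 0.25 gives a sample size large enough for any P: $n = \frac{0.25 z_{\alpha/2}^2}{ME^2}$
To determine the required sample size for a proportion, you must know:
Desired level of confidence (1 - α), which determines the $z_{\alpha/2}$ value
Acceptable margin of error (ME)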
Example: Sample Size for Proportion
How large a sample would be necessary to estimate the true proportion defective within ±3%, with 95% confidence?
Required Sample Size Solution
Use $z_{0.025} = 1.96$ and ME = 0.03: $n = \frac{0.25 (1.96)^2}{(0.03)^2} = 1067.11$
Result: Round up and use n = 1068.
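A short sketch of this calculation follows, assuming the conservative choice P(1 - P) = 0.25. The function name `sample_size_for_proportion` and its optional `p` parameter are hypothetical additions for illustration.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin_of_error, level, p=0.5):
    """Required n for a proportion; p = 0.5 gives the conservative worst case."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return ceil(p * (1 - p) * z ** 2 / margin_of_error ** 2)   # round up

n_required = sample_size_for_proportion(margin_of_error=0.03, level=0.95)
print(n_required)   # prints: 1068
```

If a prior estimate of P is available, passing it as `p` would give a smaller required sample size than the worst case.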
Finite Populations Sample Size Determination
A finite population correction factor is added.
Calculate the required sample size n0 first, then adjust accordingly.
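The adjustment step can be sketched as follows. The formula used here, n = n0·N / (n0 + N - 1), is an assumption: it is what results from solving the margin-of-error equation with the correction factor (N - n)/(N - 1) for n, and the slide text itself does not show it. The population size of 5000 and the function name `adjust_for_finite_population` are hypothetical.

```python
from math import ceil

def adjust_for_finite_population(n0, N):
    """Shrink the large-population sample size n0 for a population of size N.

    Assumed formula: n = n0 * N / (n0 + N - 1), rounded up.
    """
    return ceil(n0 * N / (n0 + (N - 1)))

# Hypothetical case: n0 = 1068 from the large-population formula, N = 5000
n_adjusted = adjust_for_finite_population(1068, 5000)
print(n_adjusted)   # prints: 881
```

The adjusted size never exceeds n0, and the adjustment becomes negligible as N grows large relative to n0.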
Chapter Summary
Introduced the concept of confidence intervals.
Discussed point estimates.
Developed confidence interval estimates for means (σ² known).
Introduced the Student’s t distribution.
Determined confidence interval estimates for means (σ² unknown).
Created confidence interval estimates for proportions.
Created confidence interval estimates for the variance of a normal population.
Applied the finite population correction factor to form confidence intervals when the sample size is not small relative to the population size.
Determined required sample size to meet confidence and margin of error requirements.
All rights reserved - No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means without permission.
This note covers confidence interval estimation for one population. Key goals include distinguishing point estimates from confidence intervals, constructing and interpreting intervals for population means (using Z and t distributions), proportions, variances, and determining required sample sizes. Confidence intervals provide ranges indicating uncertainty about population parameters, where the confidence level represents the proportion of intervals containing the true parameter across repeated samples. Various confidence levels (80%, 90%, 95%, 98%, 99%, and 99.9%) are discussed alongside common formulas for constructing intervals. It also addresses different estimation methods and assumptions related to population distribution and sample sizes, including finite population corrections and the use of the chi-square distribution for variance. The note culminates in a summary of methods to determine required sample sizes for estimating means and proportions with specified confidence levels.