Confidence Interval Estimation in Statistics

Basic Statistics for Business Research Methods

Confidence Interval Estimation: One Population

Copyright © 2013 Pearson Education

Goals of the Session

  • After completing this session, you should be able to:

    • Distinguish between a point estimate and a confidence interval estimate

    • Construct and interpret a confidence interval estimate for a single population mean using both the Z and t distributions

    • Form and interpret a confidence interval estimate for a single population proportion

    • Create confidence interval estimates for the variance of a normal population

    • Determine the required sample size to estimate a mean or proportion within a specified margin of error

Contents of the Chapter

  • Confidence Intervals for the Population Mean, μ

    • when Population Variance σ² is Known

    • when Population Variance σ² is Unknown

  • Confidence Intervals for the Population Proportion, P (large samples)

  • Confidence interval estimates for the variance of a normal population

  • Finite population corrections

  • Sample-size determination

Properties of Point Estimators

  • An estimator of a population parameter is:

    • a random variable that depends on sample information

    • whose value provides an approximation to this unknown parameter

  • A specific value of that random variable is called an estimate.

Point and Interval Estimates

  • A point estimate is a single number

  • A confidence interval provides additional information about variability

    • Lower Confidence Limit < Point Estimate < Upper Confidence Limit

    • The width of the confidence interval is the distance between the lower and upper confidence limits

Point Estimates

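  • We can estimate a population parameter with a sample statistic (a point estimate):

    • Mean: the population parameter μ is estimated by the sample mean $\bar{x}$

    • Proportion: the population parameter P is estimated by the sample proportion $\hat{p}$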

Unbiasedness

  • A point estimator $\hat{\theta}$ is said to be an unbiased estimator of the parameter θ if its expected value is equal to that parameter: $E(\hat{\theta}) = \theta$.

    • Examples:

    • The sample mean is an unbiased estimator of μ

    • The sample variance s² is an unbiased estimator of σ²

    • The sample proportion is an unbiased estimator of P

Bias

  • Let $\hat{\theta}$ be an estimator of θ.

    • The bias in $\hat{\theta}$ is defined as the difference between its mean and θ: $Bias(\hat{\theta}) = E(\hat{\theta}) - \theta$

    • The bias of an unbiased estimator is 0.
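The definition of bias can be checked by brute force on a tiny population. The sketch below is illustrative only: the population {1, 2, 3}, the sample size of 2, and sampling with replacement are assumptions chosen so that every possible sample can be listed. It compares the sample variance with divisor n − 1 against the version with divisor n.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy population, chosen only so every sample can be enumerated.
population = [1, 2, 3]
mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)  # population variance

def sample_variance(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    mean = Fraction(sum(sample), len(sample))
    return sum((x - mean) ** 2 for x in sample) / divisor

n = 2
samples = list(product(population, repeat=n))  # all equally likely samples (with replacement)

def expected_value(divisor):
    """Average of the estimator over every possible sample."""
    return sum(sample_variance(s, divisor) for s in samples) / len(samples)

bias_unbiased = expected_value(n - 1) - sigma2  # divisor n - 1: the usual s^2
bias_naive = expected_value(n) - sigma2         # divisor n

print(bias_unbiased)  # 0
print(bias_naive)     # -1/3
```

Exact fractions are used so that the zero bias is not blurred by floating-point rounding.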

Most Efficient Estimator

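  • Suppose there are several unbiased estimators of θ:

    • The most efficient estimator or the minimum variance unbiased estimator of θ is the unbiased estimator with the smallest variance.

    • Let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two unbiased estimators of θ, based on the same number of sample observations.

    • $\hat{\theta}_1$ is said to be more efficient than $\hat{\theta}_2$ if $Var(\hat{\theta}_1) < Var(\hat{\theta}_2)$

    • The relative efficiency of $\hat{\theta}_1$ with respect to $\hat{\theta}_2$ is the ratio of their variances, $\frac{Var(\hat{\theta}_2)}{Var(\hat{\theta}_1)}$, so a value greater than 1 means $\hat{\theta}_1$ is the more efficient estimator.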

Confidence Interval Estimation

  • How much uncertainty is associated with a point estimate of a population parameter?

    • An interval estimate provides more information about a population characteristic than does a point estimate

    • Such interval estimates are called confidence interval estimates.

Confidence Interval Estimate

  • An interval gives a range of values:

    • Takes into consideration variation in sample statistics from sample to sample

    • Based on observations from one sample

    • Gives information about closeness to unknown population parameters

    • Stated in terms of level of confidence

    • Can never be 100% confident

Confidence Interval and Confidence Level

  • If P(a < θ < b) = 1 − α, then the interval from a to b is called a 100(1 − α)% confidence interval of θ.

    • The quantity 100(1 − α)% is called the confidence level of the interval.

    • α is between 0 and 1.

    • In repeated samples of the population, the true value of the parameter θ would be contained in 100(1 − α)% of intervals calculated this way.

    • The confidence interval calculated in this manner is written as a < θ < b with 100(1 − α)% confidence.

Estimation Process

  • When the population mean μ is unknown:

    • Population: the mean μ is unknown, so a random sample is drawn

    • Sample: the sample mean is $\bar{x}$ = 50

    • Confidence statement: I am 95% confident that μ is between 40 and 60.

Confidence Level (1 − α)

  • Suppose the confidence level = 95%

  • Also written as: (1 − α) = 0.95

  • A relative frequency interpretation:

    • From repeated samples of size n, 95% of all the confidence intervals that can be constructed will contain the unknown true parameter.

    • A specific interval either will contain or will not contain the true parameter.

    • No probability is involved in a specific interval.
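The relative frequency interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the original material: the "true" mean of 50, standard deviation of 10, sample size of 25, number of repetitions, and the random seed are all hypothetical choices, and the interval used is the known-σ interval for a normal population.

```python
import random
from statistics import NormalDist, fmean

random.seed(12345)            # fixed seed so the run is repeatable

MU, SIGMA = 50.0, 10.0        # hypothetical "true" population values
N_OBS, N_SAMPLES = 25, 2000   # observations per sample, number of repeated samples
z = NormalDist().inv_cdf(0.975)           # about 1.96 for 95% confidence
half_width = z * SIGMA / N_OBS ** 0.5     # margin of error with sigma known

covered = 0
for _ in range(N_SAMPLES):
    sample = [random.gauss(MU, SIGMA) for _ in range(N_OBS)]
    x_bar = fmean(sample)
    # Each specific interval either contains MU or it does not.
    if x_bar - half_width < MU < x_bar + half_width:
        covered += 1

coverage = covered / N_SAMPLES
print(f"{coverage:.3f}")      # close to 0.95, but generally not exactly 0.95
```

The 95% describes the long-run share of intervals that capture μ, not the probability attached to any single interval.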

General Formula

  • The general formula for all confidence intervals is:

    • Formula: Point Estimate ± Margin of Error

    • The value of the margin of error depends on the desired level of confidence.

Confidence Intervals

  • Population Mean

    • σ² Known

    • σ² Unknown

  • Population Proportion

  • Population Variance (from normally distributed populations)

Confidence Interval Estimation for the Mean (σ² Known)

Assumptions
  • Population variance σ² is known

  • Population is normally distributed

  • If the population is not normal, use a large sample.

Confidence Interval Estimate:
  • $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

    • where $z_{\alpha/2}$ is the standard normal distribution value for a probability of α/2 in each tail.

Confidence Limits

  • The endpoints of the confidence interval are:

    • Upper confidence limit: $UCL = \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

    • Lower confidence limit: $LCL = \bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

Margin of Error

  • The confidence interval can also be represented as:

  • Formula: Confidence Interval = Point Estimate ± ME, where $ME = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

    • The interval width, w, is equal to twice the margin of error:

    • Formula: $w = 2(ME)$

Reducing the Margin of Error

  • The margin of error can be reduced if:

    • The population standard deviation can be reduced (σ ↓)

    • The sample size is increased (n ↑)

    • The confidence level is decreased, (1 – α) ↓

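Finding z_{α/2}

  • Consider a 95% confidence interval:

    • 1 − α = 0.95, so α = 0.05 and α/2 = 0.025 in each tail

    • In Z units, the lower confidence limit is at z = −1.96

    • In Z units, the upper confidence limit is at z = 1.96

    • Lower Confidence Limit < Point Estimate < Upper Confidence Limit

    • Find $z_{0.025} = \pm 1.96$ from the standard normal distribution table.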

Common Levels of Confidence

  • Commonly used confidence levels are 90%, 95%, 98%, and 99%.

    Confidence Level    Confidence Coefficient (1 − α)    z_{α/2} value
    80%                 0.80                              1.28
    90%                 0.90                              1.645
    95%                 0.95                              1.96
    98%                 0.98                              2.33
    99%                 0.99                              2.58
    99.8%               0.998                             3.09
    99.9%               0.999                             3.29
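The z values for any confidence level can also be computed instead of looked up. A minimal sketch using Python's standard library follows; the helper name `z_critical` is made up for this illustration.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Return z_{alpha/2}: the standard normal value leaving alpha/2 in the upper tail."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{level:.1%}  z = {z_critical(level):.3f}")
```

For example, `z_critical(0.95)` is about 1.960 and `z_critical(0.90)` is about 1.645.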

Intervals and Level of Confidence

  • Confidence intervals extend from $\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ to $\bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

    • The percentage of intervals constructed that will contain μ is 100(1 − α)%;

    • 100(α)% do not.

Example:

  • A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms.

  • Population standard deviation is 0.35 ohms.

  • Problem: Determine a 95% confidence interval for the true mean resistance of the population.

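Solution

  • With $\bar{x}$ = 2.20, σ = 0.35, n = 11, and $z_{0.025}$ = 1.96:

    • $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96 \left( \frac{0.35}{\sqrt{11}} \right) = 2.20 \pm 0.2068$

    • $1.9932 < \mu < 2.4068$

Interpretation

  • We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms.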

    • Although the true mean may or may not be within this interval, 95% of intervals formed in this manner will contain the true mean.
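The arithmetic for the circuit example can be reproduced in a few lines. This is only a sketch of the known-σ interval; the variable names are chosen for this illustration.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, sigma = 11, 2.20, 0.35     # values given in the example
z = NormalDist().inv_cdf(0.975)      # z_{0.025}, about 1.96

margin = z * sigma / sqrt(n)         # margin of error
lower, upper = x_bar - margin, x_bar + margin
print(f"{lower:.4f} < mu < {upper:.4f}")   # 1.9932 < mu < 2.4068
```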


Confidence Interval Estimation for the Mean (σ² Unknown)

Student’s t Distribution
  • Consider a random sample of n observations with mean $\bar{x}$ and standard deviation s from a normally distributed population with mean μ.

    • The variable $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$ follows the Student’s t distribution with (n - 1) degrees of freedom.

Characteristics of the Student’s t Distribution

  • The t is a family of distributions.

  • The t value depends on degrees of freedom (d.f.).

  • The number of observations that are free to vary after sample mean has been calculated: d.f. = n - 1

Shape of the t Distribution

  • t distributions are bell-shaped and symmetric but have ‘fatter’ tails than the normal distribution.

    • Note: t → Z as n increases.

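Student’s t Table

         Upper Tail Area
    df   .10      .05      .025
    1    3.078    6.314    12.706
    2    1.886    2.920    4.303
    3    1.638    2.353    3.182

  • The body of the table contains t values, not probabilities. For example, with df = 2 and an upper tail area of .05, t = 2.920.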

t Distribution Values

  • With comparison to the Z value:

    Confidence Level    t (10 d.f.)    t (20 d.f.)    t (30 d.f.)    Z
    .80                 1.372          1.325          1.310          1.282
    .90                 1.812          1.725          1.697          1.645
    .95                 2.228          2.086          2.042          1.960
    .99                 3.169          2.845          2.750          2.576

    Using the t Distribution

    • If the population standard deviation σ is unknown, we can substitute the sample standard deviation, s.

      • This introduces extra uncertainty, since s varies from sample to sample, so we use the t distribution instead of the normal distribution.

    Confidence Interval Estimation for the Mean (σ² Unknown)

    Assumptions
    • Population standard deviation is unknown

    • Population is normally distributed

      • If population is not normal, use a large sample

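      • Use Student’s t distribution. Confidence interval estimate:
        $\bar{x} \pm t_{n-1,\alpha/2} \frac{s}{\sqrt{n}}$, where $t_{n-1,\alpha/2}$ is the t value with n − 1 degrees of freedom and an area of α/2 in each tail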

    Margin of Error

    • The confidence interval can also be expressed as $\bar{x} \pm ME$, where:

    • $ME = t_{n-1,\alpha/2} \frac{s}{\sqrt{n}}$

    Example

    • A random sample of n = 25 has $\bar{x}$ = 50 and s = 8.

    • Problem: Form a 95% confidence interval for μ.
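One way to work this example is sketched below. The Python standard library has no t distribution, so the critical value $t_{24,0.025}$ = 2.0639 is hard-coded from a t table; treat that constant as an input assumption rather than something the code derives.

```python
from math import sqrt

n, x_bar, s = 25, 50.0, 8.0     # values given in the example
t_crit = 2.0639                 # t_{24, 0.025}, copied from a t table (assumption)

margin = t_crit * s / sqrt(n)   # 2.0639 * 8 / 5
lower, upper = x_bar - margin, x_bar + margin
print(f"{lower:.3f} < mu < {upper:.3f}")   # 46.698 < mu < 53.302
```

With n − 1 = 24 degrees of freedom the interval is somewhat wider than the one z = 1.96 would give, reflecting the extra uncertainty from estimating σ with s.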

    Population Proportion Confidence Intervals

    For Population Proportion
    • An interval estimate for the population proportion (P) can be calculated by adding an allowance for uncertainty to the sample proportion ($\hat{p}$).

    Confidence Intervals for the Population Proportion

    • The distribution of the sample proportion is approximately normal if the sample size is large, with standard deviation $\sigma_{\hat{p}} = \sqrt{\frac{P(1-P)}{n}}$, which is estimated from the sample by replacing P with $\hat{p}$.

    • The confidence interval for the population proportion is given by:

      • Formula: $\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

    Example

    • A random sample of 100 people shows that 25 are left-handed.

    • Problem: Form a 95% confidence interval for the true proportion of left-handers.

    Interpretation

    • We are 95% confident that the true proportion of left-handers in the population is between 16.51% and 33.49%.

      • Although the interval may or may not contain the true proportion, 95% of intervals formed from samples of size 100 in this manner will contain the true proportion.
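The left-hander interval can be reproduced with the large-sample formula for a proportion. The sketch below uses only the numbers from the example; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, successes = 100, 25
p_hat = successes / n                        # sample proportion, 0.25
z = NormalDist().inv_cdf(0.975)              # z_{0.025}, about 1.96

margin = z * sqrt(p_hat * (1 - p_hat) / n)   # about 1.96 * 0.0433
lower, upper = p_hat - margin, p_hat + margin
print(f"{lower:.4f} < P < {upper:.4f}")      # 0.1651 < P < 0.3349
```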

    Confidence Interval Estimation for the Variance

    Population Variance
    • Confidence intervals for the population variance are based on the sample variance, s²

    • Assumed: the population is normally distributed

    • Goal: Form a confidence interval for the population variance, σ².

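    Chi-Square Distribution

    • The random variable $\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma^2}$ follows a chi-square distribution with (n – 1) degrees of freedom.

      • The chi-square value $\chi^2_{n-1,\alpha}$ denotes the number for which $P(\chi^2_{n-1} > \chi^2_{n-1,\alpha}) = \alpha$.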

    Example: Speed Testing

    • Testing the speed of computer processors (in MHz).

    • Given: sample size = 17, sample mean = 3004, sample standard deviation = 74.

    • Determine the 95% confidence interval for σ².

    Finding the Chi-Square Values

    • n = 17, so the chi-square distribution has (n – 1) = 16 degrees of freedom.

    • α = 0.05, so use the chi-square values with area 0.025 in each tail: $\chi^2_{16,0.975} = 6.908$ and $\chi^2_{16,0.025} = 28.845$.

    Calculating the Confidence Limits

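    • $\frac{(n-1)s^2}{\chi^2_{n-1,\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{n-1,1-\alpha/2}}$, which here gives $\frac{(16)(74)^2}{28.845} < \sigma^2 < \frac{(16)(74)^2}{6.908}$, or 3037 < σ² < 12683.

    • Converting to standard deviation, we are 95% confident that the population standard deviation of CPU speed is between 55.1 and 112.6 MHz.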
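A short sketch of the calculation behind these limits follows. The chi-square critical values for 16 degrees of freedom (28.845 and 6.908) are hard-coded from a chi-square table, since the standard library does not provide them; treat them as input assumptions.

```python
from math import sqrt

n, s = 17, 74.0       # sample size and sample standard deviation (MHz)
chi2_upper = 28.845   # chi-square, 16 d.f., area 0.025 in the upper tail (table value)
chi2_lower = 6.908    # chi-square, 16 d.f., area 0.975 in the upper tail (table value)

ss = (n - 1) * s ** 2                        # (n - 1) s^2 = 16 * 5476 = 87616
var_lower, var_upper = ss / chi2_upper, ss / chi2_lower
sd_lower, sd_upper = sqrt(var_lower), sqrt(var_upper)

print(f"{var_lower:.0f} < sigma^2 < {var_upper:.0f}")   # 3037 < sigma^2 < 12683
print(f"{sd_lower:.1f} < sigma < {sd_upper:.1f}")       # 55.1 < sigma < 112.6
```

Note that the larger chi-square value produces the lower limit, because it appears in the denominator.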

    Finite Population Corrections

    Estimating Population Mean
    • If the sample size is more than 5% of the population size and sampling is without replacement, then a finite population correction factor must be used when calculating the standard error.

    Finite Population Correction Factor

    • When the sample is large relative to the population, apply the finite population correction factor, $\frac{N-n}{N-1}$, when estimating the variance of the sample mean.

    Estimating Population Total

    • A simple random sample of size n from a population of size N.

    • Point estimate for the population total: $N\bar{x}$.

    Confidence Interval for Population Total

    Example
    • A firm has a population of 1000 accounts and wishes to estimate the total balance.

    • Sample of 80 accounts with average balance of $87.60 and standard deviation of $22.30.

    • Find the 95% confidence interval estimate of the total balance.
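The slides do not show the working for this example, so the following is a sketch under stated assumptions: the correction factor is taken to be (N − n)/(N − 1), the standard error of the total is taken to be N times the corrected standard error of the mean, and the critical value $t_{79,0.025}$ ≈ 1.9905 is hard-coded from statistical software.

```python
from math import sqrt

N, n = 1000, 80                 # population size and sample size
x_bar, s = 87.60, 22.30         # sample mean and standard deviation (dollars)
t_crit = 1.9905                 # t_{79, 0.025}, hard-coded (assumption)

fpc = (N - n) / (N - 1)                    # finite population correction (assumed form)
se_total = N * sqrt(s ** 2 / n * fpc)      # estimated standard error of N * x_bar
total = N * x_bar                          # point estimate of the total balance
margin = t_crit * se_total

print(f"{total - margin:,.2f} < total < {total + margin:,.2f}")
```

Under these assumptions the interval is roughly $87,600 plus or minus about $4,762.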

    Required Sample Size

    Sample Size Determination: Large Populations
    • To determine the required sample size for the mean, you must know:

      • Desired level of confidence (1 − α), which determines the $z_{\alpha/2}$ value

      • Acceptable margin of error (sampling error), ME

      • Population standard deviation, σ

    Sample Size Example

    • If σ = 45, what sample size is needed to estimate the mean within ±5 with 90% confidence?

    • $n = \frac{z_{\alpha/2}^2 \sigma^2}{ME^2} = \frac{(1.645)^2 (45)^2}{5^2} = 219.19$

    • Required sample size is n = 220 (always round up).
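The sample-size calculation for a mean can be wrapped in a small helper. This is a minimal sketch; the function name `sample_size_for_mean` and its parameter names are made up for this illustration.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin_of_error, confidence_level):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin_of_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return ceil((z * sigma / margin_of_error) ** 2)   # always round up

n_required = sample_size_for_mean(sigma=45, margin_of_error=5, confidence_level=0.90)
print(n_required)   # 220
```

Rounding up rather than to the nearest integer guarantees that the margin of error is no larger than the target.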

    Sample Size Determination: Population Proportion

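    How to Calculate
    • To determine the required sample size for a proportion, you must know:

      • Desired level of confidence (1 − α), which determines the $z_{\alpha/2}$ value

      • Acceptable margin of error (sampling error), ME

    • Since P is unknown before sampling, set P(1 – P) = 0.25, its largest possible value (attained at P = 0.5), so that the margin of error is met at the desired confidence level for any P.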

    Example: Sample Size for Proportion

    • How large a sample would be necessary to estimate the true proportion defective within ±3%, with 95% confidence?

    Required Sample Size Solution

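    • Use $z_{0.025}$ = 1.96 and ME = 0.03, with P(1 – P) set to 0.25:

    • $n = \frac{0.25 z_{\alpha/2}^2}{ME^2} = \frac{(0.25)(1.96)^2}{(0.03)^2} = 1067.11$

    • Result: Use n = 1068.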
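The same calculation in code, as a sketch: the function name and the default of p = 0.5 (the conservative choice that makes P(1 − P) equal to 0.25) are assumptions of this illustration.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin_of_error, confidence_level, p=0.5):
    """Smallest n with z_{alpha/2} * sqrt(p(1-p)/n) <= margin_of_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)   # always round up

n_required = sample_size_for_proportion(margin_of_error=0.03, confidence_level=0.95)
print(n_required)   # 1068
```

If a credible prior estimate of P is available, passing it as `p` gives a smaller required sample, at the cost of relying on that estimate.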

    Finite Populations Sample Size Determination

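    • When the sample is a sizable fraction of a finite population, a finite population correction factor is applied to the required sample size.

      • First calculate the required sample size n₀ as for a large population, then adjust it for the population size N.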
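The slides do not state the adjustment formula here. A commonly used form, assumed in this sketch, is n = n₀N / (n₀ + N − 1). The population size of 5000 below is a hypothetical value chosen only to illustrate the effect.

```python
from math import ceil

def adjust_for_finite_population(n0, population_size):
    """Adjust a large-population sample size n0 for a finite population.

    Uses n = n0 * N / (n0 + N - 1), an assumed standard form of the adjustment.
    """
    return ceil(n0 * population_size / (n0 + population_size - 1))

n0 = 1068                                             # large-population sample size
adjusted = adjust_for_finite_population(n0, 5000)     # hypothetical N = 5000
print(adjusted)   # 881
```

As N grows, the adjusted value approaches n₀, which is why the correction can be ignored for large populations.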

    Chapter Summary

    • Introduced the concept of confidence intervals.

    • Discussed point estimates.

    • Developed confidence interval estimates for means (σ² known).

    • Introduced the Student’s t distribution.

    • Determined confidence interval estimates for means (σ² unknown).

    • Created confidence interval estimates for proportions.

    • Created confidence interval estimates for the variance of a normal population.

    • Applied the finite population correction factor to form confidence intervals when the sample size is not small relative to the population size.

    • Determined required sample size to meet confidence and margin of error requirements.


This note covers confidence interval estimation for one population. Key goals include distinguishing point estimates from confidence intervals, constructing and interpreting intervals for population means (using Z and t distributions), proportions, variances, and determining required sample sizes. Confidence intervals provide ranges indicating uncertainty about population parameters, where the confidence level represents the proportion of intervals containing the true parameter across repeated samples. Various confidence levels (80%, 90%, 95%, 98%, 99%, and 99.9%) are discussed alongside common formulas for constructing intervals. It also addresses different estimation methods and assumptions related to population distribution and sample sizes, including finite population corrections and the use of the chi-square distribution for variance. The note culminates in a summary of methods to determine required sample sizes for estimating means and proportions with specified confidence levels.