Estimation of the Mean and Proportion

Inferential Statistics

  • Making statements about a population by examining sample results.
  • Uses sample statistics to estimate population parameters.
    • Sample: Known.
    • Population: Unknown, but can be estimated from sample evidence.

Estimation vs. Hypothesis Testing

  • Estimation: Estimate the population mean weight using the sample mean weight.
  • Hypothesis Testing: Use sample evidence to test the claim that the population mean weight is 120 pounds.
  • Both involve drawing conclusions and/or making decisions concerning a population based on sample results.

Estimation

  • The assignment of value(s) to a population parameter based on the value of the corresponding sample statistic.
  • Estimator: A random variable that depends on sample data.
  • Estimate: A specific value of that random variable.

Point and Interval Estimates

  • Point Estimate: A single number calculated from sample data.
  • Interval Estimate (Confidence Interval): An interval that is constructed around the point estimate, and it is stated that this interval is likely to contain the corresponding population parameter. A confidence interval provides additional information about variability.
    • Includes a lower confidence limit and an upper confidence limit.
    • Width of the confidence interval indicates variability.

Point Estimators and Point Estimates

  • Estimating Population Parameters with Sample Statistics:
    • Mean: Sample mean \bar{x} estimates population mean \mu.
    • Proportion: Sample proportion \hat{p} estimates population proportion p.

Estimation Procedure

  1. Select a sample.
  2. Collect the required information from the elements of the sample.
  3. Calculate the value of the sample statistic.
  4. Assign value(s) to the corresponding population parameter.

Point Estimation: Properties

  • Sampling Distribution of the Sample Mean includes:
    • Standard Error of the Mean:
      \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}

Biased and Unbiased Estimators

  • Unbiased Estimator: An estimator whose expected value is equal to the value of the corresponding population parameter.
  • If \bar{X} is a point estimator of a mean, the sampling distribution of the mean has the same mean as the population from which the sample is obtained.
  • We expect that the means of repeated random samples from a given population will be centered on the mean of this population, or \bar{X} is an unbiased estimator of \mu.
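
A small simulation can make the unbiasedness claim concrete. The sketch below assumes a hypothetical normal population (mean 50, standard deviation 10) and samples of size 25; these values, the seed, and all variable names are illustrative assumptions, not part of the slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population parameters and sample size (assumptions for illustration)
MU, SIGMA, N, REPS = 50.0, 10.0, 25, 10_000

# Draw REPS random samples of size N and record each sample mean
sample_means = [
    statistics.fmean([random.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(REPS)
]

mean_of_means = statistics.fmean(sample_means)  # should land close to MU
sd_of_means = statistics.pstdev(sample_means)   # should land close to sigma / sqrt(n)
theoretical_se = SIGMA / N ** 0.5               # standard error of the mean

print(f"mean of sample means: {mean_of_means:.3f} (population mean {MU})")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {theoretical_se})")
```

The sample means should center on the population mean, and their spread should sit near \sigma/\sqrt{n}, which is the standard error of the mean.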

Confidence Intervals

  • 95% Confidence Intervals: P(-z \le Z \le z) = 0.95 when z = 1.96
  • For \mu (σ known):
    • Consider the standard normal random variable Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}.
    • P\left(\bar{X} - 1.96 \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96 \frac{\sigma}{\sqrt{n}}\right) = 0.95
    • \bar{X} \pm 1.96 \frac{\sigma}{\sqrt{n}}

Estimate the Error of a Population Mean (σ known)

  • Margin of error (maximum error) of estimate for \mu is E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}.
  • We can assert with a probability (1 - \alpha) that |\bar{X} - \mu| \le z_{\alpha/2} \frac{\sigma}{\sqrt{n}}.

Interval Estimation of a Population Mean (σ known)

  • The general form for all confidence intervals: Point Estimate \pm Margin of Error
  • Confidence interval for \mu: \bar{X} - E \le \mu \le \bar{X} + E
  • Confidence interval for \mu, when σ is known: \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}
  • Each interval is constructed with regard to a given confidence level and is called a confidence interval.
  • The confidence level is denoted by (1 - \alpha)100\%.
  • \alpha – significance level (probability of error).

Finding z_{\alpha/2}

  • Consider a 95% confidence interval:
    • 1 - \alpha = .95
    • \alpha / 2 = .025
  • Find z_{.025} = 1.96 from the standard normal distribution table (the interval uses the critical values \pm 1.96).

Constructing a Confidence Interval

  • Common values for confidence levels:
    • 90%, \alpha = 0.10, \alpha / 2 = 0.05, z_{\alpha/2} = z_{.05} = 1.645
    • 95%, \alpha = 0.05, \alpha / 2 = 0.025, z_{\alpha/2} = z_{.025} = 1.96
    • 99%, \alpha = 0.01, \alpha / 2 = 0.005, z_{\alpha/2} = z_{.005} = 2.575
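
The table values above can be reproduced without a printed table. This is a minimal sketch using Python's standard library; the helper name `z_critical` is an assumption made for illustration.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Return z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence_level
    # z_{alpha/2} leaves an area of alpha/2 in the upper tail,
    # so it is the (1 - alpha/2) quantile of the standard normal distribution.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

The computed 99% value is roughly 2.576; printed tables commonly round it to 2.575 or 2.58.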

Intervals and Level of Confidence

  • 100(1-\alpha)\% of intervals constructed contain \mu; 100(\alpha)\% do not.
  • \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}

Interpreting a Confidence Interval

  • Technically, if a large number of samples of size n are drawn from a given population and an interval is constructed from each, then about (1 - \alpha)100\% of those intervals will contain \mu.
  • Informally, we can report with (1 - \alpha)100\% confidence that \mu lies in the given interval.
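
The repeated-sampling interpretation can be checked by simulation. The sketch below assumes a hypothetical normal population with mean 200 and σ = 60, samples of size 36, and 95% intervals; the seed and the number of repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

MU, SIGMA, N = 200.0, 60.0, 36  # hypothetical population and sample size
Z = 1.96                         # z_{alpha/2} for 95% confidence
REPS = 5_000

# With sigma known, the margin of error is the same for every sample
margin = Z * SIGMA / N ** 0.5

hits = 0
for _ in range(REPS):
    x_bar = statistics.fmean([random.gauss(MU, SIGMA) for _ in range(N)])
    # Count the interval as a hit when it contains the true mean
    if x_bar - margin <= MU <= x_bar + margin:
        hits += 1

coverage = hits / REPS
print(f"fraction of intervals containing mu: {coverage:.3f}")
```

The fraction should come out near 0.95. Any single interval either contains \mu or it does not; the 95% describes the procedure across many samples.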

Confidence Level and the Width of Confidence Interval

  • Width of a Confidence Interval = 2(Margin of Error) = 2(z_{\alpha/2} \frac{\sigma}{\sqrt{n}})
  • The width of the confidence interval depends on:
    • Sample size n
    • Population standard deviation σ
    • Confidence level (1 - \alpha)100\%.
  • To decrease the width of a confidence interval:
    1. Increase the sample size.
    2. Lower the confidence level.
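
The effect of sample size and confidence level on width can be seen numerically. This sketch assumes a hypothetical σ = 60; the function name `ci_width` is illustrative.

```python
from statistics import NormalDist

def ci_width(sigma, n, confidence_level):
    """Width of the z-based interval: 2 * z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1.0 - confidence_level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return 2.0 * z * sigma / n ** 0.5

SIGMA = 60.0  # hypothetical population standard deviation

base = ci_width(SIGMA, 36, 0.95)
bigger_sample = ci_width(SIGMA, 144, 0.95)    # four times the sample size
lower_confidence = ci_width(SIGMA, 36, 0.90)  # same n, lower confidence level

print(f"n = 36,  95%: width = {base:.2f}")
print(f"n = 144, 95%: width = {bigger_sample:.2f}")
print(f"n = 36,  90%: width = {lower_confidence:.2f}")
```

Because the width is proportional to 1/\sqrt{n}, quadrupling the sample size halves the width.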

Problem Example

  • A publishing company wants to know the average price of college textbooks.
  • Sample: 36 textbooks, mean price of $200.
  • Population standard deviation: $60.
  • a) Point estimate of the mean price: $200
  • b) 90% confidence interval: 200 \pm 1.645 (60/\sqrt{36}) = 200 \pm 16.45, or $183.55 to $216.45
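
The arithmetic for the textbook example can be checked with a few lines of Python. The numbers come from the problem statement; the variable names are illustrative.

```python
import math

n = 36         # sample size
x_bar = 200.0  # sample mean price in dollars
sigma = 60.0   # population standard deviation in dollars
z = 1.645      # z_{.05} for 90% confidence, taken from the table of common values

margin = z * sigma / math.sqrt(n)  # margin of error E
lower, upper = x_bar - margin, x_bar + margin

print(f"margin of error: {margin:.2f}")
print(f"90% confidence interval: ({lower:.2f}, {upper:.2f})")
```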

Interval Estimation of a Population Mean (σ unknown)

  • σ known:
    • Normal distribution provides an excellent approximation for n ≥ 30. If the sample size is small, the normal distribution can still be used if the population is normally distributed.
  • σ unknown:
    • Use the sample standard deviation, s, as an estimator of σ; the normal distribution is then replaced by the t-distribution.
  • Conditions for using t-distribution:
    1. The population is approximately normally distributed.
    2. The population standard deviation is unknown.

Student’s t-Distribution

  • Consider a random sample of n observations, with mean \bar{x} and standard deviation s, from a normally distributed population with mean μ.
  • Then the variable t follows the Student’s t distribution with (n - 1) degrees of freedom (its parameter):
    • t = \frac{\bar{x} - \mu}{s/\sqrt{n}}

Student’s t-Distribution: Degrees of Freedom

  • The t-value depends on degrees of freedom (d.f.)
    • d.f. = n - 1

Student’s t Distribution

  • t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal
  • t → Z as n increases

Student’s t-Table

  • The body of the table contains t-values, not probabilities
  • Example:
    • n = 3
    • df = n - 1 = 2
    • \alpha = .10
    • \alpha/2 = .05
    • t_{2,.05} = 2.920
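
When a printed t-table is not at hand, critical values can be approximated numerically. The sketch below uses only the standard library: it integrates the t density with Simpson's rule and inverts the result by bisection. The function names, step count, and iteration count are assumptions chosen for illustration, and a dedicated statistics library would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=1000):
    """P(T <= x), using symmetry about 0 and Simpson's rule on [0, x]."""
    if x < 0:
        return 1.0 - t_cdf(-x, df, steps)
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(upper_tail_area, df):
    """Value t with the given area to its right, found by bisection."""
    target = 1.0 - upper_tail_area
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # widen the bracket until it contains the answer
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(f"df = 2, alpha/2 = .05: t = {t_critical(0.05, 2):.3f}")
for df in (2, 24, 1000):
    print(f"df = {df}, alpha/2 = .025: t = {t_critical(0.025, df):.3f}")
```

As the degrees of freedom grow, the value for \alpha/2 = .025 approaches the normal value 1.96, which illustrates t → Z.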

Interval Estimation of a Population Mean (σ unknown)

  • General form for all confidence intervals: Point Estimate ± Margin of Error
  • Confidence interval for μ:
    \bar{X} - E \le \mu \le \bar{X} + E
  • Confidence interval for μ, when σ is unknown: \bar{x} \pm t_{n-1,\alpha/2} \frac{s}{\sqrt{n}}
    • where t_{n-1,\alpha/2} is the critical value of the t-distribution with n - 1 d.f. and an area of \alpha/2 in each tail.

Problem

  • Dr. Moor wants to estimate the mean cholesterol level for all adult males in Hartford.
  • Sample: 25 adult males, mean cholesterol level is 186 with a standard deviation of 10.
  • Assume cholesterol levels are approximately normally distributed.
  • 95% confidence interval: 186 \pm 2.064 (10/\sqrt{25}) = 186 \pm 4.128, or 181.872 to 190.128, where 2.064 = t_{24,.025}
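
The cholesterol example can be verified the same way. The critical value 2.064 is the table value for 24 degrees of freedom used in the problem; the variable names are illustrative.

```python
import math

n = 25         # sample size
x_bar = 186.0  # sample mean cholesterol level
s = 10.0       # sample standard deviation
t = 2.064      # t_{24,.025} from the t-table (df = n - 1 = 24)

margin = t * s / math.sqrt(n)  # margin of error E
lower, upper = x_bar - margin, x_bar + margin

print(f"margin of error: {margin:.3f}")
print(f"95% confidence interval: ({lower:.3f}, {upper:.3f})")
```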

Interval Estimation of a Proportion: Large Sample

  • The n trials have to satisfy the assumptions underlying the binomial distribution.
  • The distribution of the sample proportion is approximately normal if the sample size is large (np ≥ 5 and nq ≥ 5, where q = 1 - p), with standard deviation:
    • \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}
  • We will estimate this with sample data:
    \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}

Confidence Interval Endpoints

  • The confidence interval for the population proportion is given by:
    • \hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
    • \hat{p} - E \le p \le \hat{p} + E

Problem

  • A random sample of 100 people shows that 25 are left-handed. Construct a 95% confidence interval for the true proportion of left-handers.

Solution and Interpretation

  • \hat{p} = 25/100 = 0.25
  • 0.25 \pm 1.96 \sqrt{\frac{.25(.75)}{100}} = 0.25 \pm 0.0849
  • 0.1651 < p < 0.3349
  • We are 95% confident that the true proportion of left-handers in the population is between 16.51% and 33.49%.
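
The left-handedness calculation can be reproduced as follows. The sketch checks the large-sample condition using \hat{p} in place of the unknown p, which is an assumption made here for illustration; the variable names are also illustrative.

```python
import math

n = 100         # sample size
successes = 25  # left-handed people in the sample
z = 1.96        # z_{.025} for 95% confidence

p_hat = successes / n  # point estimate of p

# Large-sample check, with p_hat standing in for the unknown p (assumption)
large_enough = n * p_hat >= 5 and n * (1 - p_hat) >= 5

std_error = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error
margin = z * std_error                          # margin of error E
lower, upper = p_hat - margin, p_hat + margin

print(f"large-sample condition met: {large_enough}")
print(f"95% confidence interval: ({lower:.4f}, {upper:.4f})")
```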

Confidence Interval Estimation

  • Population Mean
    • σ Known
    • σ Unknown
  • Population Proportion

Confidence Intervals: Point estimate ± Margin of error

  • Confidence Interval of the Population Mean When σ Is Known:
    \bar{x} \pm z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}}

  • Confidence Interval of the Population Mean When σ Is Unknown:
    \bar{x} \pm t_{\frac{\alpha}{2},df} \frac{s}{\sqrt{n}}

  • Confidence Interval of the Population Proportion:
    \hat{p} \pm z_{\frac{\alpha}{2}} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}

Confidence Interval Estimate – Example Using Excel

  • A software company tracks the time its customer service staff spends helping customers solve issues with the software and has determined that service time is normally distributed. Recently, managers selected a random sample of n = 25 calls and wish to use the data to develop a 95 percent confidence interval estimate for the population mean service time.
  • The sample data are: Call Time in Minutes

How to Do It in Excel?

  1. Open file.
  2. Select Data tab.
  3. Select Data Analysis > Descriptive Statistics category.
  4. Specify data range.
  5. Define Output Location.
  6. Check Summary Statistics.
  7. Check Confidence Level for Mean: 95%.
  8. Click OK.
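
The same kind of interval can be computed from raw data outside Excel. The sketch below is an illustration only: the 25 call times are not reproduced here, so it runs on a hypothetical placeholder list of five values, and the critical value 2.776 is the t-table entry for 4 degrees of freedom and \alpha/2 = .025. For the n = 25 sample, the value for 24 degrees of freedom (2.064) would be used instead. The function name is an assumption.

```python
import math
import statistics

def mean_ci_from_data(data, t_critical):
    """Return (sample mean, margin of error) for a t-based interval."""
    n = len(data)
    x_bar = statistics.fmean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    margin = t_critical * s / math.sqrt(n)
    return x_bar, margin

# Hypothetical placeholder values, NOT the call times from the example
placeholder_times = [4.0, 6.0, 5.0, 7.0, 8.0]
T_4_025 = 2.776  # t-table value for df = 4, alpha/2 = .025

x_bar, margin = mean_ci_from_data(placeholder_times, T_4_025)
print(f"sample mean: {x_bar:.3f}")
print(f"95% confidence interval: ({x_bar - margin:.3f}, {x_bar + margin:.3f})")
```

The margin returned here corresponds to the point estimate ± margin of error form; the confidence level figure that Excel's summary output reports appears to be this same margin of error.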