Estimation of the Mean and Proportion

Inferential Statistics

Making statements about a population by examining sample results.
Uses sample statistics to estimate population parameters.
- Sample: Known.
- Population: Unknown, but can be estimated from sample evidence.

Estimation vs. Hypothesis Testing

Estimation: Estimate the population mean weight using the sample mean weight.
Hypothesis Testing: Use sample evidence to test the claim that the population mean weight is 120 pounds.
Drawing conclusions and/or making decisions concerning a population based on sample results.

Estimation

The assignment of value(s) to a population parameter based on the value of the corresponding sample statistic.
Estimator: A random variable that depends on sample data.
Estimate: A specific value of that random variable.

Point and Interval Estimates

Point Estimate: A single number calculated from sample data.
Interval Estimate (Confidence Interval): An interval constructed around the point estimate, together with a statement of how likely such an interval is to contain the corresponding population parameter. A confidence interval provides additional information about variability.
- Includes a lower confidence limit and an upper confidence limit.
- Width of the confidence interval indicates variability.

Point Estimators and Point Estimates

Estimating Population Parameters with Sample Statistics:
- Mean: Sample mean $\bar{x}$ estimates population mean $\mu$.
- Proportion: Sample proportion $\hat{p}$ estimates population proportion $p$.

Estimation Procedure

Select a sample.
Collect the required information from the elements of the sample.
Calculate the value of the sample statistic.
Assign value(s) to the corresponding population parameter.
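The four steps above can be sketched in a few lines of Python. The population of weights, the sample size of 50, and the 130-pound cutoff for the proportion are all hypothetical values chosen for illustration; they do not come from the notes.

```python
import random
from statistics import fmean

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical population of weights in pounds (assumed, for illustration only)
population = [random.gauss(120, 15) for _ in range(5_000)]

# Steps 1 and 2: select a sample and collect the values of its elements
sample = random.sample(population, 50)

# Step 3: calculate the sample statistics
x_bar = fmean(sample)                                # sample mean
p_hat = sum(w > 130 for w in sample) / len(sample)   # sample proportion above 130 lb

# Step 4: assign these values to the population parameters as point estimates
print(f"point estimate of mu = {x_bar:.1f}")
print(f"point estimate of p  = {p_hat:.2f}")
```

A different random sample would give different point estimates, which is why an interval estimate is usually reported alongside them.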

Point Estimation: Properties

Sampling Distribution of the Sample Mean includes:
- Standard Error of the Mean: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

Biased and Unbiased Estimators

Unbiased Estimator: An estimator whose expected value equals the value of the corresponding population parameter.
When $\bar{X}$ is used as a point estimator of $\mu$, its sampling distribution has the same mean as the population from which the sample is drawn: $E(\bar{X}) = \mu$.
We expect the means of repeated random samples from a given population to be centered on the population mean; that is, $\bar{X}$ is an unbiased estimator of $\mu$.
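A small simulation can make the unbiasedness of the sample mean concrete. The sketch below assumes a hypothetical normal population (mean 120, standard deviation 15) and samples of size 30; none of these numbers come from the notes. It also compares the spread of the sample means with the standard error $\sigma/\sqrt{n}$.

```python
import random
from math import sqrt
from statistics import fmean, pstdev, stdev

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population (assumed values, for illustration only)
population = [random.gauss(120, 15) for _ in range(10_000)]
mu = fmean(population)
sigma = pstdev(population)

# Draw many random samples of size n and record each sample mean
n = 30
sample_means = [fmean(random.sample(population, n)) for _ in range(2_000)]

mean_of_means = fmean(sample_means)      # should sit close to mu
spread_of_means = stdev(sample_means)    # should sit close to sigma / sqrt(n)

print(f"population mean        = {mu:.2f}")
print(f"mean of sample means   = {mean_of_means:.2f}")
print(f"sigma / sqrt(n)        = {sigma / sqrt(n):.2f}")
print(f"spread of sample means = {spread_of_means:.2f}")
```

Individual sample means miss $\mu$ in both directions, but their average lands very close to it, which is what unbiasedness means in practice.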

Confidence Intervals

95% Confidence Intervals: $P(-z \le Z \le z) = 0.95$ holds for $z = 1.96$.
For μ (σ known):
- Consider the standard normal random variable $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$.
- $P\left(\bar{X} - 1.96 \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96 \frac{\sigma}{\sqrt{n}}\right) = 0.95$
- The 95% confidence interval is $\bar{X} \pm 1.96 \frac{\sigma}{\sqrt{n}}$

Estimate the Error of a Population Mean (σ known)

Margin of error (maximum error) of estimate for $\mu$ is $E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$.
We can assert with probability $(1 - \alpha)$ that $|\bar{X} - \mu| \le z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$.
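The margin of error formula translates directly into code. The sketch below uses `statistics.NormalDist` from the Python standard library to look up $z_{\alpha/2}$; the function name `margin_of_error` and the example inputs (σ = 12, n = 64) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, confidence=0.95):
    """E = z_{alpha/2} * sigma / sqrt(n), for a known population sigma."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper-tail critical value
    return z * sigma / sqrt(n)

# Hypothetical inputs, for illustration only
E = margin_of_error(sigma=12, n=64, confidence=0.95)
print(f"E = {E:.2f}")
```

With these inputs the standard error is 12/8 = 1.5, so the margin of error is roughly 1.96 times 1.5.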

Interval Estimation of a Population Mean (σ known)

The general form for all confidence intervals: Point Estimate $\pm$ Margin of Error
Confidence interval for $\mu$ : $\bar{X} - E \le \mu \le \bar{X} + E$
Confidence interval for $\mu$ , when σ is known: $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
Each interval is constructed with regard to a given confidence level and is called a confidence interval.
The confidence level is denoted by $(1 - \alpha)100\%$.
$\alpha$ – significance level (the probability that an interval constructed this way does not contain $\mu$).

Finding $z_{\alpha/2}$

Consider a 95% confidence interval:
- $1 - \alpha = .95$
- $\alpha / 2 = .025$
Find $z_{.025} = 1.96$ from the standard normal distribution table; the interval then uses $\pm 1.96$.

Constructing a Confidence Interval

Common values for confidence levels:
- 90%, $\alpha = 0.10$, $\alpha / 2 = 0.05$, $z_{\alpha / 2} = z_{.05} = 1.645$
- 95%, $\alpha = 0.05$, $\alpha / 2 = 0.025$, $z_{\alpha / 2} = z_{.025} = 1.96$
- 99%, $\alpha = 0.01$, $\alpha / 2 = 0.005$, $z_{\alpha / 2} = z_{.005} = 2.575$
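The table values above can be reproduced without a printed table. This sketch uses the inverse CDF of the standard normal distribution from Python's standard library; the helper name `z_critical` is an illustrative choice.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2}: the value with area alpha/2 in the upper tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

The computed 99% value is about 2.576; printed tables often round it to 2.575 or 2.58, and the small difference rarely matters in practice.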

Intervals and Level of Confidence

$100(1-\alpha)\%$ of intervals constructed contain $\mu$ ; $100(\alpha)\%$ do not.
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$

Interpreting a Confidence Interval

Technically, if a large number of samples of size n are drawn from a given population and an interval is constructed from each, then about $(1 - \alpha)100\%$ of the intervals will contain $\mu$.
Informally, we can report with $(1 - \alpha)100\%$ confidence that $\mu$ lies in the given interval.
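The repeated-sampling interpretation can be checked by simulation. The sketch below assumes a hypothetical normal population with μ = 50 and σ = 8, draws 5,000 samples of size 25, builds a 95% interval from each, and counts how often the interval contains μ. All of these numbers are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population parameters (assumed, for illustration only)
mu, sigma, n = 50.0, 8.0, 25
z = NormalDist().inv_cdf(0.975)   # z_{alpha/2} for a 95% interval
E = z * sigma / sqrt(n)           # margin of error, sigma known

trials = 5_000
hits = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = fmean(sample)
    if x_bar - E <= mu <= x_bar + E:   # does this interval contain mu?
        hits += 1

coverage = hits / trials
print(f"share of intervals containing mu = {coverage:.3f}")
```

The share should come out close to 0.95. Any single interval either contains μ or it does not; the 95% describes the long-run behavior of the procedure.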

Confidence Level and the Width of Confidence Interval

Width of a Confidence Interval = 2(Margin of Error) = $2(z_{\alpha/2} \frac{\sigma}{\sqrt{n}})$
The width of the confidence interval depends on:
- Sample size n
- Population standard deviation σ
- Confidence level $(1 - \alpha)100\%$ .
To decrease the width of a confidence interval:
1. Increase the sample size.
2. Lower the confidence level.
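The effect of sample size and confidence level on the width can be seen numerically. This is a minimal sketch with a hypothetical σ = 10; the function name `ci_width` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sigma, n, confidence):
    """Width of the z-based interval: 2 * z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / sqrt(n)

sigma = 10.0  # hypothetical population standard deviation

# Larger samples give narrower intervals (confidence level fixed at 95%)
for n in (25, 100, 400):
    print(f"n = {n:3d}: width = {ci_width(sigma, n, 0.95):.2f}")

# Lower confidence levels give narrower intervals (n fixed at 25)
for level in (0.99, 0.95, 0.90):
    print(f"{level:.0%}: width = {ci_width(sigma, 25, level):.2f}")
```

Because n enters through $\sqrt{n}$, the sample size has to be quadrupled to cut the width in half.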

Problem Example

A publishing company wants to know the average price of college textbooks.
Sample: 36 textbooks, mean price of $200.
Population standard deviation: $60.
a) Point estimate of the mean price: $200
b) 90% confidence interval: $200 \pm 1.645 (60/\sqrt{36})$
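The arithmetic for part (b) can be carried through as follows, using the table value $z_{.05} = 1.645$ and the numbers given in the problem. Variable names are illustrative.

```python
from math import sqrt

x_bar, sigma, n = 200.0, 60.0, 36   # values from the problem statement
z = 1.645                           # table value for 90% confidence

E = z * sigma / sqrt(n)             # margin of error: 1.645 * 60 / 6
lower, upper = x_bar - E, x_bar + E

print(f"E = {E:.2f}, interval = ({lower:.2f}, {upper:.2f})")
# prints: E = 16.45, interval = (183.55, 216.45)
```

So the 90% confidence interval for the mean textbook price runs from about $183.55 to $216.45.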

Interval Estimation of a Population Mean (σ unknown)

σ known:
- Normal distribution provides an excellent approximation for n ≥ 30. If the sample size is small, the normal distribution can still be used if the population is normally distributed.
σ unknown:
- Use the sample standard deviation, s, as an estimator of σ; the normal distribution is then replaced by the t-distribution.
Conditions for using t-distribution:
1. The population is approximately normally distributed.
2. The population standard deviation is unknown.

Student’s t-Distribution

Consider a random sample of n observations, with mean $\bar{x}$ and standard deviation s, from a normally distributed population with mean μ.
Then the following variable follows the Student’s t distribution with (n - 1) degrees of freedom:
- $t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$

Student’s t-Distribution: Degrees of Freedom

The t-value depends on degrees of freedom (d.f.)
- d.f. = n - 1

Student’s t Distribution

t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal
t → Z as n increases

Student’s t-Table

The body of the table contains t-values, not probabilities
Example:
- n = 3
- df = n - 1 = 2
- $\alpha = .10$
- $\alpha/2 = .05$
- From the table, $t_{2,\,.05} = 2.920$

Interval Estimation of a Population Mean (σ unknown)

General form for all confidence intervals: Point Estimate ± Margin of Error
Confidence interval for μ: $\bar{X} - E \le \mu \le \bar{X} + E$
Confidence interval for μ, when σ is unknown: $\bar{x} \pm t_{n-1,\alpha/2} \frac{s}{\sqrt{n}}$
- where $t_{n-1,\alpha/2}$ is the critical value of the t-distribution with n - 1 d.f. and an area of α/2 in each tail.

Problem

Dr. Moor wants to estimate the mean cholesterol level for all adult males in Hartford.
Sample: 25 adult males, mean cholesterol level is 186 with a standard deviation of 10.
Assume cholesterol levels are approximately normally distributed.
95% confidence interval: $186 \pm 2.064 (10/\sqrt{25})$
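The interval above can be evaluated numerically. The sketch takes the critical value $t_{24,\,.025} = 2.064$ from the t-table as a hard-coded constant, since the Python standard library has no t-distribution; if SciPy is available, `scipy.stats.t.ppf(0.975, 24)` should return approximately the same value.

```python
from math import sqrt

x_bar, s, n = 186.0, 10.0, 25   # values from the problem statement
df = n - 1                      # 24 degrees of freedom
t_crit = 2.064                  # t-table value for df = 24, alpha/2 = .025

E = t_crit * s / sqrt(n)        # margin of error: 2.064 * 10 / 5
lower, upper = x_bar - E, x_bar + E

print(f"E = {E:.2f}, interval = ({lower:.2f}, {upper:.2f})")
# prints: E = 4.13, interval = (181.87, 190.13)
```

Using z = 1.96 instead of t = 2.064 would give a slightly narrower interval, which understates the uncertainty when σ is estimated from a sample of 25.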

Interval Estimation of a Proportion: Large Sample

The n trials have to satisfy the assumptions underlying the binomial distribution.
The distribution of the sample proportion is approximately normal if the sample size is large (np ≥ 5 and nq ≥ 5, where q = 1 - p), with standard deviation:
- $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$
We will estimate this with sample data:
- $\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

Confidence Interval Endpoints

The confidence interval for the population proportion is given by:
- $\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
- $\hat{p} - E \le p \le \hat{p} + E$

Problem

A random sample of 100 people shows that 25 are left-handed. Construct a 95% confidence interval for the true proportion of left-handers.

Solution and Interpretation

- $0.25 \pm 1.96 \sqrt{\frac{.25(.75)}{100}}$
- 0.1651 < p < 0.3349
We are 95% confident that the true proportion of left-handers in the population is between 16.51% and 33.49%.
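The same calculation in Python, as a minimal sketch. The large-sample check uses $\hat{p}$ in place of the unknown p, which is a common working assumption rather than something stated in the problem.

```python
from math import sqrt
from statistics import NormalDist

n, left_handed = 100, 25          # values from the problem statement
p_hat = left_handed / n           # sample proportion

# Large-sample check, using p_hat as a stand-in for the unknown p
large_enough = n * p_hat >= 5 and n * (1 - p_hat) >= 5

z = NormalDist().inv_cdf(0.975)   # z_{.025}, about 1.96
E = z * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - E, p_hat + E

print(f"large sample: {large_enough}")
print(f"E = {E:.4f}, interval = ({lower:.4f}, {upper:.4f})")
```

The endpoints agree with 0.1651 and 0.3349 to four decimal places.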

Confidence Interval Estimation

Population Mean
- σ Known
- σ Unknown
Population Proportion

Confidence Intervals: Point estimate ± Margin of error

Confidence Interval of the Population Mean When σ Is Known:

$\bar{x} \pm z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}}$

Confidence Interval of the Population Mean When σ Is Unknown:

$\bar{x} \pm t_{\frac{\alpha}{2},df} \frac{s}{\sqrt{n}}$

Confidence Interval of the Population Proportion:

$\hat{p} \pm z_{\frac{\alpha}{2}} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

Confidence Interval Estimate – Example Using Excel

A software company tracks the time its customer service staff spends helping customers solve issues with the software and has determined that service time is normally distributed. Recently, managers selected a random sample of n = 25 calls and wish to use the data to develop a 95 percent confidence interval estimate for the population mean service time.
The sample data are: Call Time in Minutes

How to Do It in Excel?

Open file.
Select Data tab.
Select Data Analysis > Descriptive Statistics category.
Specify data range.
Define Output Location.
Check Summary Statistics.
Check Confidence Level for Mean: 95%.
Click OK.
