Sample statistic: A random variable whose value depends on which population items are included in the random sample.
Sampling distribution: A probability distribution of all possible values of a sample statistic for a given sample size selected from a population.
The sample statistic's ability to represent the population accurately depends on the sample size.
Sampling variation is illustrated using eight random samples of size n=5 from a large population of GMAT scores.
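Sampling variation can be sketched with a short simulation. The population below is a hypothetical stand-in (synthetic scores drawn with an assumed center of 520 and spread of 86), not the GMAT data referred to above; the point is only that eight samples of size 5 give eight different sample means.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical stand-in population: 10,000 synthetic scores, not real GMAT data.
# The center (520) and spread (86) are illustrative assumptions only.
population = [random.gauss(520, 86) for _ in range(10_000)]

# Draw eight random samples of size n = 5 and record each sample mean.
sample_means = [
    statistics.mean(random.sample(population, 5)) for _ in range(8)
]

for i, mean in enumerate(sample_means, start=1):
    print(f"sample {i}: mean = {mean:.1f}")

print(f"population mean = {statistics.mean(population):.1f}")
```

With so small a sample size, the individual sample means can land well away from the population mean, which is the sampling variation the notes describe.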
Estimators and Sampling Distributions
Estimator: A statistic derived from a sample to infer the value of a population parameter.
Estimate: The value of the estimator in a particular sample.
Population parameters are represented by Greek letters, while corresponding statistics are represented by Roman letters.
Sample mean ($\bar{x}$) is the estimator for the population mean (μ).
Sample proportion (p) is the estimator for the population proportion (π).
Sample standard deviation (s) is the estimator for the population standard deviation (σ).
Sampling error: The difference between an estimate and the corresponding population parameter. For example:
Sampling error = $\bar{x} - \mu$
Bias: The difference between the expected value of the estimator and the true parameter.
Bias = $E(\bar{X}) - \mu$
An estimator is unbiased if its expected value is the parameter being estimated.
The sample mean is an unbiased estimator of the population mean since $E(\bar{X}) = \mu$. On average, an unbiased estimator neither overstates nor understates the true parameter.
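The difference between sampling error (one sample) and bias (the long-run average) can be checked numerically. This is a minimal sketch that assumes a normal population with illustrative parameters μ = 368 and σ = 15; the names `MU`, `SIGMA`, `N`, and `NUM_SAMPLES` are introduced here for the example only.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

MU, SIGMA = 368, 15    # assumed population parameters (illustrative)
N = 5                  # sample size
NUM_SAMPLES = 20_000   # number of repeated samples

# Each entry is the mean of one random sample of size N.
sample_means = [
    statistics.mean([random.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(NUM_SAMPLES)
]

# Sampling error of each individual sample: estimate minus parameter.
sampling_errors = [mean - MU for mean in sample_means]

# The average sampling error approximates the bias E(X-bar) - mu.
estimated_bias = statistics.mean(sampling_errors)

print(f"largest single sampling error: {max(sampling_errors, key=abs):.2f}")
print(f"estimated bias: {estimated_bias:.3f}")
```

Individual sampling errors can be large, but their average should sit close to zero, which is what unbiasedness means.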
Sample Mean Sampling Distribution: Standard Error of the Mean
Different samples of the same size from the same population will yield different sample means.
Standard Error of the Mean: A measure of how much the sample mean varies from sample to sample. It is the standard deviation of the theoretical distribution of all possible sample means for samples of size n.
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$, assuming sampling with replacement, or sampling without replacement from an infinite population.
The standard error of the mean decreases as the sample size increases.
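A small sketch of the standard error formula, σ divided by the square root of n, makes the effect of sample size concrete. The function name `standard_error` and the value σ = 15 are assumptions chosen for illustration.

```python
from math import sqrt


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / sqrt(n)


SIGMA = 15  # assumed population standard deviation (illustrative)

for n in (4, 25, 100, 400):
    print(f"n = {n:>3}: standard error = {standard_error(SIGMA, n):.2f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.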
Sample Mean Sampling Distribution: If the Population is Normal
If a population is normal with mean μ and standard deviation σ, the sampling distribution of $\bar{X}$ is also normal, with:
$\mu_{\bar{x}} = \mu$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
Z-value for Sampling Distribution of the Mean:
$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
Where:
$\bar{X}$ = sample mean
μ = population mean
σ = population standard deviation
n = sample size
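The Z-value formula can be sketched as a small function. The inputs used here (μ = 368, σ = 15, n = 25, and a sample mean of 365) are illustrative assumptions, and `z_for_sample_mean` is a name made up for this example. The cumulative probability comes from `statistics.NormalDist` in the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def z_for_sample_mean(x_bar: float, mu: float, sigma: float, n: int) -> float:
    """Standardize a sample mean: (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / sqrt(n))


# Illustrative values (assumed, not taken from a data set).
z = z_for_sample_mean(x_bar=365, mu=368, sigma=15, n=25)

# Probability that a sample mean falls below 365 when the population is normal.
prob_below = NormalDist().cdf(z)

print(f"Z = {z:.2f}")
print(f"P(sample mean < 365) = {prob_below:.4f}")
```

Note that the denominator is the standard error of the mean, not σ itself; dividing by σ would standardize an individual observation rather than a sample mean.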
Determining An Interval Including A Fixed Proportion of the Sample Means
To find a symmetrically distributed interval around μ that will include 95% of the sample means when μ=368, σ=15, and n=25:
Since the interval contains 95% of the sample means, 5% of the sample means will be outside the interval.
Since the interval is symmetric, 2.5% will be above the upper limit, and 2.5% will be below the lower limit.
From the standardized normal table, the Z score with 2.5% (0.0250) below it is -1.96, and the Z score with 2.5% (0.0250) above it is 1.96.
Calculating the lower limit of the interval:
$\bar{X}_L = \mu + Z\left(\frac{\sigma}{\sqrt{n}}\right) = 368 + (-1.96)\left(\frac{15}{\sqrt{25}}\right) = 362.12$
Calculating the upper limit of the interval:
$\bar{X}_U = \mu + Z\left(\frac{\sigma}{\sqrt{n}}\right) = 368 + (1.96)\left(\frac{15}{\sqrt{25}}\right) = 373.88$
Based on samples of size 25, the sample means in 95% of all samples are between 362.12 and 373.88.
Generalized Equation for the interval that contains some defined percentage of all sample means:
$\mu \pm Z\left(\frac{\sigma}{\sqrt{n}}\right)$
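The interval calculation can be reproduced in a few lines. This sketch uses the example's values (μ = 368, σ = 15, n = 25, 95% coverage); the function name `sample_mean_interval` is an assumption for illustration, and the Z score is taken from `statistics.NormalDist().inv_cdf` rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist


def sample_mean_interval(mu: float, sigma: float, n: int, coverage: float):
    """Symmetric interval around mu containing `coverage` of all sample means."""
    tail = (1 - coverage) / 2            # area left in each tail
    z = NormalDist().inv_cdf(1 - tail)   # about 1.96 for 95% coverage
    half_width = z * sigma / sqrt(n)     # Z times the standard error
    return mu - half_width, mu + half_width


lower, upper = sample_mean_interval(mu=368, sigma=15, n=25, coverage=0.95)
print(round(lower, 2), round(upper, 2))  # prints: 362.12 373.88
```

Changing `coverage` to 0.90 or 0.99 gives the corresponding narrower or wider interval from the same generalized equation.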
Population Proportions
π = the proportion of the population having some characteristic.
The variance of a proportion is defined as $\sigma^2 = \pi(1 - \pi)$, so:
The standard deviation of a proportion is $\sigma = \sqrt{\pi(1 - \pi)}$
Sample proportion (p) provides an estimate of π:
$p = \frac{X}{n} = \frac{\text{number of items of interest in the sample}}{\text{sample size}}$
0≤p≤1
p is approximately normally distributed when n is large (assuming sampling with replacement from a finite population or without replacement from an infinite population).
Sampling Distribution of p
Sampling Distribution of p is approximated by a normal distribution if:
$n\pi > 5$ and $n(1 - \pi) > 5$
Where: $\mu_p = \pi$ and $\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}$
Where: π is the population proportion.
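The two conditions and the standard error of the proportion, the square root of π(1 − π)/n, can be checked with a short sketch. The helper names `normal_approx_ok` and `proportion_standard_error` and the input values are assumptions for illustration.

```python
from math import sqrt


def normal_approx_ok(n: int, pi: float) -> bool:
    """True when both n*pi > 5 and n*(1 - pi) > 5."""
    return n * pi > 5 and n * (1 - pi) > 5


def proportion_standard_error(n: int, pi: float) -> float:
    """Standard error of the sample proportion: sqrt(pi * (1 - pi) / n)."""
    return sqrt(pi * (1 - pi) / n)


# Illustrative inputs: a large and a small sample with pi = 0.4.
for n in (200, 10):
    ok = normal_approx_ok(n, 0.4)
    se = proportion_standard_error(n, 0.4)
    print(f"n = {n:>3}: normal approximation ok = {ok}, standard error = {se:.4f}")
```

With n = 10 and π = 0.4 the expected count of items of interest is only 4, so the first condition fails and the normal approximation should not be relied on.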
Z-Value for Proportions
Standardize p to a Z value with the formula:
$Z = \frac{p - \pi}{\sigma_p} = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}}$
Example: If the true proportion of voters who support Proposition A is π=0.4, what is the probability that a sample of size 200 yields a sample proportion between 0.40 and 0.45?
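One way to work this example is to standardize both limits and take the difference of the cumulative normal probabilities. The sketch below assumes the normal approximation is appropriate (here nπ = 80 and n(1 − π) = 120, both above 5) and uses `statistics.NormalDist` for the cumulative probabilities; the variable names are chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

PI = 0.4   # true population proportion, from the example
N = 200    # sample size, from the example

# Standard error of the sample proportion: sqrt(pi * (1 - pi) / n).
sigma_p = sqrt(PI * (1 - PI) / N)

# Standardize the two limits of the interval for p.
z_lower = (0.40 - PI) / sigma_p   # 0, since the lower limit equals pi
z_upper = (0.45 - PI) / sigma_p   # about 1.44

# P(0.40 <= p <= 0.45) is the area between the two Z values.
std_normal = NormalDist()
probability = std_normal.cdf(z_upper) - std_normal.cdf(z_lower)

print(f"sigma_p = {sigma_p:.4f}")
print(f"Z from {z_lower:.2f} to {z_upper:.2f}")
print(f"P(0.40 <= p <= 0.45) = {probability:.4f}")
```

The result is a probability of roughly 0.43. A lookup in a printed normal table with Z rounded to 1.44 gives 0.4251, which differs slightly from the unrounded calculation.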