7.1: Statistics and Parameters
Statistics and Parameters
- Statistic: a number that describes some characteristic of the sample
- Relevant symbols
- Mean: x̄
- Proportion: p̂
- Standard deviation: s
- Parameter: a number that describes some characteristic of the population
- Relevant symbols
- Mean: μ
- Proportion: p
- Standard deviation: σ
- Samples are taken to estimate a population parameter (μ or p)
- Ultimate goal: Estimate parameters based on statistics
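A minimal sketch of the statistic vs. parameter distinction, using Python's standard `statistics` module. The exam scores, the sample, and the "score of at least 80" category are hypothetical values made up for illustration; in practice the full population is usually too large to measure, which is why the parameters have to be estimated.

```python
import statistics

# Hypothetical population of 10 exam scores (illustrative values only).
population = [62, 70, 71, 75, 78, 80, 84, 88, 91, 95]
# Hypothetical sample of 4 scores drawn from that population.
sample = [70, 80, 84, 91]

# Parameters: describe the whole population.
mu = statistics.mean(population)        # population mean (μ)
sigma = statistics.pstdev(population)   # population standard deviation (σ), divides by N
p = sum(score >= 80 for score in population) / len(population)  # population proportion (p)

# Statistics: describe the sample and are used to estimate the parameters.
x_bar = statistics.mean(sample)         # sample mean (x̄)
s = statistics.stdev(sample)            # sample standard deviation (s), divides by n - 1
p_hat = sum(score >= 80 for score in sample) / len(sample)      # sample proportion (p̂)

print(f"mu = {mu}, x_bar = {x_bar}")
print(f"sigma = {sigma:.2f}, s = {s:.2f}")
print(f"p = {p}, p_hat = {p_hat}")
```

The statistics land near the parameters but do not match them, and a different sample would give different values.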
Distribution, Variability, and Bias
- Sampling distribution: the distribution of values taken by the statistic in all possible samples of the same size from the same population
- E.g., the sampling distribution of the sample mean or the sample proportion
- Sampling variability: how much results vary between samples
- Every time a sample is taken, the results will vary
- Measured using the spread of the sampling distribution
- Based primarily on the size of the random sample
- As a general rule of thumb,
- Larger sample: less variability
- Smaller sample: more variability
- The spread does not depend on the size of the population, as long as the population is at least 10 times as large as the sample
- Biased statistic: a statistic which consistently overestimates or underestimates the parameter (mean or proportion)
- Unbiased statistic: a statistic whose sampling distribution is centered at the true value of the population parameter
Bias vs. variance
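The ideas of center (bias) and spread (variability) can be checked with a small simulation. This is only a sketch under stated assumptions: the population proportion 0.3, the sample sizes 25 and 400, the 2000 repetitions, and the helper name `simulate_p_hats` are all hypothetical choices, and each draw is treated as independent, which models a population far larger than the sample.

```python
import random
import statistics

def simulate_p_hats(p, n, reps, rng):
    """Draw `reps` samples of size n and return the sample proportion from each."""
    p_hats = []
    for _ in range(reps):
        # Each individual is a "success" with probability p.
        successes = sum(rng.random() < p for _ in range(n))
        p_hats.append(successes / n)
    return p_hats

rng = random.Random(0)  # fixed seed so the run is repeatable
p = 0.3                 # assumed population proportion

small = simulate_p_hats(p, n=25, reps=2000, rng=rng)
large = simulate_p_hats(p, n=400, reps=2000, rng=rng)

# Center of each simulated sampling distribution (both should sit near p).
center_small = statistics.mean(small)
center_large = statistics.mean(large)

# Spread of each simulated sampling distribution (larger n should give less spread).
spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)

print(f"n = 25:  center = {center_small:.3f}, spread = {spread_small:.3f}")
print(f"n = 400: center = {center_large:.3f}, spread = {spread_large:.3f}")
```

Both simulated distributions are centered near p, so p̂ shows no systematic over- or underestimate, while the larger sample size gives a visibly smaller spread.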
Means and Proportions
Proportion problems generally involve categorical variables
A binomial distribution (and therefore the sampling distribution of p̂) is approximately normal if np ≥ 10 and n(1-p) ≥ 10
- n = sample size
- p = population proportion
- p̂ = sample proportion
- Normal distribution means the use of z-scores as a standard measure
A statistic is unbiased if the mean of its sampling distribution equals the true parameter; the mean of the sampling distribution of p̂ is p, so p̂ is an unbiased estimator of p
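To see how good the normal approximation is when np ≥ 10 and n(1-p) ≥ 10, an exact binomial probability can be compared with its normal counterpart. This is a sketch, not a prescribed method: n = 100, p = 0.3, and the cutoff of 35 successes are hypothetical values, and the normal curve uses the standard binomial mean np and standard deviation √(np(1-p)).

```python
import math
from statistics import NormalDist

n, p = 100, 0.3   # assumed sample size and population proportion
cutoff = 35       # assumed question: P(at most 35 successes)?

# Large-counts check: both expected counts should be at least 10.
large_counts_ok = n * p >= 10 and n * (1 - p) >= 10

# Exact binomial probability P(X <= cutoff).
exact = sum(
    math.comb(n, k) * p**k * (1 - p) ** (n - k)
    for k in range(cutoff + 1)
)

# Normal approximation with the same mean and standard deviation.
mean = n * p
sd = math.sqrt(n * p * (1 - p))
z = (cutoff - mean) / sd            # z-score of the cutoff
approx = NormalDist().cdf(z)        # area to the left of z on the standard normal curve

print(f"large counts met: {large_counts_ok}")
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}, z = {z:.2f}")
```

The two probabilities should be close but not identical; the gap tends to shrink as n grows and widen when either expected count falls below 10.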
Verifying Conditions
- If an SRS is taken of size n from a large population with proportion p,
- Some conditions must be stated and checked
- Is the population at least 10 times as large as the sample?
- The 10% condition verifies that the standard deviation formula, σ_p̂ = √(p(1-p)/n), may be used
- Is np ≥ 10 and is n(1-p) ≥ 10?
- This verifies that a normal approximation may be used
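- The mean of the sampling distribution of p̂ equals the true population proportion p for any SRS; if these conditions are met, the distribution is also approximately normal with standard deviation √(p(1-p)/n)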
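The condition checks above can be sketched as a short routine. The function names `check_conditions` and `sampling_distribution`, the population size N = 10000, the sample size n = 200, p = 0.4, and the observed p̂ of 0.45 are all hypothetical values chosen for illustration; the standard deviation used is √(p(1-p)/n).

```python
import math
from statistics import NormalDist

def check_conditions(N, n, p):
    """Return (10% condition met, large counts condition met)."""
    ten_percent = n <= 0.10 * N                          # population at least 10x the sample
    large_counts = n * p >= 10 and n * (1 - p) >= 10     # normal approximation is reasonable
    return ten_percent, large_counts

def sampling_distribution(n, p):
    """Approximate sampling distribution of p-hat, valid only if both conditions hold."""
    return NormalDist(mu=p, sigma=math.sqrt(p * (1 - p) / n))

N, n, p = 10_000, 200, 0.4   # assumed population size, sample size, and proportion
ten_percent_ok, large_counts_ok = check_conditions(N, n, p)

if ten_percent_ok and large_counts_ok:
    dist = sampling_distribution(n, p)
    p_hat = 0.45                               # assumed observed sample proportion
    z = (p_hat - dist.mean) / dist.stdev       # z-score of the observed proportion
    prob = 1 - dist.cdf(p_hat)                 # P(p-hat >= 0.45)
    print(f"mean = {dist.mean}, sd = {dist.stdev:.4f}, z = {z:.2f}, P = {prob:.4f}")
else:
    print("Conditions not met; the normal model should not be used.")
```

Checking the conditions before building the normal model mirrors the order in the notes: state and verify first, then calculate.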