7.1: Statistics and Parameters

Statistics and Parameters

  • Statistic: a number that describes some characteristic of the sample
    • Relevant symbols
      • Mean: x̄
      • Proportion: p̂
      • Standard deviation: s
  • Parameter: a number that describes some characteristic of the population
    • Relevant symbols
      • Mean: μ
      • Proportion: p
      • Standard deviation: σ
  • Samples are taken to estimate an unknown population parameter (such as μ or p)
  • Ultimate goal: Estimate parameters based on statistics
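
The distinction between parameters and statistics can be illustrated with a small simulation. This is only a sketch: the population below is made up (10,000 values drawn from a normal model with mean 50 and standard deviation 10), and the sample size of 100 is an arbitrary choice. The names `population`, `mu`, `sigma`, `x_bar`, and `s` are introduced here for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 made-up measurements.
population = [random.gauss(50, 10) for _ in range(10_000)]

# Parameters describe the whole population.
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)  # population standard deviation

# Statistics describe one sample drawn from that population.
sample = random.sample(population, 100)  # sampling without replacement
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)  # sample standard deviation

print(f"mu    = {mu:.2f}, x_bar = {x_bar:.2f}")
print(f"sigma = {sigma:.2f}, s     = {s:.2f}")
```

In practice μ and σ are unknown; only x̄ and s can be computed, and they serve as estimates.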

Distribution, Variability, and Bias

  • Sampling distribution: the distribution of values taken by the statistic in all possible samples of the same size from the same population
    • E.g., the sampling distribution of the sample mean or of the sample proportion
  • Sampling variability: how much results vary between samples
    • Every time a sample is taken, the results will vary
    • Measured using the spread of the sampling distribution
    • Depends primarily on the size of the random sample
  • As a general rule of thumb:
    • Larger sample: less variability
    • Smaller sample: more variability
  • The spread does not depend on the size of the population, as long as the population is at least 10 times as large as the sample
  • Biased statistic: a statistic that consistently overestimates or underestimates the parameter (e.g., the mean or proportion)
  • Unbiased statistic: a statistic whose sampling distribution is centered at the true value of the population parameter
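
A rough simulation can make sampling variability and unbiasedness concrete. The sketch below assumes a very large population with true proportion 0.3 (an illustrative value), so each individual is modeled as an independent draw. It repeats the sampling 2,000 times for each of three sample sizes; `simulate_p_hats` and `results` are hypothetical names.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

P_TRUE = 0.3   # assumed population proportion (illustrative)
REPS = 2000    # number of repeated samples per sample size

def simulate_p_hats(n, reps=REPS, p=P_TRUE):
    """Return `reps` sample proportions, each from a sample of size n."""
    return [sum(random.random() < p for _ in range(n)) / n for _ in range(reps)]

results = {}
for n in (25, 100, 400):
    p_hats = simulate_p_hats(n)
    # Center and spread of the simulated sampling distribution of p-hat.
    results[n] = (statistics.fmean(p_hats), statistics.stdev(p_hats))
    center, spread = results[n]
    print(f"n = {n:3d}: center = {center:.3f}, spread = {spread:.3f}")
```

The center should stay close to 0.3 for every sample size (p̂ is unbiased), while the spread should shrink as n grows (larger samples vary less).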

Bias vs. variance

Means and Proportions

  • Proportion problems generally involve categorical variables

  • The binomial distribution of the sample count (and therefore the sampling distribution of p̂) is approximately normal if np ≥ 10 and n(1-p) ≥ 10

    • n = sample size
    • p = population proportion
    • p̂ = sample proportion
    • When the normal approximation applies, z-scores can be used to standardize values and find probabilities

  • A statistic is unbiased if the mean of its sampling distribution equals the true parameter. The mean of the sampling distribution of p̂ is p, so p̂ is an unbiased estimator of p
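
As a worked sketch of using z-scores with the normal approximation, the example below standardizes a sample proportion. The values p = 0.5, n = 100, and p̂ = 0.56 are hypothetical, and the standard deviation formula √(p(1-p)/n) is the usual one for p̂, which these notes refer to but do not write out.

```python
import math
from statistics import NormalDist

p = 0.5       # hypothetical population proportion
n = 100       # hypothetical sample size
p_hat = 0.56  # hypothetical observed sample proportion

# Large counts check: the normal approximation is reasonable if both hold.
normal_ok = n * p >= 10 and n * (1 - p) >= 10

sd_p_hat = math.sqrt(p * (1 - p) / n)  # standard deviation of p-hat
z = (p_hat - p) / sd_p_hat             # standardized distance of p-hat from p
prob_above = 1 - NormalDist().cdf(z)   # approximate P(p-hat >= 0.56)

print(f"normal approximation ok: {normal_ok}")
print(f"sd = {sd_p_hat:.3f}, z = {z:.2f}, P = {prob_above:.3f}")
```

Here z is about 1.2, so under the approximation a sample proportion of 0.56 or more would occur in roughly 11.5% of samples.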

Verifying Conditions

  • If an SRS is taken of size n from a large population with proportion p,
    • Some conditions must be stated and checked
    • Is the population at least 10 times as large as the sample?
      • This is the 10% condition; it verifies that the standard deviation formula √(p(1-p)/n) may be used
    • Is np ≥ 10 and is n(1-p) ≥ 10?
      • This verifies that a normal approximation may be used
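  • The mean of the sampling distribution of p̂ equals the true population proportion p whenever the sample is random; if these conditions are also met, the standard deviation formula applies and the sampling distribution is approximately normal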
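
The two checks above can be written as a small helper. This is a minimal sketch, not a standard library function: `check_conditions` and its argument names are hypothetical, and the population size must be known or at least bounded below.

```python
def check_conditions(population_size, n, p):
    """Check the 10% condition and the large counts condition for p-hat.

    population_size: number of individuals in the population
    n: sample size
    p: population proportion
    """
    # 10% condition: the population is at least 10 times the sample size.
    ten_percent = population_size >= 10 * n
    # Large counts condition: expected successes and failures are both >= 10.
    large_counts = n * p >= 10 and n * (1 - p) >= 10
    return {"ten_percent": ten_percent, "large_counts": large_counts}


# Example with made-up numbers: 100 people sampled from 5,000, with p = 0.3.
print(check_conditions(5000, 100, 0.3))
```

If `ten_percent` is false, the standard deviation formula is not justified; if `large_counts` is false, the normal approximation is not.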
