7.1: Statistics and Parameters

Statistics and Parameters

  • Statistic: a number that describes some characteristic of the sample
    * Relevant symbols
      * Mean: x̄
      * Proportion: p̂
      * Standard deviation: S
  • Parameter: a number that describes some characteristic of the population
    * Relevant symbols
      * Mean: μ
      * Proportion: p
      * Standard deviation: σ
  • Samples are taken to estimate a population parameter (μ or p)
  • Ultimate goal: Estimate parameters based on statistics
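The distinction between parameters and statistics can be sketched with a small simulation. This is only an illustration: the population of 10,000 exam scores, the cutoff of 80, and the sample size of 100 are all made-up assumptions, not values from these notes.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 exam scores (made-up numbers).
population = [random.gauss(70, 10) for _ in range(10_000)]

# Parameters describe the whole population.
mu = statistics.fmean(population)       # population mean
sigma = statistics.pstdev(population)   # population standard deviation
p = sum(score >= 80 for score in population) / len(population)  # population proportion

# Statistics describe one sample drawn from that population.
sample = random.sample(population, 100)  # simple random sample, n = 100
x_bar = statistics.fmean(sample)         # sample mean
s = statistics.stdev(sample)             # sample standard deviation
p_hat = sum(score >= 80 for score in sample) / len(sample)  # sample proportion

print(f"mean:       mu    = {mu:.2f}   x_bar = {x_bar:.2f}")
print(f"std dev:    sigma = {sigma:.2f}   s     = {s:.2f}")
print(f"proportion: p     = {p:.3f}   p_hat = {p_hat:.3f}")
```

In practice the parameters are unknown, which is why the statistics on the right are used to estimate them.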

Distribution, Variability, and Bias

  • Sampling distribution: the distribution of values taken by the statistic in all possible samples of the same size from the same population
    * E.g., sample mean, sample proportion
  • Sampling variability: how much the value of a statistic varies from sample to sample
    * Every time a sample is taken, the results will vary
    * Measured using the spread of the sampling distribution
    * Determined primarily by the size of the random sample
  • As a general rule of thumb,
    * Larger sample: less variability
    * Smaller sample: more variability
  • The spread does not depend on the size of the population, as long as the population is at least 10 times as large as the sample
  • Biased statistic: a statistic that consistently overestimates or underestimates the parameter (mean or proportion)
  • Unbiased statistic: a statistic whose sampling distribution is centered at the true value of the population parameter
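A rough simulation can illustrate both ideas: the spread of the sampling distribution shrinks as the sample size grows, while its center stays near the true parameter. The true proportion of 0.3, the sample sizes 25 and 400, and the helper `simulate_p_hats` are assumptions chosen for illustration. Each observation is drawn independently, which approximates sampling from a very large population.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

TRUE_P = 0.3  # assumed population proportion (illustrative)

def simulate_p_hats(n, num_samples=2000):
    """Return the sample proportion p-hat from many simulated samples of size n."""
    p_hats = []
    for _ in range(num_samples):
        successes = sum(random.random() < TRUE_P for _ in range(n))
        p_hats.append(successes / n)
    return p_hats

small = simulate_p_hats(25)    # many samples of size 25
large = simulate_p_hats(400)   # many samples of size 400

for label, p_hats in (("n = 25", small), ("n = 400", large)):
    center = statistics.fmean(p_hats)
    spread = statistics.stdev(p_hats)
    print(f"{label}: center = {center:.3f}, spread = {spread:.3f}")
```

Both simulated distributions should be centered close to 0.3, which is what being unbiased looks like, and the n = 400 distribution should have a noticeably smaller spread.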

Bias vs. variance

Means and Proportions

  • Proportion problems generally involve categorical variables
  • A binomial distribution is approximately normal if np ≥ 10 and n(1-p) ≥ 10
    * n = sample size
    * p = population proportion
    * p̂ = sample proportion
    * A normal approximation allows z-scores to be used as a standardized measure
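As a sketch of how the np ≥ 10 and n(1-p) ≥ 10 check relates to the normal approximation, the snippet below compares an exact binomial probability with the z-score version. The values n = 100, p = 0.4, and k = 45, and the helper names `large_counts_ok` and `binom_cdf`, are hypothetical. No continuity correction is applied, so a small gap between the two answers is expected.

```python
import math
from statistics import NormalDist

def large_counts_ok(n, p):
    """Check np >= 10 and n(1 - p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

n, p = 100, 0.4   # illustrative values
k = 45            # we want P(X <= 45)

exact = binom_cdf(k, n, p)

# Normal approximation: standardize k using the binomial mean and standard deviation.
mean = n * p
sd = math.sqrt(n * p * (1 - p))
z = (k - mean) / sd
approx = NormalDist().cdf(z)

print(f"conditions met: {large_counts_ok(n, p)}")
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f} (z = {z:.2f})")
```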

 

  • A statistic is unbiased if the mean of its sampling distribution equals the true parameter. The mean of the sampling distribution of p̂ is p, so p̂ is an unbiased estimator of p

Verifying Conditions

  • If an SRS of size n is taken from a large population with proportion p,
    * Some conditions must be stated and checked
      * Is the population at least 10 times as large as the sample?
        * The 10% condition verifies that the standard deviation formula may be used
      * Is np ≥ 10 and n(1-p) ≥ 10?
        * This verifies that a normal approximation may be used
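  • The mean of the sampling distribution of p̂ equals the true population proportion p whether or not these conditions hold; when they are met, the distribution is also approximately normal with standard deviation √(p(1-p)/n)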
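The condition checks can be sketched as a small helper. The function name `describe_p_hat`, its return format, and the example numbers (n = 200, p = 0.25, population of 50,000) are assumptions for illustration. The standard deviation uses the usual formula √(p(1-p)/n), which is reported only when the 10% condition holds.

```python
import math
from statistics import NormalDist

def describe_p_hat(n, p, population_size):
    """Check both conditions and describe the sampling distribution of p-hat."""
    ten_percent_ok = population_size >= 10 * n            # 10% condition
    large_counts_ok = n * p >= 10 and n * (1 - p) >= 10   # normal approximation check
    return {
        "mean": p,  # the mean of p-hat equals p regardless of the conditions
        "sd": math.sqrt(p * (1 - p) / n) if ten_percent_ok else None,
        "ten_percent_ok": ten_percent_ok,
        "large_counts_ok": large_counts_ok,
    }

# Illustrative values: SRS of 200 from a population of 50,000 with p = 0.25.
result = describe_p_hat(n=200, p=0.25, population_size=50_000)
print(result)

# If both conditions hold, a z-score gives an approximate probability,
# here P(p-hat <= 0.30).
if result["ten_percent_ok"] and result["large_counts_ok"]:
    z = (0.30 - result["mean"]) / result["sd"]
    print(f"z = {z:.2f}, P(p_hat <= 0.30) is about {NormalDist().cdf(z):.4f}")
```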
