Descriptive Statistics: Percentiles, Z-Scores, and the Empirical Rule

Measures of Center
  • Mean (average): xˉ=1n<em>i=1nx</em>i\bar{x} = \frac{1}{n} \sum<em>{i=1}^{n} x</em>i
  • Median: Middle value when data are ordered; average of two middle values if even count.
  • Mode: Most frequently occurring value.
Measures of Variation (Spread)
  • Range: Range=x<em>maxx</em>min\text{Range} = x<em>{\max} - x</em>{\min}
  • Interquartile Range (IQR): IQR=Q3Q1\text{IQR} = Q3 - Q1 (Q1 = median of lower half, Q3 = median of upper half).
  • Standard Deviation (spread about the center):
    • Sample: s=1n1<em>i=1n(x</em>ixˉ)2s = \sqrt{\frac{1}{n-1} \sum<em>{i=1}^{n} (x</em>i - \bar{x})^2}
    • Population: σ=1N<em>i=1N(x</em>iμ)2\sigma = \sqrt{\frac{1}{N} \sum<em>{i=1}^{N} (x</em>i - \mu)^2}
Percentile
  • Definition: A value below which a given percentage of observations fall. E.g., 20th percentile means better than 20% of data.
  • Computation: Count dots smaller than target (xx), total dots (nn). Percentile = xn×100\frac{x}{n} \times 100. Round down to nearest integer.
Z-Score (Standard Score)
  • Definition: How many standard deviations an observation is from the mean.
  • Formula: z=xμσz = \frac{x - \mu}{\sigma}
  • Interpretation: Positive z means above mean, negative z means below mean. Magnitude indicates distance in standard deviations.
  • Why use: Enables comparison of values from different distributions (different scales/units).
Empirical Rule (68-95-99.7)
  • Applies to approximately normal distributions.
  • About 68% of data within μ±σ\mu \pm \sigma.
  • About 95% within μ±2σ\mu \pm 2\sigma.
  • About 99.7% within μ±3σ\mu \pm 3\sigma.
Summary Takeaways
  • Descriptive statistics include measures of center (mean, median, mode) and variation (range, IQR, standard deviation).
  • Percentiles describe relative standing; z-scores describe distance from the mean in standard deviations.
  • Use z-scores or percentiles for fair comparisons across different distributions.
  • Excel (AVERAGE, MEDIAN, STDEV.S, QUARTILE.EXC, STANDARDIZE) automates these calculations.