Descriptive Statistics: Percentiles, Z-Scores, and the Empirical Rule
Measures of Center
- Mean (average): $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
- Median: Middle value when data are ordered; average of two middle values if even count.
- Mode: Most frequently occurring value.
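As a minimal sketch of the three measures of center, the snippet below uses Python's standard `statistics` module. The `scores` list is made-up illustration data, not taken from these notes.

```python
from statistics import mean, median, multimode

# Hypothetical sample data (an assumption for illustration only)
scores = [2, 3, 3, 5, 7, 10]

center_mean = mean(scores)        # sum of values divided by the count
center_median = median(scores)    # even count, so the two middle values (3 and 5) are averaged
center_modes = multimode(scores)  # list of the most frequent value(s)

print(center_mean, center_median, center_modes)
```

`multimode` is used rather than `mode` so that data with several equally frequent values returns all of them.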
Measures of Variation (Spread)
- Range: $\text{Range} = x_{\max} - x_{\min}$
- Interquartile Range (IQR): IQR=Q3−Q1 (Q1 = median of lower half, Q3 = median of upper half).
- Standard Deviation (spread about the center):
  - Sample: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
  - Population: $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$
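The measures of spread above can be sketched in Python as follows. The data set is hypothetical, and the quartiles use the "median of each half" rule stated in these notes, which can differ slightly from the interpolation methods used by some software.

```python
from statistics import median, stdev, pstdev

# Hypothetical data (an assumption for illustration only), sorted for the quartile step
data = sorted([4, 7, 9, 10, 12, 15, 18, 21])

data_range = data[-1] - data[0]  # x_max - x_min

half = len(data) // 2
lower_half = data[:half]         # values below the median position
upper_half = data[-half:]        # values above it (the middle value is excluded when n is odd)
q1 = median(lower_half)
q3 = median(upper_half)
iqr = q3 - q1

s = stdev(data)                  # sample standard deviation, divides by n - 1
sigma = pstdev(data)             # population standard deviation, divides by N

print(data_range, q1, q3, iqr, s, sigma)
```

Because the sample formula divides by n − 1 rather than N, `s` is always slightly larger than `sigma` for the same data.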
Percentile
- Definition: A value below which a given percentage of observations fall. For example, a value at the 20th percentile is greater than 20% of the observations.
- Computation: On a dot plot, count the dots smaller than the target value (x) and the total number of dots (n). Percentile = $\frac{x}{n} \times 100$. Round down to the nearest integer.
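A minimal sketch of the counting procedure above. The function name `percentile_rank` and the `heights` data are hypothetical. Integer floor division performs the "round down" step without floating-point error.

```python
def percentile_rank(data, target):
    """Percentage of values strictly below target, rounded down to an integer."""
    below = sum(1 for value in data if value < target)  # x in the formula above
    total = len(data)                                   # n in the formula above
    return below * 100 // total

# Hypothetical data (an assumption for illustration only)
heights = [150, 152, 155, 158, 160, 163, 165, 170, 172, 180]

rank = percentile_rank(heights, 165)
print(rank)  # 6 of 10 values are below 165, so this prints 60
```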
Z-Score (Standard Score)
- Definition: How many standard deviations an observation is from the mean.
- Formula: $z = \frac{x - \mu}{\sigma}$
- Interpretation: Positive z means above mean, negative z means below mean. Magnitude indicates distance in standard deviations.
- Why use: Enables comparison of values from different distributions (different scales/units).
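To illustrate comparing values from different scales, here is a small sketch. The two exams, their means, and their standard deviations are invented numbers, not data from these notes.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical exam A: score 82, mean 70, standard deviation 8
z_a = z_score(82, 70, 8)

# Hypothetical exam B: score 650, mean 500, standard deviation 120
z_b = z_score(650, 500, 120)

print(z_a, z_b)  # 1.5 and 1.25
```

Although 650 is the larger raw score, the exam A result sits further above its own mean (1.5 standard deviations versus 1.25), so it is the stronger relative performance.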
Empirical Rule (68-95-99.7)
- Applies to approximately normal distributions.
- About 68% of data within μ±σ.
- About 95% within μ±2σ.
- About 99.7% within μ±3σ.
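The three percentages can be checked against an exact normal distribution. This sketch relies on the standard identity that the fraction within k standard deviations of the mean equals erf(k/√2). The helper name `fraction_within` is hypothetical.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Prints roughly 68.3, 95.4 and 99.7 percent
    print(k, round(fraction_within(k) * 100, 1))
```

Real data that are only approximately normal will match these percentages only approximately.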
Summary Takeaways
- Descriptive statistics include measures of center (mean, median, mode) and variation (range, IQR, standard deviation).
- Percentiles describe relative standing; z-scores describe distance from the mean in standard deviations.
- Use z-scores or percentiles for fair comparisons across different distributions.
- Excel (AVERAGE, MEDIAN, STDEV.S, QUARTILE.EXC, STANDARDIZE) automates these calculations.