Measures of Dispersion — Section 3.2 Notes
Range
Definition: The range measures the spread between the smallest and largest values in a data set.
Formula: R = \text{Largest Value} - \text{Smallest Value}
Example: Data set 125, 175, 200, 225, 250
Largest = 250, Smallest = 125
Range = 250 − 125 = 125
Pros and cons:
Pros: Very simple and quick to compute
Cons: Only uses two data values (extremes)
Susceptible to extreme values; NOT resistant to outliers
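The range computation above is a one-liner; a quick Python sketch using the example data:

```python
# Range: difference between the largest and smallest values in the data set.
data = [125, 175, 200, 225, 250]

r = max(data) - min(data)
print(r)  # 125
```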
Standard deviation
Concept:
Based on deviations from the mean
Deviations from the mean always sum to zero, which is why they are squared before averaging
Describes the "typical" deviation from the mean
Other options (MAD, variance) exist but are less intuitive for describing spread around the mean
Population standard deviation
Formula: \sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}
Interpretation: the square root of the average squared deviation from the mean
Sample standard deviation
Formula: s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}
Uses n−1 in the denominator so that the sample variance (s^2) is an unbiased estimator of the population variance
Hand calculation illustration (population, N=5): data 1, 2, 3, 4, 5
Mean: \mu = \frac{1+2+3+4+5}{5} = 3
Deviations: (-2,-1,0,1,2)
Squared deviations: (4,1,0,1,4)
Sum: \sum (x_i-\mu)^2 = 10
Population variance: \sigma^2 = \frac{10}{5} = 2
Population SD: \sigma = \sqrt{2} \approx 1.4142
Example with a sample (same data, n=5):
Sum of squared deviations: 10
Sample variance: s^2 = \frac{10}{n-1} = \frac{10}{4} = 2.5
Sample SD: s = \sqrt{2.5} \approx 1.5811
Calculation tables: Conceptual tool to organize the steps (x, x−μ, (x−μ)²) for both population and sample calculations
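The hand-calculation steps above (mean, deviations, squared deviations, sum, divide, square root) can be sketched in Python with the same data:

```python
import math

# Data 1..5, treated first as a population (N = 5), then as a sample (n = 5).
data = [1, 2, 3, 4, 5]
n = len(data)

mean = sum(data) / n                        # mu = 3
sq_devs = [(x - mean) ** 2 for x in data]   # squared deviations: 4, 1, 0, 1, 4
ss = sum(sq_devs)                           # sum of squared deviations = 10

pop_var = ss / n                 # sigma^2 = 10/5 = 2
pop_sd = math.sqrt(pop_var)      # sigma = sqrt(2)

samp_var = ss / (n - 1)          # s^2 = 10/4 = 2.5
samp_sd = math.sqrt(samp_var)    # s = sqrt(2.5)

print(round(pop_sd, 4), round(samp_sd, 4))  # 1.4142 1.5811
```

The only difference between the two results is the denominator: N for the population, n−1 for the sample.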
A few notes about standard deviation
Rounding:
Be careful with rounding intermediate calculations; small changes can affect the final SD value
Interpretation:
Large deviations from the mean lead to a larger SD; small deviations lead to a smaller SD
Resistance:
Standard deviation is NOT resistant to outliers; a few extreme values can substantially increase SD
n−1 denominator (why):
The use of n−1 makes the estimator of the population variance unbiased when sampling
Comparing standard deviations
When to compare:
If two samples/populations share the same units and are on the same scale, SDs can be directly compared
When units differ:
Use the coefficient of variation (CV), a unitless measure of relative spread
Coefficient of variation (CV):
Formula: \text{CV} = \frac{\sigma}{|\mu|}; often multiplied by 100\% and reported as a percentage
Interpretation: SD expressed as a fraction of the mean; enables comparison across different units/scales
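A short Python sketch of the CV idea, using two hypothetical samples in different units (the height and weight values are made up for illustration):

```python
import statistics

# Hypothetical samples measured in different units:
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [55, 62, 70, 78, 85]

def cv(data):
    """Coefficient of variation: sample SD as a fraction of |mean|."""
    return statistics.stdev(data) / abs(statistics.mean(data))

# The raw SDs are not comparable (cm vs kg), but the CVs are unitless:
print(round(cv(heights_cm), 3), round(cv(weights_kg), 3))  # 0.047 0.172
```

Here the weights vary more *relative to their mean* than the heights do, even though the units differ.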
Variance
Relationship to SD:
Variance is the square of the standard deviation
Population variance: \sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2
Sample variance: s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2
Interpretation:
In the hand-calculation context, variance is the quantity inside the square root
Units are squared, which can make interpretation less intuitive
Role in inferential statistics:
Variance is foundational for many inferential procedures, including hypothesis testing and confidence intervals
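Python's standard library exposes all four quantities directly, which is a convenient way to check hand calculations (same data 1..5 as above):

```python
import statistics

data = [1, 2, 3, 4, 5]

# Population versions divide by N; sample versions divide by n - 1.
assert statistics.pvariance(data) == 2       # sigma^2
assert statistics.variance(data) == 2.5      # s^2
assert round(statistics.pstdev(data), 4) == 1.4142   # sigma = sqrt(2)
assert round(statistics.stdev(data), 4) == 1.5811    # s = sqrt(2.5)
```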
Notes about StatCrunch
Mean:
The mean formula is the same for samples and populations; StatCrunch treats the mean as the same quantity in either case
Standard deviation in StatCrunch:
The default “standard deviation” in StatCrunch is the sample SD
To obtain the population SD, choose “unadj. standard deviation”
Practical takeaway for exams:
Know how to compute by hand (sum of squared deviations, divide by N or n−1, take the square root)
The empirical rule (68-95-99.7 rule)
Typically described for bell-shaped (normal) distributions:
About 68% of data lie within 1 standard deviation of the mean
About 95% lie within 2 standard deviations
About 99.7% lie within 3 standard deviations
The empirical rule also aligns with the idea of a normal distribution where most data cluster near the mean
Visual shorthand:
1 SD: 68%
2 SD: 95%
3 SD: 99.7%
Special tail details (for normal):
About 0.15% lie beyond 3 SD on each tail (total ~0.3% beyond |Z|>3)
About 2.35% lie between 2 and 3 SD on each tail; about 2.5% lie beyond 2 SD on each tail (total ~5% beyond |Z|>2)
Example: Bell-shaped distribution with mean 150 and standard deviation 15
68% of observations lie between which values?
Between 150 - 15 = 135 and 150 + 15 = 165
Interval: [135, 165]
95% of observations lie between which values?
Between 150 - 2\times 15 = 120 and 150 + 2\times 15 = 180
Interval: [120, 180]
What percentage lie between 105 and 195?
This is within 3 standard deviations: 150 \pm 3\times 15, i.e., 105 and 195
Percentage: approximately 99.7%
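The three intervals in this example follow the same pattern (mean ± k·SD), which a small Python loop makes explicit:

```python
# Empirical-rule intervals for a bell-shaped distribution
# with mean 150 and standard deviation 15.
mean, sd = 150, 15

for k, pct in [(1, 68), (2, 95), (3, 99.7)]:
    lo, hi = mean - k * sd, mean + k * sd
    print(f"about {pct}% within [{lo}, {hi}]")
# about 68% within [135, 165]
# about 95% within [120, 180]
# about 99.7% within [105, 195]
```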
Chebyshev’s inequality
Scope:
Applies to any distribution, not just bell-shaped
Statement:
For any distribution with mean \mu and standard deviation \sigma, at least \left(1 - \frac{1}{k^2}\right) \times 100\% of observations lie within k standard deviations of the mean, for any k > 1
Key takeaway:
Guarantees a minimum proportion of observations within k\sigma of the mean, regardless of distribution shape
Example: Chebyshev (mean = 150, SD = 15)
Question: At least what percentage lie between 120 and 180?
Here, 120 and 180 are within 2 standard deviations of the mean (k = 2)
Minimum percentage: \left(1 - \frac{1}{2^2}\right) \times 100\% = \left(1 - \frac{1}{4}\right) \times 100\% = 75\%
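The bound 1 − 1/k² is just arithmetic in k, so it is easy to tabulate; a minimal Python sketch:

```python
# Chebyshev's inequality: at least 1 - 1/k^2 of observations lie within
# k standard deviations of the mean, for ANY distribution (k > 1).
def chebyshev_min_fraction(k):
    return 1 - 1 / k**2

print(chebyshev_min_fraction(2))           # 0.75  -> at least 75% within 2 SD
print(round(chebyshev_min_fraction(3), 3)) # 0.889 -> at least 88.9% within 3 SD
```

Note how much weaker the k = 2 bound (75%) is than the empirical rule's 95%; Chebyshev trades precision for universality.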
Empirical rule vs Chebyshev: a quick comparison
Distribution requirements:
Empirical rule assumes a bell-shaped (normal) distribution
Chebyshev’s inequality places no assumption on shape; it applies to any distribution
Nature of the statement:
Empirical rule gives approximate percentages for normal data
Chebyshev provides a guaranteed minimum ("at least"), not an exact proportion
Practical interpretation:
If you know the data are roughly normal, use the empirical rule for quick estimates
If the distribution shape is unknown or non-normal, use Chebyshev to obtain a conservative bound
Quick recap and study-ready takeaways
Range: simplest spread measure; sensitive to outliers
Standard deviation: core measure of spread around the mean; population SD uses N in the denominator, sample SD uses n−1
Variance: the squared SD; easier to manipulate in algebraic/statistical derivations; units are squared
Coefficient of variation: unitless way to compare spread across different means/units
The empirical rule: best for normal distributions; gives quick spread estimates
Chebyshev’s inequality: universal spread bound; useful when distribution is unknown
When using software (e.g., StatCrunch): know which SD you’re getting by default and how to obtain the population version if needed