Mean, Median, and Standard Deviation: Lecture Notes

Data/context from the problem

Given in the problem: the mean and standard deviation, which are the key summaries of the data (measured in months).
- Mean: $\mu = 38\ \text{months}$
- Standard deviation: $\sigma = 6.5\ \text{months}$
These two values are important for the whole semester and for interpreting the data.
They form the basis of the discussions on center, spread, and sampling variability.

The Mean: first moment, center of mass, and its properties

Origin and purpose
- Historically rooted in physics: physicists sought the balance point (center of mass) of a lever, not just a representative value.
- The mean is the center of mass of the data when treated as a physical system of equal-mass points.
Mathematical definition (population mean)
- The mean is the average of the data: $\mu = \frac{1}{n} \sum_{i=1}^n x_i.$
A key physical property
- The mean balances deviations: the sum of all deviations from the mean equals zero:
 $\sum_{i=1}^n (x_i - \mu) = 0.$
- This balancing property is why the mean serves as a natural equilibrium point in many analyses.
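The balancing property can be checked numerically. The following is a minimal sketch; the data values are hypothetical (chosen only so that the mean comes out to 38 months) and are not taken from the lecture.

```python
from math import isclose


def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)


# Hypothetical data in months (illustrative only, not from the lecture).
data = [30, 35, 38, 41, 46]

mu = mean(data)
deviations = [x - mu for x in data]

print(mu)                                           # 38.0
print(deviations)                                   # [-8.0, -3.0, 0.0, 3.0, 8.0]
# The signed deviations cancel, so their sum is zero (up to rounding).
print(isclose(sum(deviations), 0.0, abs_tol=1e-9))  # True
```

With real-valued data the sum may differ from zero by a tiny rounding error, which is why the check uses a tolerance rather than exact equality.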
Sensitivity to outliers
- The mean is sensitive to outliers: extreme values pull the average toward them (e.g., a few very high scores raise the average).
- Example scenario: if a few students score extremely high, the average increases, even if most scores are moderate.
The mean as a measure of central tendency
- In statistics, the mean is a measure of center, also called a measure of central tendency.
- In physics, it is literally the center of mass; in statistics, it is the balance point of the data set.
Calculating with data and interpretation
- Different data sets can have the same mean but different spreads (variation around the mean).
Related terms
- Measure of center; center of mass; first moment.

The Median: a robust alternative to the mean

What is the median?
- The middle value when the data are ordered.
- For a sample, you sort the data and pick the middle value (or the average of the two middle values if n is even).
Sensitivity to outliers
- The median is not as sensitive to outliers as the mean; it resists a few extreme values shifting the central value.
Practical use in grading (from the lecture)
- Instructors sometimes use the median to avoid distortion when a few students spike the average.
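To make the contrast with the mean concrete, here is a small sketch of the median rule described above, applied to hypothetical exam scores (the numbers are invented for illustration). Two very high scores are then added to see how far each statistic moves.

```python
def mean(values):
    """Arithmetic mean of the values."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical exam scores (illustrative only).
scores = [62, 65, 68, 70, 71, 73]
with_outliers = scores + [100, 100]  # two students score extremely high

print(median(scores))         # 69.0
print(median(with_outliers))  # 70.5
print(mean(with_outliers))    # 76.125

mean_shift = mean(with_outliers) - mean(scores)
median_shift = median(with_outliers) - median(scores)
print(mean_shift > median_shift)  # True: the mean moves much more than the median
```

In this made-up example the mean rises by roughly 8 points while the median rises by 1.5, which is the kind of distortion the grading remark refers to.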

The Standard Deviation: the second moment and the measure of spread

Motivation: same mean can come from different data shapes
- You can have two datasets with the same mean but different variability; the spread matters for interpretation.
From deviation sums to a usable spread measure
- Deviations from the mean by themselves sum to zero, so you cannot simply average the signed deviations to get spread.
The second moment and why squaring deviations
- To avoid the cancellation of positive and negative deviations, square each deviation:
- Deviation for a point: $d_i = x_i - \mu$
- Squared deviation: $d_i^2 = (x_i - \mu)^2$
- The sum of squared deviations is no longer zero, providing a meaningful measure of spread.
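A short sketch of why squaring helps, using two hypothetical data sets (in months) that share the same mean. The signed deviations cancel in both, while the sums of squared deviations differ and so distinguish the tight set from the wide one.

```python
def mean(values):
    """Arithmetic mean of the values."""
    return sum(values) / len(values)


def sum_of_deviations(values):
    """Sum of signed deviations from the mean (always zero up to rounding)."""
    mu = mean(values)
    return sum(x - mu for x in values)


def sum_of_squared_deviations(values):
    """Sum of squared deviations from the mean (zero only if all values are equal)."""
    mu = mean(values)
    return sum((x - mu) ** 2 for x in values)


# Two hypothetical data sets with the same mean but different spread.
tight = [36, 37, 38, 39, 40]
wide = [28, 33, 38, 43, 48]

print(mean(tight), mean(wide))                            # 38.0 38.0
print(sum_of_deviations(tight), sum_of_deviations(wide))  # 0.0 0.0
print(sum_of_squared_deviations(tight))                   # 10.0
print(sum_of_squared_deviations(wide))                    # 250.0
```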
How the standard deviation is defined (population vs. sample)
- Population standard deviation:
 $\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^n (x_i - \mu)^2}.$
- Sample standard deviation (divides by $n-1$ so that the sample variance $s^2$ is an unbiased estimator of $\sigma^2$):
 $s = \sqrt{\frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2}.$
- NOTE: In many course discussions, the data are treated as samples; the mean used is $\bar{x}$, with the corresponding sample standard deviation $s$.
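The two definitions differ only in the divisor. The sketch below implements both directly from the formulas and compares them with `statistics.pstdev` and `statistics.stdev` from Python's standard library; the data set is hypothetical and chosen so the arithmetic is easy to follow by hand.

```python
import math
import statistics


def population_sd(values):
    """Population standard deviation: divide the squared deviations by n."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


def sample_sd(values):
    """Sample standard deviation: divide the squared deviations by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample standard deviation needs at least two values")
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


# Hypothetical data: mean 5, sum of squared deviations 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_sd(data))        # 2.0  (sqrt(32 / 8))
print(round(sample_sd(data), 3))  # 2.138  (sqrt(32 / 7))

# The hand-written versions agree with the standard library.
print(math.isclose(population_sd(data), statistics.pstdev(data)))  # True
print(math.isclose(sample_sd(data), statistics.stdev(data)))       # True
```

The sample value is always the larger of the two for the same data, and the gap shrinks as n grows.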
The connection to inertia and physics
- The term “second moment” comes from physics: it is analogous to the moment of inertia, which measures resistance to changes in rotation. In statistics, the second moment about the mean measures spread around the mean.
Relationship to the Gaussian (normal) distribution
- A common idealized data shape is the Gaussian distribution (bell curve). Data sets with the same mean but smaller or larger spread correspond to bell curves with the same center but different widths.
- The density of the normal distribution is given by:
  $f(x\mid \mu,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).$
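The density formula translates directly into code. This is a minimal sketch that evaluates it with the lecture's values $\mu = 38$ and $\sigma = 6.5$; the function name `normal_pdf` and the comparison value $\sigma = 3$ are choices made here for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma)


mu, sigma = 38.0, 6.5

# The curve peaks at the mean, with height 1 / (sqrt(2*pi) * sigma).
print(round(normal_pdf(mu, mu, sigma), 4))  # 0.0614

# The curve is symmetric about the mean.
print(math.isclose(normal_pdf(mu - 5, mu, sigma), normal_pdf(mu + 5, mu, sigma)))  # True

# A smaller sigma (hypothetical comparison value) gives a taller, narrower curve.
print(normal_pdf(mu, mu, 3.0) > normal_pdf(mu, mu, sigma))  # True
```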
Practical implications
- Larger $\sigma$ indicates more spread; smaller $\sigma$ indicates data points are closer to the mean.
- Standard deviation is widely used to quantify uncertainty and variability in measurements.

Connecting the concepts: mean, spread, and distribution shape

Different data sets can share the same mean but differ in spread, affecting interpretation and risk/uncertainty assessments.
The Gaussian distribution is a central model because many natural processes approximate it due to the central limit theorem and the aggregation of many small, independent effects.
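The aggregation idea can be illustrated with a small simulation. This is only a sketch under assumptions made here: each "effect" is a uniform draw on [0, 1) (which is not bell-shaped on its own), 30 of them are averaged, and the experiment is repeated 5000 times with a fixed seed. Theory then predicts averages centered at 0.5 with standard deviation $\sqrt{1/12}/\sqrt{30} \approx 0.0527$.

```python
import math
import random
import statistics


def sample_means(n_terms, n_samples, rng):
    """Return n_samples averages, each taken over n_terms uniform [0, 1) draws."""
    return [
        statistics.fmean(rng.random() for _ in range(n_terms))
        for _ in range(n_samples)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
means = sample_means(n_terms=30, n_samples=5000, rng=rng)

center = statistics.fmean(means)
spread = statistics.stdev(means)
predicted_spread = math.sqrt(1 / 12) / math.sqrt(30)

# Fraction of averages within one standard deviation of the center;
# for a normal distribution this is about 0.68.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(round(predicted_spread, 4))  # 0.0527
print(center, spread, within_one_sd)
```

The printed center, spread, and fraction should land close to 0.5, 0.0527, and 0.68 respectively, even though the individual draws are uniform rather than Gaussian.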
Outliers influence both the mean and the standard deviation, but affect the median much less; reporting the mean and the median together gives a fuller picture.

Practical takeaways and examples

Given a problem with mean $\mu = 38\ \text{months}$ and standard deviation $\sigma = 6.5\ \text{months}$:
- The data are centered around 38 months with moderate spread of about 6.5 months.
- If you add high outliers, expect the mean to shift upward more than the median.
- If you care about fairness or robustness to extreme values (e.g., grading), consider the median as a complementary statistic.
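To put numbers on "moderate spread", the sketch below computes the one- and two-standard-deviation ranges around the mean. The percentages rely on an assumption not stated in the problem, namely that the data are approximately normal; they come from `statistics.NormalDist` in Python's standard library.

```python
from statistics import NormalDist

mu, sigma = 38.0, 6.5  # values given in the problem (months)

# Assumption: the data are approximately normally distributed.
model = NormalDist(mu, sigma)

one_sd = (mu - sigma, mu + sigma)
two_sd = (mu - 2 * sigma, mu + 2 * sigma)

within_one = model.cdf(one_sd[1]) - model.cdf(one_sd[0])
within_two = model.cdf(two_sd[1]) - model.cdf(two_sd[0])

print(one_sd, round(within_one, 4))  # (31.5, 44.5) 0.6827
print(two_sd, round(within_two, 4))  # (25.0, 51.0) 0.9545
```

Under that assumption, roughly 68% of values fall between 31.5 and 44.5 months and roughly 95% between 25 and 51 months. If the data are skewed or contain outliers, these percentages need not hold.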
Summary of core ideas
- Mean is the first moment and the center of mass; it balances deviations to zero:
 $\sum_{i=1}^n (x_i - \mu) = 0.$
- The standard deviation is the square root of the average squared deviation (the variance, i.e., the second moment about the mean), providing a measure of spread around the mean:
- Population: $\sigma = \sqrt{\frac{1}{n} \sum (x_i - \mu)^2}$
- Sample: $s = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2}$
- Outliers can heavily influence the mean (and to a degree, the standard deviation), while the median remains robust to such extremes.
- The bell curve (Gaussian) is the common reference shape for symmetric data around the mean, with spread governed by the standard deviation.

Notation summary (quick reference)

Data points: $x_i,\ i = 1, \dots, n$
Mean (population): $\mu = \frac{1}{n} \sum_{i=1}^n x_i$
Deviation: $d_i = x_i - \mu$
Sum of deviations: $\sum_{i=1}^n d_i = 0$
Squared deviations: $d_i^2 = (x_i - \mu)^2$
Variance (population): $\sigma^2 = \frac{1}{n} \sum_{i=1}^n (x_i - \mu)^2$
Standard deviation (population): $\sigma = \sqrt{\sigma^2}$
Sample mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i$
Sample variance: $s^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2$
Sample standard deviation: $s = \sqrt{s^2}$

Connections to prior topics and real-world relevance

This lecture ties to foundational ideas of statistics: central tendency (mean, median, mode) and dispersion (range, variance, standard deviation).
The physics analogy (center of mass and inertia) helps intuition for why the mean and squared deviations are used.
In real-world data analysis, choosing between mean vs median and considering standard deviation vs robust alternatives informs decision-making under uncertainty and when dealing with outliers or skewed data.