If $n$ is even, the median is the average of the two middle values.
Measures of Location - Population Mean and Median
Generally, the population mean and median will not be identical.
Skewness: If the population distribution is positively or negatively skewed, then:
mean ≠ median
When making inferences about a population, an important consideration is deciding which characteristic (mean or median) is more relevant.
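The gap between mean and median under skewness can be illustrated with a small sketch. The data below are hypothetical and form a sample rather than a full population; the variable names are illustrative only.

```python
from statistics import mean, median

# Hypothetical, positively skewed values: one large observation (250)
# pulls the mean upward but leaves the median where it was.
incomes = [30, 32, 35, 38, 40, 45, 250]

skewed_mean = mean(incomes)      # uses every value, so 250 has a large effect
skewed_median = median(incomes)  # middle order statistic (n is odd): 38

# With an even number of values, the median averages the two middle values.
even_sample = [30, 32, 35, 38]
even_median = median(even_sample)  # (32 + 35) / 2 = 33.5

print(skewed_mean, skewed_median, even_median)
```

Here the mean lands well above the median, which is the typical pattern for positive skew.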
Quartiles and Percentiles
The median divides the data set into two equal parts.
Quartiles: Divide the data set into four equal parts:
First quartile ($Q_1$): 25th percentile
Second quartile ($Q_2$): Median (50th percentile)
Third quartile ($Q_3$): 75th percentile
Percentiles: For finer measures, percentiles divide the data into 100 parts. E.g., the 99th percentile separates the highest 1% from the bottom 99%.
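As a rough illustration, quartiles and percentiles can be computed with Python's standard `statistics.quantiles`. Several interpolation conventions exist for these cut points; choosing `method="inclusive"` here is an assumption of this sketch, not something the notes prescribe, and the data are hypothetical.

```python
from statistics import quantiles

# Hypothetical data: the integers 1..101, already sorted.
data = list(range(1, 102))

# n=4 returns the three cut points Q1, Q2 (the median), and Q3.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

# n=100 returns 99 cut points; index p - 1 holds the p-th percentile.
percentiles = quantiles(data, n=100, method="inclusive")
p25 = percentiles[24]
p99 = percentiles[98]

print(q1, q2, q3, p99)
```

With this convention the 25th percentile coincides with $Q_1$, and only the largest value (101) lies above the 99th percentile.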
The Trimmed Mean
The trimmed mean excludes the $k$ smallest and $k$ largest order statistics and averages the remaining $n - 2k$ values, which reduces the impact of outliers.
Robustness: Trimmed means are not unduly affected by extreme values.
Example: Judges' scores in sports where extreme scores are discarded before calculation.
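A minimal sketch of a trimmed mean, assuming symmetric trimming (the $k$ smallest and $k$ largest values are dropped). The function name `trimmed_mean` and the judges' scores are hypothetical.

```python
def trimmed_mean(values, k):
    """Average the values left after dropping the k smallest and k largest."""
    n = len(values)
    if k < 0 or 2 * k >= n:
        raise ValueError("k must satisfy 0 <= k < n / 2")
    ordered = sorted(values)   # order statistics
    kept = ordered[k:n - k]    # the n - 2k middle values
    return sum(kept) / len(kept)

# Hypothetical judges' scores with one unusually low score (6.0).
scores = [9.5, 9.0, 9.5, 8.5, 9.0, 6.0, 9.5]

plain_mean = sum(scores) / len(scores)
trimmed = trimmed_mean(scores, k=1)  # drops 6.0 and one 9.5

print(plain_mean, trimmed)
```

Dropping one score from each end moves the result from roughly 8.71 to 9.1, closer to where most judges scored.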
Measures of Variability
Reporting a measure of center (mean or median) gives only partial information about a data set.
Samples can have the same central measures but different spreads.
Visual Representation: Dot plots may show varying extents of spread even with identical means and medians.
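The point that identical centers can hide different spreads can be checked with two hypothetical samples. This is only a sketch; the sample values are made up for illustration.

```python
from statistics import mean, median

# Two hypothetical samples with the same mean and the same median.
sample_a = [48, 49, 50, 51, 52]
sample_b = [10, 30, 50, 70, 90]

center_a = (mean(sample_a), median(sample_a))
center_b = (mean(sample_b), median(sample_b))

# Deviations from the mean show how far the values sit from the center.
deviations_a = [x - mean(sample_a) for x in sample_a]
deviations_b = [x - mean(sample_b) for x in sample_b]

print(center_a, center_b)
print(deviations_a)
print(deviations_b)
```

Both samples are centered at 50, yet the deviations in the second sample are twenty times larger.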
Descriptive Statistical Measures for Variability
Types of Measures:
Variance
Standard Deviation
Interquartile Range (IQR)
Range
Quartile Deviation
Measures of Variability - The Range
The range is the difference between the largest and smallest sample values:
$R = x_{\max} - x_{\min}$
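The range is adequate for small data sets, but it depends only on the two most extreme values and ignores how the remaining observations are distributed.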
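A short sketch of the range, with two hypothetical samples chosen to show its limitation: they share the same range even though their interior values are arranged very differently. The helper name `sample_range` is illustrative.

```python
def sample_range(values):
    """Difference between the largest and smallest sample values."""
    return max(values) - min(values)

# Hypothetical samples with identical extremes but different interiors.
clustered = [0, 49, 50, 51, 100]  # interior values sit near the center
dispersed = [0, 10, 50, 90, 100]  # interior values sit near the extremes

range_clustered = sample_range(clustered)
range_dispersed = sample_range(dispersed)

print(range_clustered, range_dispersed)
```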
Measures of Variability - The Interquartile Range
The Interquartile Range (IQR) is defined as:
$\text{IQR} = Q_3 - Q_1$
where $Q_3$ is the median of the upper half and $Q_1$ is the median of the lower half of the data set.
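The "median of each half" definition can be sketched as follows. For an odd number of observations, textbooks differ on whether the middle value belongs to both halves or to neither; this sketch assumes it is left out of both, and the example uses an even-sized hypothetical sample where the question does not arise. The function name `iqr_by_halves` is illustrative.

```python
from statistics import median

def iqr_by_halves(values):
    """Return (Q1, Q3, IQR) using the medians of the lower and upper halves.

    Assumption: for odd n, the middle value is excluded from both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]       # smallest n // 2 values
    upper = ordered[n - half:]   # largest n // 2 values
    q1 = median(lower)
    q3 = median(upper)
    return q1, q3, q3 - q1

# Hypothetical sample with n = 8.
sample = [1, 3, 4, 6, 7, 8, 10, 12]
q1, q3, iqr = iqr_by_halves(sample)

print(q1, q3, iqr)
```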
Variance and Standard Deviation
Population Variance ($\sigma^2$) and Sample Variance ($s^2$):
Population variance formula:
$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$
The standard deviation is the positive square root of the variance.
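A minimal sketch of both variances follows. The population formula is the one given above; the sample version assumes the usual $n - 1$ divisor, which these notes do not state explicitly. The function names and data are hypothetical.

```python
import math

def population_variance(values):
    """sigma^2 = (1/N) * sum((x_i - mu)^2), treating values as the population."""
    N = len(values)
    mu = sum(values) / N
    return sum((x - mu) ** 2 for x in values) / N

def sample_variance(values):
    """s^2 with the conventional n - 1 divisor (an assumption of this sketch)."""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values) / (n - 1)

# Hypothetical data with mean 5 and squared deviations summing to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

sigma2 = population_variance(data)  # 32 / 8
sigma = math.sqrt(sigma2)           # standard deviation
s2 = sample_variance(data)          # 32 / 7
s = math.sqrt(s2)

print(sigma2, sigma, s2, s)
```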
Variance with a Constant
If $y = cx + d$, where $c$ and $d$ are constants:
Sample Variance of $y$:
$s_y^2 = c^2 s_x^2$
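The shift $d$ drops out because $\bar{y} = c\bar{x} + d$, so each deviation $y_i - \bar{y}$ equals $c(x_i - \bar{x})$. A quick numerical check of the rule, using hypothetical data and constants, might look like this:

```python
import math
from statistics import variance  # sample variance (n - 1 divisor)

# Hypothetical sample and constants for the transformation y = c*x + d.
x = [1.0, 2.0, 4.0, 7.0]
c, d = 3.0, 10.0
y = [c * xi + d for xi in x]

s2_x = variance(x)
s2_y = variance(y)

# The shift d has no effect; the scale c enters squared.
rule_holds = math.isclose(s2_y, c ** 2 * s2_x)

print(s2_x, s2_y, rule_holds)
```

It follows that the sample standard deviation scales by $|c|$.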
Boxplots
A boxplot is based on measures that remain stable in the presence of a few outliers, specifically the median and a measure of spread known as the fourth spread.
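The stability claim can be illustrated with a rough sketch. It assumes the fourth spread is the upper fourth minus the lower fourth, where the fourths are the medians of the lower and upper halves and, for odd $n$, the median is included in both halves; that convention is an assumption here, as are the function name and data.

```python
from statistics import median, stdev

def fourth_spread(values):
    """Upper fourth minus lower fourth.

    Assumption: for odd n, the median is included in both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = (n + 1) // 2
    lower_fourth = median(ordered[:half])
    upper_fourth = median(ordered[n - half:])
    return upper_fourth - lower_fourth

# Hypothetical samples that differ only in the largest value.
clean = [10, 12, 13, 15, 16, 18, 20]
with_outlier = [10, 12, 13, 15, 16, 18, 200]

fs_clean = fourth_spread(clean)
fs_outlier = fourth_spread(with_outlier)

print(median(clean), median(with_outlier))  # both medians are 15
print(fs_clean, fs_outlier)                 # both fourth spreads are 4.5
print(stdev(clean), stdev(with_outlier))    # the standard deviation changes a lot
```

The median and fourth spread are unchanged by the outlier, while the sample standard deviation grows by more than a factor of ten.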