Chapter 3: Numerical Descriptive Measures
Learning Objectives (LOs)
LO 3.1: Calculate and interpret measures of location.
LO 3.2: Calculate and interpret measures of dispersion.
LO 3.3: Explain mean-variance analysis and the Sharpe ratio.
LO 3.4: Apply Chebyshev’s theorem, the empirical rule, and z-scores.
LO 3.5: Construct and interpret a boxplot.
LO 3.6: Calculate and interpret measures of association.
Introductory Case: Investment Decision
Scenario: Dorothy, a financial advisor, helps an inexperienced investor compare two mutual funds: Fidelity’s Growth Index (Growth) and Fidelity’s Value Index (Value), using their annual return data (e.g., -5.50% and -8.59% in 1984, 39.91% and 22.10% in 1985, 38.42% and 31.62% in 2019).
Tasks for Dorothy:
Calculate and interpret the typical return for each fund.
Calculate and interpret the investment risk for each fund.
Determine which fund offers a better return relative to its risk.
3.1 Measures of Location
Purpose: Attempt to find a typical or central value that describes a variable (e.g., typical investment return, typical graduate salary).
Key Measures Discussed: Mean, Median, Mode, Weighted Mean, Percentile.
Arithmetic Mean
Definition: The primary measure of central location, also called the mean or average. Calculated by summing all observations and dividing by the number of observations.
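Population Mean ($\mu$):
Formula: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$
$N$: number of observations in the population.
$x_i$: value of the $i$-th observation.
Sample Mean ($\bar{x}$):
Formula: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
$n$: number of observations in the sample.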
Example (Growth and Value Funds): Over 36 years, the mean return for Growth (15.755%) was greater than that for Value (12.005%).
Caution: Investing solely based on average return can be misleading, as risk is not considered.
Sensitivity to Outliers: The mean can be highly affected by extreme values (outliers).
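The sensitivity of the mean to outliers can be illustrated with a short Python sketch. It reuses the Acetech salary figures from the median example in this section; the function name `sample_mean` is an illustrative choice, not something taken from the source material.

```python
def sample_mean(values):
    """Return the arithmetic mean: the sum of the observations divided by their count."""
    if not values:
        raise ValueError("the mean is undefined for an empty data set")
    return sum(values) / len(values)


# Acetech salaries, as listed in the median example of this section.
salaries = [40_000, 40_000, 65_000, 90_000, 100_000, 145_000, 150_000, 550_000]

# Mean of all eight salaries.
mean_all = sample_mean(salaries)

# Drop the largest salary to see how far a single extreme value moves the mean.
mean_without_top = sample_mean(sorted(salaries)[:-1])

print(f"Mean with all 8 salaries:       {mean_all:,.0f}")
print(f"Mean without the top salary:    {mean_without_top:,.0f}")
```

Removing the single 550,000 salary lowers the mean from 147,500 to 90,000, which is why the mean alone can misrepresent a typical value when outliers are present.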
Median
Definition: The middle observation in a data set when arranged in ascending order. It divides the data into two equal halves.
Calculation:
Arrange observations in ascending order.
If $N$ (or $n$) is odd, the median is the middle observation.
If $N$ (or $n$) is even, the median is the average of the two middle values.
Usefulness: Particularly valuable when outliers are present, as it is less sensitive to them than the mean.
Practical Application: Often preferred for money-related variables (income, wealth, house prices) because these distributions tend to be skewed by high-value outliers.
Example (Acetech Salaries):
Salaries (arranged in ascending order): 40,000, 40,000, 65,000, 90,000, 100,000, 145,000, 150,000, 550,000.
$n = 8$ (even), so the median is the average of the 4th and 5th values: $(90{,}000 + 100{,}000)/2 = 95{,}000$.
This reflects a typical salary better than the mean (147,500) or the mode (40,000).
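The median calculation steps can be sketched in Python as follows. This is a minimal illustration applied to the Acetech salaries; the function name `median` is an assumption made for this example.

```python
def median(values):
    """Return the median by sorting, then taking the middle value(s)."""
    if not values:
        raise ValueError("the median is undefined for an empty data set")
    ordered = sorted(values)          # Step 1: arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # Odd count: the single middle observation
        return ordered[mid]
    # Even count: average of the two middle observations
    return (ordered[mid - 1] + ordered[mid]) / 2


salaries = [40_000, 40_000, 65_000, 90_000, 100_000, 145_000, 150_000, 550_000]
acetech_median = median(salaries)
print(f"Median salary: {acetech_median:,.0f}")
```

With eight observations, the function averages the 4th and 5th sorted values (90,000 and 100,000).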
Mode
Definition: The observation that occurs most frequently in a data set.
Types:
Unimodal: One mode.
Multimodal: Two or more modes.
A data set can also have no mode (if all observations occur with the same frequency).
Limitations: Its usefulness diminishes with more than three modes. It doesn't always reflect the center of the variable, especially if the most frequent observation is an extreme value.
Example (Acetech Salaries): The modal salary is 40,000 (occurs twice).
Categorical Variables: The mode is the only meaningful measure of central location for categorical variables (e.g., the modal size is L for sweatshirt sizes S, L, L, M, S, L, M, L, L, M).
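Because the mode only requires counting, it works for both numerical and categorical data. The sketch below is one possible way to find it in Python; the function name `modes` and the choice to return an empty list when every observation occurs equally often are assumptions made for this illustration.

```python
from collections import Counter


def modes(values):
    """Return a list of the most frequent observations.

    An empty list means "no mode": several distinct values that all
    occur with the same frequency.
    """
    counts = Counter(values)
    if not counts:
        return []
    # All distinct values share one frequency, so no value stands out.
    if len(counts) > 1 and len(set(counts.values())) == 1:
        return []
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


sweatshirt_sizes = ["S", "L", "L", "M", "S", "L", "M", "L", "L", "M"]
salaries = [40_000, 40_000, 65_000, 90_000, 100_000, 145_000, 150_000, 550_000]

size_modes = modes(sweatshirt_sizes)   # categorical data
salary_modes = modes(salaries)         # numerical data
print(size_modes, salary_modes)
```

A result with one element corresponds to a unimodal data set, and a result with two or more elements to a multimodal one.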
Measures of Central Location for Growth and Value Funds
Growth: Mean (15.755%), Median (15.245%), Mode (#N/A - no observation appears more than once).
Value: Mean (12.005%), Median (15.38%), Mode (#N/A).
Excel Formulas:
Mean: =AVERAGE(range)
Median: =MEDIAN(range)
Mode: =MODE(range) (or =MODE.SNGL(range) for a single mode, or =MODE.MULT(range) for multiple modes).
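For readers working outside Excel, Python's standard `statistics` module offers rough counterparts to these functions. The sketch below applies them to the Acetech salaries; treating each call as an equivalent of the Excel formula is an assumption, since the tools differ in edge cases.

```python
import statistics

salaries = [40_000, 40_000, 65_000, 90_000, 100_000, 145_000, 150_000, 550_000]

mean_salary = statistics.mean(salaries)         # rough counterpart of =AVERAGE(range)
median_salary = statistics.median(salaries)     # rough counterpart of =MEDIAN(range)
single_mode = statistics.mode(salaries)         # rough counterpart of =MODE.SNGL(range)
all_modes = statistics.multimode(salaries)      # rough counterpart of =MODE.MULT(range)

print(mean_salary, median_salary, single_mode, all_modes)
```

One difference to keep in mind: when no observation repeats, Excel reports #N/A for the mode, whereas `statistics.multimode` returns every value in the data set and `statistics.mode` (Python 3.8 and later) returns the first value encountered.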
Excel Data Analysis Toolpak: Provides