Numerical Measures in Business Statistics

Outline:
- What are measures?
- Measures of Data Center and Location
- Measures of Data Variation

Learning Outcomes: Students will be able to:
- Compute and describe data using the mean, median, mode, and weighted mean.
- Compute and describe data using range, interquartile range, variance, and standard deviation.
- Compute and use z-score and the coefficient of variation to describe data.
- Apply the Empirical Rule and Tchebysheff's theorem.

Measure:
- A quantitative value that describes a particular characteristic of a dataset.
- Useful for summarizing and interpreting data.
- Allows complex datasets to be distilled into a single useful number.

Parameter:
- A measure computed from the entire population.
- This quantitative value is constant if the population does not change.
- Usually denoted with a Greek character (e.g., $\mu$ for population mean, $\sigma$ for population standard deviation).

Statistic:
- A measure computed from a sample of a population.
- Will vary based on the specific sample taken.
- Usually denoted with a Roman character (e.g., $\bar{x}$ for sample mean, $s$ for sample standard deviation).
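
The contrast between a parameter and a statistic can be sketched in a few lines of Python. This is only an illustration: the `population` list is hypothetical data treated as the entire population, and the sample size of 5 and the seed are arbitrary assumptions.

```python
import random

# Hypothetical data, treated here as the ENTIRE population (an assumption).
population = [3, 3, 4, 5, 7, 7, 7, 9, 10, 10, 11, 11, 13, 13, 15]

# Parameter: computed from every member of the population, so it is fixed
# as long as the population itself does not change.
mu = sum(population) / len(population)

# Statistic: computed from a sample, so it depends on which members are drawn.
rng = random.Random(42)  # arbitrary seed, only to make the run repeatable
sample_means = []
for _ in range(3):
    sample = rng.sample(population, 5)  # draw 5 values without replacement
    sample_means.append(sum(sample) / len(sample))

print("population mean (parameter):", round(mu, 2))
print("sample means (statistics):", [round(m, 2) for m in sample_means])
```

Rerunning with a different seed leaves `mu` unchanged, while the sample means will generally differ from one draw to the next.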

Population Mean:
- A parameter.
Formula: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$
- $\mu$ : Population mean (pronounced "mu")
- $N$ : Population size
- $x_i$ : $i^{th}$ individual value of the variable
Example #1: Sales prices: $\$9,000, \ \$10,000, \ \$21,000$ . For $N=3$ .
- $\mu = \frac{\$9,000 + \$10,000 + \$21,000}{3} = \frac{\$40,000}{3} \approx \$13,333.33$
Example #2: Sales prices: $\$8,000, \ \$9,000, \ \$10,000, \ \$41,000$ . For $N=4$ .
- $\mu = \frac{\$8,000 + \$9,000 + \$10,000 + \$41,000}{4} = \frac{\$68,000}{4} = \$17,000$

Sample Mean:
- A statistic.
Formula: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
- $\bar{x}$ : Sample mean (pronounced "x-bar")
- $n$ : Sample size
- $x_i$ : $i^{th}$ individual value of the variable
Example #1: Cars sold daily: $28, 12, 6, 4, 5, 15, 10$ . For $n=7$ .
- $\bar{x} = \frac{28+12+6+4+5+15+10}{7} = \frac{80}{7} \approx 11.43$ cars
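
The population and sample means use the same arithmetic: add the values and divide by the count. A minimal Python sketch that reproduces the worked examples follows; the function name `mean` and the variable names are illustrative choices, not part of the original notes.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)

# Population examples (N = 3 and N = 4): sales prices in dollars
prices_1 = [9_000, 10_000, 21_000]
prices_2 = [8_000, 9_000, 10_000, 41_000]

# Sample example (n = 7): cars sold daily
cars_sold = [28, 12, 6, 4, 5, 15, 10]

mu_1 = mean(prices_1)
mu_2 = mean(prices_2)
x_bar = mean(cars_sold)

print(round(mu_1, 2))   # 13333.33
print(mu_2)             # 17000.0
print(round(x_bar, 2))  # 11.43
```

Only the interpretation differs: the result is a parameter when the list holds the whole population and a statistic when it holds a sample.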

Median:
- Another measure of center, less affected by outliers than the mean.
- Divides a sorted data array into two equal halves.
- In Excel: =MEDIAN(range).
To Compute the Median:
1. Sort the data in ascending order.
2. Calculate the median's index ( $i$ ).
  - For population data (size $N$ ): $i = \frac{1}{2}N$
  - For sample data (size $n$ ): $i = \frac{1}{2}n$
3. If index $i$ is not an integer: Round up to the next integer. The median is the value at this rounded index position.
4. If index $i$ is an integer: The median is the average of the values in index positions $i$ and $i+1$ .
Example #1 (Odd N): Prices: $\$9,000, \ \$10,000, \ \$21,000$ . ( $N=3$ )
1. Sorted: $\$9,000, \ \$10,000, \ \$21,000$
2. Index: $i = \frac{1}{2}(3) = 1.5$ . Round up to $2$ .
3. Median is the $2^{nd}$ value: $\$10,000$ .
Example #2 (Odd N): Cars sold: $4, 5, 6, 10, 12, 15, 28$ . (Sorted, $n=7$ )
1. Index: $i = \frac{1}{2}(7) = 3.5$ . Round up to $4$ .
2. Median is the $4^{th}$ value: $10$ cars.
Example #3 (Even N): Cars sold: $4, 5, 6, 10, 12, 15, 28, 500$ . (Sorted, $n=8$ )
1. Index: $i = \frac{1}{2}(8) = 4$ . (Integer)
2. Median is the average of values at positions $4$ and $5$ .
3. Values: $10$ and $12$ . Median $= \frac{10+12}{2} = 11$ cars.
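
The index procedure above can be sketched in Python as follows. The function name `median_by_index` is a hypothetical helper; the cross-check uses `statistics.median` from the standard library, which should agree with it for these inputs.

```python
import math
from statistics import median as stdlib_median

def median_by_index(values):
    """Median via the index rule: i = n/2, round up if i is not an integer."""
    if not values:
        raise ValueError("median requires at least one value")
    data = sorted(values)              # step 1: sort ascending
    i = len(data) / 2                  # step 2: the median's index
    if i != int(i):                    # step 3: non-integer index (odd count)
        position = math.ceil(i)        # round up to the next integer
        return data[position - 1]      # positions are 1-based, lists 0-based
    position = int(i)                  # step 4: integer index (even count)
    return (data[position - 1] + data[position]) / 2  # average of i and i+1

prices = [9_000, 10_000, 21_000]
cars = [28, 12, 6, 4, 5, 15, 10]
cars_with_outlier = cars + [500]

print(median_by_index(prices))             # 10000
print(median_by_index(cars))               # 10
print(median_by_index(cars_with_outlier))  # 11.0

# Cross-check against the standard library
assert median_by_index(cars_with_outlier) == stdlib_median(cars_with_outlier)
```

Note how adding the extreme value 500 moves the median only from 10 to 11.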

Comparing the mean and median can reveal insights about the dataset's distribution, especially regarding extreme values:
- Mean > Median: The dataset is skewed to the right (positive skew), indicating the presence of high extreme values.
- Mean < Median: The dataset is skewed to the left (negative skew), indicating the presence of low extreme values.
- Mean = Median: The dataset is likely symmetric, with values spread evenly on both sides of the center.
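
The comparison can be sketched as a small Python helper. Treat it as a rule of thumb rather than a formal skewness test; the name `skew_hint` and the last two datasets are hypothetical, chosen only to trigger each case.

```python
from statistics import mean, median

def skew_hint(values):
    """Compare mean and median and report the direction of skew they suggest."""
    m, med = mean(values), median(values)
    if m > med:
        return "right-skewed (mean > median)"
    if m < med:
        return "left-skewed (mean < median)"
    return "roughly symmetric (mean = median)"

# Cars-sold data with the extreme value 500: mean 72.5 vs. median 11
cars_with_outlier = [4, 5, 6, 10, 12, 15, 28, 500]
print(skew_hint(cars_with_outlier))  # right-skewed (mean > median)

# Hypothetical datasets for the other two cases
print(skew_hint([1, 10, 11, 12]))    # left-skewed (mean < median)
print(skew_hint([1, 2, 3]))          # roughly symmetric (mean = median)
```

With real data the two values are rarely exactly equal, so a practical version would likely compare them within a tolerance.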

Mode:
- Another measure of central location, though less common than the mean or median.
- The most frequent value(s) in the dataset.
- A dataset can have multiple modes (bimodal, multimodal) or no mode.
Note: The mode need not lie near the center of the dataset, so interpret it with care.
Example #1 (Single Mode): Daily car sales: $3, 3, 4, 5, 7, 7, 7, 9, 10, 10, 11, 11, 13, 13, 15$
- The value $7$ appears 3 times, which is more than any other value. Mode = $7$ .
Example #2 (Multiple Modes): Song plays: $0, 0, 1, 1, 1, 2, 3, 5, 5, 5$
- The values $1$ and $5$ both appear 3 times. Modes = $1, 5$ .
Center Considerations:
- Datasets with no value(s) occurring more frequently than others have no mode.
- The mode may not necessarily be close to the mean or median (e.g., for dataset $\{3, 3, 3, 3, 3, 8, 9, 12, 16, 16, 16\}$ , the mode is 3, the median is 8, and the mean is approximately 8.4).
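
A minimal Python sketch of finding the mode(s) is shown below. The helper `modes` is hypothetical, and it assumes the convention stated above: when several distinct values all occur equally often, the dataset is reported as having no mode (an empty list).

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s), or [] when no value stands out."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # Assumption: if every distinct value is equally frequent, there is no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)

car_sales = [3, 3, 4, 5, 7, 7, 7, 9, 10, 10, 11, 11, 13, 13, 15]
song_plays = [0, 0, 1, 1, 1, 2, 3, 5, 5, 5]

print(modes(car_sales))   # [7]
print(modes(song_plays))  # [1, 5]
print(modes([1, 2, 3]))   # []
print(modes([3, 3, 3, 3, 3, 8, 9, 12, 16, 16, 16]))  # [3]
```

The standard library's `statistics.multimode` behaves differently in the no-mode case: it returns every value when all are equally frequent, which is why the sketch handles that case explicitly.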

Weighted Mean:
- Used when some data values are more important or occur more frequently than others.
- Also known as the weighted average.

Formula (population): $\mu_w = \frac{\sum w_i x_i}{\sum w_i}$
- $w_i$ : The weight of the $i^{th}$ data value
- $x_i$ : The $i^{th}$ data value

Formula (sample): $\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i}$
- $w_i$ : The weight of the $i^{th}$ data value
- $x_i$ : The $i^{th}$ data value
Example #1 (Income):
- Worker 1: 50 days, $\$1000$ . $w_1 = 50, x_1 = 1000$
- Worker 2: 20 days, $\$500$ . $w_2 = 20, x_2 = 500$
- Worker 3: 30 days, $\$800$ . $w_3 = 30, x_3 = 800$
- $\bar{x}_w = \frac{(50 \times \$1000) + (20 \times \$500) + (30 \times \$800)}{50+20+30} = \frac{\$50,000 + \$10,000 + \$24,000}{100} = \frac{\$84,000}{100} = \$840$
Example #2 (Antique Store):
- Cust 1: 5 items, $\$100$ . $w_1 = 5, x_1 = 100$
- Cust 2: 3 items, $\$50$ . $w_2 = 3, x_2 = 50$
- Cust 3: 4 items, $\$80$ . $w_3 = 4, x_3 = 80$
- $\bar{x}_w = \frac{(5 \times \$100) + (3 \times \$50) + (4 \times \$80)}{5+3+4} = \frac{\$500 + \$150 + \$320}{12} = \frac{\$970}{12} \approx \$80.83$
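
The weighted mean formula translates directly into code. The sketch below reproduces both examples; the function name `weighted_mean` and its argument order are illustrative assumptions.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight

# Example #1 (Income): weights are days, values are dollar amounts
income = weighted_mean(values=[1000, 500, 800], weights=[50, 20, 30])

# Example #2 (Antique Store): weights are item counts, values are dollar amounts
antiques = weighted_mean(values=[100, 50, 80], weights=[5, 3, 4])

print(income)              # 840.0
print(round(antiques, 2))  # 80.83
```

When every weight is equal, the result reduces to the ordinary mean, which is a quick sanity check for any implementation.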