Numerical Measures in Business Statistics

Outline and Learning Outcomes

  • Outline:

    • What are measures?

    • Measures of Data Center and Location

    • Measures of Data Variation

  • Learning Outcomes: Students will be able to:

    • Compute and describe data using the mean, median, mode, and weighted mean.

    • Compute and describe data using range, interquartile range, variance, and standard deviation.

    • Compute and use z-score and the coefficient of variation to describe data.

    • Apply the Empirical Rule and Tchebysheff's theorem.

Measures of Center and Location

  • Measure:

    • A quantitative value that describes a particular characteristic of a dataset.

    • Useful for summarizing and interpreting data.

    • Allows complex datasets to be distilled into a single useful number.

Key Terms: Parameter and Statistic
  • Parameter:

    • A measure computed from the entire population.

    • This quantitative value is constant if the population does not change.

    • Usually denoted with a Greek character (e.g., $\mu$ for population mean, $\sigma$ for population standard deviation).

  • Statistic:

    • A measure computed from a sample of a population.

    • Will vary based on the specific sample taken.

    • Usually denoted with a Roman character (e.g., $\bar{x}$ for sample mean, $s$ for sample standard deviation).
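
A minimal Python sketch of the distinction, assuming a hypothetical population of four sales prices. The population mean is fixed, while the mean of each possible two-item sample depends on which items are drawn. The data and variable names are illustrative assumptions, not part of the course material.

```python
from itertools import combinations
from statistics import fmean

# Hypothetical population of four sales prices (illustrative data).
population = [8000, 9000, 10000, 41000]

# Parameter: computed from every member of the population, so it is fixed.
mu = fmean(population)

# Statistic: computed from a sample, so it depends on which sample is drawn.
# Here every possible sample of size 2 is enumerated.
sample_means = [fmean(sample) for sample in combinations(population, 2)]

print(f"population mean: {mu}")
print(f"sample means (n = 2): {sample_means}")
```

Enumerating every sample rather than drawing one at random keeps the sketch deterministic; in practice only one sample is usually observed.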

Mean (Average)
  • The most commonly used measure of central tendency.

  • Helps to describe the center of a dataset.

  • In Excel: =AVERAGE(range).

Population Mean ($\mu$)
  • A parameter.

  • Formula: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$

    • $\mu$: Population mean (pronounced "mu")

    • $N$: Population size

    • $x_i$: $i^{th}$ individual value of the variable

  • Example #1: Sales prices: $\$9,000, \ \$10,000, \ \$21,000$, with $N = 3$.

    • $\mu = \frac{\$9,000 + \$10,000 + \$21,000}{3} = \frac{\$40,000}{3} \approx \$13,333.33$

  • Example #2: Sales prices: $\$8,000, \ \$9,000, \ \$10,000, \ \$41,000$, with $N = 4$.

    • $\mu = \frac{\$8,000 + \$9,000 + \$10,000 + \$41,000}{4} = \frac{\$68,000}{4} = \$17,000$

Sample Mean ($\bar{x}$)
  • A statistic.

  • Formula: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

    • $\bar{x}$: Sample mean (pronounced "x-bar")

    • $n$: Sample size

    • $x_i$: $i^{th}$ individual value of the variable

  • Example #1: Cars sold daily: $28, 12, 6, 4, 5, 15, 10$, with $n = 7$.

    • $\bar{x} = \frac{28+12+6+4+5+15+10}{7} = \frac{80}{7} \approx 11.43$ cars
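
The mean calculations can be checked with a short Python sketch. The same arithmetic serves both the population mean and the sample mean; only the interpretation of the data differs. The function and variable names are illustrative assumptions.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


# Population examples (sales prices in dollars).
prices_1 = [9000, 10000, 21000]
prices_2 = [8000, 9000, 10000, 41000]

# Sample example (cars sold daily).
cars_sold = [28, 12, 6, 4, 5, 15, 10]

print(round(mean(prices_1), 2))   # 13333.33
print(mean(prices_2))             # 17000.0
print(round(mean(cars_sold), 2))  # 11.43
```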

Median
  • Another center measure, less impacted by outliers than the mean.

  • Divides a data array into two equal halves.

  • In Excel: =MEDIAN(range).

  • To Compute the Median:

    1. Sort the data in ascending order.

    2. Calculate the median's index ($i$).

      • For population data (size $N$): $i = \frac{1}{2}N$

      • For sample data (size $n$): $i = \frac{1}{2}n$

    3. If index $i$ is not an integer: Round up to the next integer. The median is the value at this rounded index position.

    4. If index $i$ is an integer: The median is the average of the values in index positions $i$ and $i+1$.

  • Example #1 (Odd N): Prices: $\$9,000, \ \$10,000, \ \$21,000$. ($N = 3$)

    1. Sorted: $\$9,000, \ \$10,000, \ \$21,000$

    2. Index: $i = \frac{1}{2}(3) = 1.5$. Round up to $2$.

    3. Median is the $2^{nd}$ value: $\$10,000$.

  • Example #2 (Odd n): Cars sold: $4, 5, 6, 10, 12, 15, 28$. (Sorted, $n = 7$)

    1. Index: $i = \frac{1}{2}(7) = 3.5$. Round up to $4$.

    2. Median is the $4^{th}$ value: $10$ cars.

  • Example #3 (Even n): Cars sold: $4, 5, 6, 10, 12, 15, 28, 500$. (Sorted, $n = 8$)

    1. Index: $i = \frac{1}{2}(8) = 4$. (Integer)

    2. Median is the average of values at positions $4$ and $5$.

    3. Values: $10$ and $12$. Median $= \frac{10+12}{2} = 11$ cars.
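
The index procedure above can be sketched in Python as follows. The function name is an illustrative assumption; positions in the notes are 1-based, while Python lists are 0-based, so each position is shifted by one when indexing.

```python
import math


def median_by_index(values):
    """Median via the index rule: i = n / 2 applied to the sorted data."""
    if not values:
        raise ValueError("median requires at least one value")
    data = sorted(values)  # Step 1: sort in ascending order
    n = len(data)
    i = n / 2              # Step 2: median index (1-based position)
    if n % 2 == 1:
        # Step 3: i is not an integer, so round up and take that position.
        position = math.ceil(i)
        return data[position - 1]  # 1-based position -> 0-based index
    # Step 4: i is an integer, so average the values at positions i and i + 1.
    position = n // 2
    return (data[position - 1] + data[position]) / 2


print(median_by_index([9000, 10000, 21000]))              # 10000
print(median_by_index([28, 12, 6, 4, 5, 15, 10]))         # 10
print(median_by_index([4, 5, 6, 10, 12, 15, 28, 500]))    # 11.0
```

Note that adding the extreme value 500 moves the median only from 10 to 11, which illustrates why the median is less sensitive to outliers than the mean.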

Using Mean and Median Together
  • Comparing the mean and median can reveal insights about the dataset's distribution, especially regarding extreme values:

    • Mean > Median: The dataset is likely skewed to the right (positive skew), which suggests the presence of high extreme values.

    • Mean < Median: The dataset is likely skewed to the left (negative skew), which suggests the presence of low extreme values.

    • Mean = Median: The dataset is likely roughly symmetric.
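
A minimal Python sketch of this comparison, using the car sales data that includes the extreme value 500. The function name and the returned labels are illustrative assumptions, and the result is only a hint about shape, not a formal test of skewness.

```python
import math
from statistics import mean, median


def skew_hint(values):
    """Compare mean and median to suggest the direction of skew."""
    m, med = mean(values), median(values)
    if math.isclose(m, med):
        return "roughly symmetric"
    if m > med:
        return "right (positive) skew"
    return "left (negative) skew"


cars_sold = [4, 5, 6, 10, 12, 15, 28, 500]
print(mean(cars_sold), median(cars_sold))  # 72.5 11.0
print(skew_hint(cars_sold))                # right (positive) skew
```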

Mode
  • Another measure of central location, though not as common as mean or median.

  • The value(s) that occur most frequently in the dataset.

  • A dataset can have multiple modes (bimodal, multimodal) or no mode.

  • Note: The mode need not lie near the center of the dataset, so interpret it with care.

  • Example #1 (Single Mode): Daily car sales: $3, 3, 4, 5, 7, 7, 7, 9, 10, 10, 11, 11, 13, 13, 15$

    • The value $7$ appears 3 times, which is more than any other value. Mode = $7$.

  • Example #2 (Multiple Modes): Song plays: $0, 0, 1, 1, 1, 2, 3, 5, 5, 5$

    • The values $1$ and $5$ both appear 3 times. Modes = $1, 5$.

  • Center Considerations:

    • Datasets with no value(s) occurring more frequently than others have no mode.

    • The mode may not be close to the mean or median (e.g., for dataset $\{3, 3, 3, 3, 3, 8, 9, 12, 16, 16, 16\}$, the mode is 3, the median is 8, and the mean is about 8.4).
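
The mode rules above can be sketched in Python with a frequency count. The function name is an illustrative assumption; following the notes, a dataset in which no value occurs more often than the others is treated as having no mode (an empty list here).

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] when there is no mode."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    # If several distinct values all occur equally often, none stands out.
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == top)


car_sales = [3, 3, 4, 5, 7, 7, 7, 9, 10, 10, 11, 11, 13, 13, 15]
song_plays = [0, 0, 1, 1, 1, 2, 3, 5, 5, 5]

print(modes(car_sales))   # [7]
print(modes(song_plays))  # [1, 5]
print(modes([4, 8, 15]))  # []
```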

Other Location Measures

Weighted Mean
  • Used when some data values are more important or occur more frequently than others.

  • Also known as weighted average.

Weighted Mean for a Population ($\mu_w$)
  • Formula: $\mu_w = \frac{\sum w_i x_i}{\sum w_i}$

    • $w_i$: The weight of the $i^{th}$ data value

    • $x_i$: The $i^{th}$ data value

Weighted Mean for a Sample ($\bar{x}_w$)
  • Formula: $\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i}$

    • $w_i$: The weight of the $i^{th}$ data value

    • $x_i$: The $i^{th}$ data value

  • Example #1 (Income):

    • Worker 1: 50 days, $\$1000$. $w_1 = 50, x_1 = 1000$

    • Worker 2: 20 days, $\$500$. $w_2 = 20, x_2 = 500$

    • Worker 3: 30 days, $\$800$. $w_3 = 30, x_3 = 800$

    • $\bar{x}_w = \frac{(50 \times \$1000) + (20 \times \$500) + (30 \times \$800)}{50+20+30} = \frac{\$50,000 + \$10,000 + \$24,000}{100} = \frac{\$84,000}{100} = \$840$

  • Example #2 (Antique Store):

    • Cust 1: 5 items, $\$100$. $w_1 = 5, x_1 = 100$

    • Cust 2: 3 items, $\$50$. $w_2 = 3, x_2 = 50$

    • Cust 3: 4 items, $\$80$. $w_3 = 4, x_3 = 80$

    • $\bar{x}_w = \frac{(5 \times \$100) + (3 \times \$50) + (4 \times \$80)}{5+3+4} = \frac{\$500 + \$150 + \$320}{12} = \frac{\$970}{12} \approx \$80.83$
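
The weighted mean formula can be checked with a short Python sketch. The function name and argument order are illustrative assumptions; the data come from the two examples above.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of w_i * x_i divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Example #1 (Income): weights are days, values are dollar amounts.
print(weighted_mean([1000, 500, 800], [50, 20, 30]))        # 840.0

# Example #2 (Antique Store): weights are item counts.
print(round(weighted_mean([100, 50, 80], [5, 3, 4]), 2))    # 80.83
```

With equal weights the function reduces to the ordinary mean, which is a quick sanity check on any weighted calculation.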

Percentiles
  • Divide ordered data into 100 equal-percentage bins, so the $p^{th}$ percentile marks the value at or below which roughly $p\%$ of the data fall.

  • Can answer questions like: