Measures of Central Tendency Study Notes

Measures of Central Tendency

Learning Objectives

  • By the end of this chapter, you will be able to:
    1. Explain the purposes of measures of central tendency and interpret the information they convey.
    2. Calculate, explain, and compare and contrast the mode, median, and mean.
    3. Explain the mathematical characteristics of the mean.
    4. Select an appropriate measure of central tendency according to the level of measurement and skew.

Introduction to Measures of Central Tendency

  • Frequency distributions, graphs, and charts summarize the overall shape of a distribution of scores efficiently.
  • Detailed information is often needed about the distribution:
    • Typical or average case statistics (e.g., "The average starting salary for social workers is $39,000 per year").
    • Variety or heterogeneity in the distribution (e.g., starting salaries range from $31,000 to $42,000).
  • The first type pertains to measures of central tendency (mode, median, mean), while the second pertains to measures of dispersion (covered in Chapter 4).
  • These measures reduce data to a single, comprehensible number.
  • Although they share a purpose, they differ significantly in definitions and derived values under different conditions.
  • Appropriate selection depends on both measurement level and research purpose.

The Mode

  • Definition: The mode is the value that occurs most frequently in a distribution.
    • Example: In the scores 58, 82, 82, 90, 98, the mode is 82.
  • Usefulness:
    • Especially relevant for nominal-level variables as it is the only measure of central tendency applicable at this level.
    • Example: The mode of religious affiliations in Table 3.1 is Protestant, which is the single largest category in a fictitious sample of 242 respondents.
Limitations of the Mode
  1. Distributions without a mode or with too many modes can diminish its usefulness.
    • Example: A set of test scores with modes at 55, 66, 78, 82, 90, and 97 has no single typical value, so reporting the mode tells the reader little.
  2. The modal score may not represent the center of the distribution effectively.
    • Example: If the mode is 93 but most scores are concentrated well below it, the mode overstates overall performance.
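
The definition and the first limitation above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `modes` is hypothetical, and it returns every value tied for the highest frequency so that distributions with several modes are visible.

```python
from collections import Counter


def modes(scores):
    """Return every value tied for the highest frequency, in sorted order."""
    counts = Counter(scores)
    if not counts:
        return []
    top = max(counts.values())
    return sorted(value for value, n in counts.items() if n == top)


# One clear mode: 82 occurs twice, every other score once.
print(modes([58, 82, 82, 90, 98]))

# When every score occurs equally often, every score is tied for the mode,
# which is why the mode is of little use for such a distribution.
print(modes([55, 66, 78, 82, 90, 97]))
```

Because the function only counts occurrences, it works equally well on category labels, which is why the mode is usable at the nominal level.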

The Median

  • Definition: The median (Md) represents the middle score in a distribution, where half of the cases are above and half are below it.
    • Example: With family incomes, if the median is $45,000, half of families earn more and half earn less.
  • Calculation Method:
    1. Arrange scores in order.
    2. For odd N, identify the middle case directly.
    3. For even N, average the two middle scores.
  • Illustration:
    • Odd N: for grades of 93, 87, 80, 75, 61 (already in order), the middle case is the third of five, so the median is 80.
    • Even N: for the eight scores 10, 10, 8, 7, 6, 5, 4, 1, the two middle cases are 7 and 6, so the median is (7 + 6)/2 = 6.5.
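
The calculation method above can be sketched as a short Python function. The name `median_of` is hypothetical; the logic simply follows the three steps (order the scores, take the middle case for odd N, average the two middle cases for even N).

```python
def median_of(scores):
    """Median by the ordered-cases method: middle case, or mean of the two middle cases."""
    ordered = sorted(scores)          # step 1: arrange scores in order
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty set of scores is undefined")
    mid = n // 2
    if n % 2 == 1:                    # step 2: odd N, take the middle case
        return ordered[mid]
    # step 3: even N, average the two middle cases
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([93, 87, 80, 75, 61]))         # five cases
print(median_of([10, 10, 8, 7, 6, 5, 4, 1]))   # eight cases
```
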
Limitations of the Median
  • Cannot be computed for nominal data as scores cannot be ranked.
  • Requires scores that can be ranked, so it applies to ordinal and interval-ratio data; it is the preferred measure at the ordinal level.

Measures of Position: Percentiles, Deciles, Quartiles

  • Median is also a positional statistic. Other measures of position are:
    • Percentiles: Points below which a specific percentage of cases fall (e.g., 46th percentile indicates that 46% scored lower).
    • Deciles: Divide distribution into tenths (e.g., first decile is 10th percentile).
    • Quartiles: Divide into quarters (first quartile is 25th percentile, second 50th, and third 75th).
Finding Percentiles
  • Method:
    1. Arrange scores in order.
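    2. Multiply N by the percentile expressed as a proportion (e.g., 0.46 for the 46th percentile).
    3. If the result is not a whole number, round up; the score of the case at that position, counting up from the lowest score, marks the percentile.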
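
A minimal Python sketch of this method follows. The function name `percentile_score` and the twelve sample scores are hypothetical, chosen only for illustration. The sketch assumes cases are counted up from the lowest score.

```python
import math


def percentile_score(scores, percentile):
    """Score of the case that marks a percentile (1 to 100), by the rank method."""
    if not scores:
        raise ValueError("no scores given")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be greater than 0 and at most 100")
    ordered = sorted(scores)                                   # lowest to highest
    case_number = math.ceil(len(ordered) * percentile / 100)   # N x proportion, rounded up
    return ordered[case_number - 1]                            # case numbers start at 1


# Hypothetical data: N = 12 scores.
sample = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35, 38, 40]

first_quartile = percentile_score(sample, 25)   # 12 * 0.25 = 3, so the 3rd case
p46 = percentile_score(sample, 46)              # 12 * 0.46 = 5.52, rounded up to the 6th case
third_quartile = percentile_score(sample, 75)   # 12 * 0.75 = 9, so the 9th case
print(first_quartile, p46, third_quartile)
```

Note that this rank method always returns an actual score from the data, so for even N its 50th percentile can differ slightly from a median computed by averaging the two middle cases.
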
The Mean
  • Definition: The mean ($\bar{X}$), or arithmetic average, is the sum of the scores divided by the number of scores (N).

    • Example: The test scores of 10 clients at a clinic sum to 109, so the mean is 10.9:

    $\bar{X} = \frac{\sum X_i}{N} = \frac{109}{10} = 10.9$

  • Characterization of Mean:

    • Balances all scores, acting as a fulcrum where the sums of deviations equal zero:

    $\sum (X_i - \bar{X}) = 0$

    • Minimizes variation, following the least squares principle:

    $\sum (X_i - \bar{X})^2 = \text{minimum}$

    • Affected by every score, including extreme ones. In a skewed distribution the mean is pulled toward the extreme scores and may not represent the typical case well.
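
The first two properties follow directly from the definition of the mean. The short derivation below is a sketch; the symbol $c$, standing for any comparison value other than the mean, is introduced here for illustration and is not part of the notes' own notation.

```latex
\begin{align*}
\sum_{i=1}^{N} (X_i - \bar{X})
  &= \sum_{i=1}^{N} X_i - N\bar{X}
   = \sum_{i=1}^{N} X_i - N \cdot \frac{\sum_{i=1}^{N} X_i}{N}
   = 0 \\[6pt]
\sum_{i=1}^{N} (X_i - c)^2
  &= \sum_{i=1}^{N} (X_i - \bar{X})^2 + N(\bar{X} - c)^2
   \;\ge\; \sum_{i=1}^{N} (X_i - \bar{X})^2
\end{align*}
```

The second line expands $(X_i - c)$ as $(X_i - \bar{X}) + (\bar{X} - c)$; the cross term vanishes because the deviations sum to zero, so the sum of squares is smallest exactly when $c = \bar{X}$.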

Mathematical Characteristics of Mean

  1. Balances Scores:
    • Illustrative Example: For scores 65, 73, 77, 85, 90, the mean is 78; the deviations from the mean (-13, -5, -1, +7, +12) sum to zero.
  2. Minimizes Variation:
    • The sum of the squared deviations around the mean is smaller than the sum of squared deviations around any other value (the least squares principle).
  3. Sensitivity to Extremes: Changes in extreme scores impact the mean, unlike mode and median.
    • Example: An extreme score (3500) in the distribution changes the mean substantially but has little or no effect on the median.
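
The three characteristics can be checked numerically with the scores 65, 73, 77, 85, 90. The sketch below uses Python's standard `statistics` module. The helper `sum_of_squares`, the comparison value 80, and the choice to replace the highest score with 3500 are assumptions made for illustration.

```python
import statistics

scores = [65, 73, 77, 85, 90]
x_bar = statistics.mean(scores)

# 1. Balance: deviations from the mean sum to zero.
deviations = [x - x_bar for x in scores]
deviation_sum = sum(deviations)


# 2. Least squares: squared deviations are smallest around the mean.
def sum_of_squares(values, point):
    """Sum of squared deviations of the values around an arbitrary point."""
    return sum((x - point) ** 2 for x in values)


ss_around_mean = sum_of_squares(scores, x_bar)
ss_around_median = sum_of_squares(scores, statistics.median(scores))
ss_around_80 = sum_of_squares(scores, 80)   # 80 is an arbitrary comparison value

# 3. Sensitivity to extremes: swap the highest score for 3500 (assumed scenario).
with_extreme = scores[:-1] + [3500]

print(x_bar, deviation_sum)
print(ss_around_mean, ss_around_median, ss_around_80)
print(statistics.mean(with_extreme), statistics.median(with_extreme))
```

In this scenario the mean moves from 78 to 760 while the median stays at 77, which is the sense in which the mean is sensitive to extremes.
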
Relationships in Symmetry and Skew
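  • Comparing the mean and the median indicates the shape of the distribution:
    • In a symmetrical distribution, the mean and median are equal.
    • Mean < Median indicates negative skew (a few extremely low scores).
    • Mean > Median indicates positive skew (a few extremely high scores).
  • Because the two statistics diverge in skewed distributions, a report can create different impressions depending on which one it cites, so note whether a reported "average" is the mean or the median.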
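
The comparison rule can be sketched as a small Python function. The name `skew_direction`, its return labels, and the three sample distributions are hypothetical. Equal mean and median are reported only as "no skew indicated", since equality alone does not prove a distribution is symmetrical.

```python
import statistics


def skew_direction(scores):
    """Compare mean and median to suggest the direction of skew."""
    mean = statistics.mean(scores)
    median = statistics.median(scores)
    if mean > median:
        return "positive"
    if mean < median:
        return "negative"
    return "no skew indicated"


# Hypothetical distributions for illustration.
print(skew_direction([15, 20, 25, 30, 3500]))   # one extremely high score
print(skew_direction([1, 30, 32, 35, 37]))      # one extremely low score
print(skew_direction([10, 20, 30, 40, 50]))     # evenly spread scores
```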

Selecting Measures of Central Tendency

  • Important considerations:
    1. Level of Measurement:
      • Nominal: Mode; Ordinal: Median; Interval-Ratio: Mean (or Median if skewed).
    2. Importance of Contextual Information: the purpose of the research and what the reader needs to know.
  • Guidelines provide recommendations on selecting appropriate statistics for data representation.
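
The level-of-measurement rule can be written as a simple lookup. This is a sketch only: the function name `recommend_measure` and its string labels are hypothetical, and it encodes just the rule stated above, not the research-purpose consideration, which requires judgment.

```python
def recommend_measure(level, skewed=False):
    """Suggest a measure of central tendency from the level of measurement and skew."""
    level = level.strip().lower()
    if level == "nominal":
        return "mode"
    if level == "ordinal":
        return "median"
    if level == "interval-ratio":
        # A badly skewed interval-ratio variable is better summarized by the median.
        return "median" if skewed else "mean"
    raise ValueError(f"unknown level of measurement: {level!r}")


print(recommend_measure("nominal"))
print(recommend_measure("ordinal"))
print(recommend_measure("interval-ratio"))
print(recommend_measure("interval-ratio", skewed=True))
```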

Summary

  • Each measure (mode, median, mean) provides insights into data distribution, focusing on typical case representation.
  • Mode: The most common score; the only measure available for nominal variables.
  • Median: The middle score; preferred for ordinal data and for skewed interval-ratio distributions.
  • Mean: The most commonly used measure; appropriate for interval-ratio data, but it should be used with caution when the distribution is skewed.

Problems

  • Exercises provided for calculating and interpreting measures of central tendency and implications of data skew.

Glossary

  • Mean: Arithmetic average; $\bar{X}$ for a sample, $\mu$ for a population.
  • Median: Value separating a distribution into halves.
  • Mode: Most frequent value in a dataset.
  • Percentile: Value below which a certain percentage falls.
  • Quartile: Values dividing data into quarters (25%, 50%, 75%).
  • Deciles: Values dividing data into tenths.
  • Skew: The extent to which a distribution has a few scores that are extremely high (positive skew) or extremely low (negative skew).
  • $X_i$: Any specified score in a dataset.