Measures of Central Tendency Study Notes
Measures of Central Tendency
Learning Objectives
- By the end of this chapter, you will be able to:
- Explain the purposes of measures of central tendency and interpret the information they convey.
- Calculate, explain, and compare and contrast the mode, median, and mean.
- Explain the mathematical characteristics of the mean.
- Select an appropriate measure of central tendency according to the level of measurement and skew.
Introduction to Measures of Central Tendency
- Frequency distributions, graphs, and charts summarize the overall shape of a distribution of scores efficiently.
- More specific information about the distribution is often needed:
- Typical or average case statistics (e.g., "The average starting salary for social workers is $39,000 per year").
- Variety or heterogeneity in the distribution (e.g., starting salaries range from $31,000 to $42,000).
- The first type pertains to measures of central tendency (mode, median, mean), while the second pertains to measures of dispersion (covered in Chapter 4).
- These measures reduce data to a single, comprehensible number.
- Although they share a purpose, the three measures are defined differently and can yield different values for the same distribution.
- Appropriate selection depends on both measurement level and research purpose.
The Mode
- Definition: The mode is the value that occurs most frequently in a distribution.
- Example: In the scores 58, 82, 82, 90, 98, the mode is 82.
- Usefulness:
- Especially relevant for nominal-level variables as it is the only measure of central tendency applicable at this level.
- Example: The mode of religious affiliations in Table 3.1 is Protestant, which is the single largest category in a fictitious sample of 242 respondents.
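A minimal Python sketch of finding the mode, using the example scores from these notes. The helper name `modes` is hypothetical; it returns every value tied for the highest frequency, so it also reveals distributions that have more than one mode.

```python
from collections import Counter

def modes(values):
    """Return all values that share the highest frequency, in first-seen order."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

scores = [58, 82, 82, 90, 98]
print(modes(scores))  # [82]

# Works for nominal categories too (hypothetical labels, not the data in Table 3.1).
print(modes(["A", "B", "A", "C"]))  # ['A']
```

Because only frequencies are counted, no ordering or arithmetic on the values is needed, which is why the mode is usable at the nominal level.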
Limitations of the Mode
- Some distributions have no mode, or so many modes that the statistic loses its usefulness.
- Example: Test scores with modes at 55, 66, 78, 82, 90, and 97 offer no single most typical value to report.
- The modal score may not be central to the distribution as a whole.
- Example: If the mode is 93 but most scores are concentrated well below it, reporting the mode would overstate overall performance.
The Median
- Definition: The median (Md) represents the middle score in a distribution, where half of the cases are above and half are below it.
- Example: With family incomes, if the median is $45,000, half of families earn more and half earn less.
- Calculation Method:
- Arrange scores in order.
- For odd N, identify the middle case directly.
- For even N, average the two middle scores.
- Illustration:
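- For grades of 93, 87, 80, 75, 61 (N = 5, odd):
- Ordered: 93, 87, 80, 75, 61; the middle (third) case gives a median of 80.
- With a score of 1 added, N is even (N = 8):
- Ordered: 10, 10, 8, 7, 6, 5, 4, 1;
- The two middle cases are 7 and 6, so the median is (7 + 6)/2 = 6.5.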
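The calculation steps above (sort, then take the middle case for odd N or average the two middle cases for even N) can be sketched in Python. The function name `median` is illustrative, and the even-N list is a made-up example rather than data from these notes.

```python
def median(scores):
    """Median by sorting, then taking the middle case or averaging the two middle cases."""
    ordered = sorted(scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd N: the single middle case.
        return ordered[mid]
    # Even N: average of the two middle cases.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([93, 87, 80, 75, 61]))  # 80
print(median([2, 4, 6, 8]))          # 5.0 (illustrative even-N list)
```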
Limitations of the Median
- Cannot be computed for nominal data as scores cannot be ranked.
- Can be computed for ordinal or interval-ratio data, and is generally the preferred measure for ordinal data.
Measures of Position: Percentiles, Deciles, Quartiles
- Median is also a positional statistic. Other measures of position are:
- Percentiles: Points below which a specific percentage of cases fall (e.g., 46th percentile indicates that 46% scored lower).
- Deciles: Divide distribution into tenths (e.g., first decile is 10th percentile).
- Quartiles: Divide into quarters (first quartile is 25th percentile, second 50th, and third 75th).
Finding Percentiles
- Method:
- Arrange scores in order.
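- Multiply N by the percentile expressed as a proportion (e.g., 0.46 for the 46th percentile).
- Round the result up to the next whole number; the score of the case at that position is the percentile.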
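A short Python sketch of this round-up procedure, under the assumption that scores are sorted from lowest to highest and positions are counted from 1. The function name `percentile_score` and the ten scores are hypothetical. Statistical packages often interpolate between cases instead, so their results can differ from this method.

```python
def percentile_score(scores, pct):
    """Score at a whole-number percentile (1 to 100) using sort / multiply / round up."""
    if not scores:
        raise ValueError("scores must not be empty")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(scores)
    n = len(ordered)
    # Integer ceiling of n * pct / 100; avoids floating-point rounding surprises.
    position = (n * pct + 99) // 100
    # Positions are 1-based, list indexes are 0-based.
    return ordered[position - 1]

scores = [70, 55, 95, 62, 85, 60, 90, 68, 80, 75]  # hypothetical scores
print(percentile_score(scores, 25))  # 62 (10 * 0.25 = 2.5, rounded up to case 3)
print(percentile_score(scores, 75))  # 85 (10 * 0.75 = 7.5, rounded up to case 8)
```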
Mean
Definition: The mean ($\bar{X}$), or arithmetic average, is calculated by dividing the sum of the scores by the number of cases (N): $\bar{X} = \frac{\sum X_i}{N}$.
- Example: Ten clients at a clinic have test scores that total 109, so the mean is $\bar{X} = \frac{109}{10} = 10.9$.
Characterization of Mean:
- Balances all scores, acting as a fulcrum around which the deviations sum to zero: $\sum (X_i - \bar{X}) = 0$.
- Minimizes variation, following the least squares principle: $\sum (X_i - \bar{X})^2 = \text{minimum}$.
- Is affected by every score, including extreme ones, so it may not represent central tendency well when the distribution is skewed.
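The first two characteristics can be checked numerically. The sketch below uses the five scores 65, 73, 77, 85, 90 that appear elsewhere in these notes; the helper names are illustrative. It compares the sum of squared deviations around the mean with the same sum around the median (77) as one alternative reference point.

```python
scores = [65, 73, 77, 85, 90]
mean = sum(scores) / len(scores)

def sum_of_deviations(values, point):
    """Sum of (X_i - point) over all scores."""
    return sum(x - point for x in values)

def sum_of_squared_deviations(values, point):
    """Sum of (X_i - point)^2 over all scores."""
    return sum((x - point) ** 2 for x in values)

print(mean)                                     # 78.0
print(sum_of_deviations(scores, mean))          # 0.0
print(sum_of_squared_deviations(scores, mean))  # 388.0
print(sum_of_squared_deviations(scores, 77))    # 393 (around the median)
```

The squared deviations around the mean (388) are smaller than around the median (393), which is what the least squares principle predicts.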
Mathematical Characteristics of Mean
- Balances Scores:
- Illustrative Example: For scores 65, 73, 77, 85, 90, the mean is 78; the deviations from the mean (-13, -5, -1, 7, 12) sum to zero.
- Minimizes Variation:
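- The sum of the squared deviations around the mean is smaller than the sum around any other value, which is why the mean is described as the point of least squares.
- Sensitivity to Extremes: Every score enters the calculation of the mean, so a change in an extreme score changes the mean; the mode and median are typically unaffected.
- Example: Introducing an extreme score (3500) changes the mean substantially while leaving the median essentially unchanged.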
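To make the sensitivity to extremes concrete, here is a small Python sketch. The five starting scores are hypothetical (the notes give only the extreme value 3500), and the highest score is replaced by 3500 so that N stays the same.

```python
import statistics

def mean(values):
    """Arithmetic average: sum of scores divided by N."""
    return sum(values) / len(values)

original = [15, 20, 25, 30, 35]        # hypothetical scores
with_extreme = [15, 20, 25, 30, 3500]  # highest score replaced by 3500

print(mean(original), statistics.median(original))          # 25.0 25
print(mean(with_extreme), statistics.median(with_extreme))  # 718.0 25
```

The mean moves from 25 to 718 while the median stays at 25, since the median depends only on the middle case.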
Relationships in Symmetry and Skew
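- The relative positions of the mean and median indicate the shape of the distribution:
- Mean = Median indicates a symmetric (unskewed) distribution.
- Mean < Median indicates negative skew (a few extremely low scores pull the mean down).
- Mean > Median indicates positive skew (a few extremely high scores pull the mean up).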
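- Because the mean and median diverge in skewed distributions, a report can favor whichever value better supports its argument; note which measure is being cited when reading statistical reports.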
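The comparison of mean and median can be sketched as a small Python function. The name `skew_direction` and the example scores are hypothetical, and the rule is only the rough indicator described above, not a formal skewness statistic.

```python
import statistics

def skew_direction(scores):
    """Classify skew by comparing the mean with the median."""
    mean = sum(scores) / len(scores)
    median = statistics.median(scores)
    if mean > median:
        return "positive"
    if mean < median:
        return "negative"
    return "none"

print(skew_direction([15, 20, 25, 30, 3500]))  # positive (mean 718.0 > median 25)
print(skew_direction([1, 30, 31, 32, 33]))     # negative (mean 25.4 < median 31)
print(skew_direction([15, 20, 25, 30, 35]))    # none (mean and median both 25)
```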
Selecting Measures of Central Tendency
- Important considerations:
- Level of Measurement:
- Nominal: Mode; Ordinal: Median; Interval-Ratio: Mean (or Median if skewed).
- Research Purpose and Context:
- Beyond level of measurement, choose the statistic that conveys the information the analysis needs; the guidelines above are recommendations rather than fixed rules.
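The level-of-measurement guideline can be written as a simple lookup. This is a sketch of the rule of thumb stated above, with a hypothetical function name and string labels; it does not capture research purpose, which still calls for judgment.

```python
def suggest_measure(level, skewed=False):
    """Suggest a measure of central tendency from level of measurement and skew."""
    level = level.strip().lower()
    if level == "nominal":
        return "mode"
    if level == "ordinal":
        return "median"
    if level == "interval-ratio":
        # The mean is the default; the median is preferred when the distribution is skewed.
        return "median" if skewed else "mean"
    raise ValueError(f"unknown level of measurement: {level!r}")

print(suggest_measure("nominal"))                      # mode
print(suggest_measure("ordinal"))                      # median
print(suggest_measure("interval-ratio"))               # mean
print(suggest_measure("interval-ratio", skewed=True))  # median
```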
Summary
- Each measure (mode, median, mean) provides insights into data distribution, focusing on typical case representation.
- Mode: The most common score; the only measure available for nominal variables.
- Median: The middle score; suited to ordinal data and to skewed interval-ratio distributions.
- Mean: The most commonly used measure; suited to interval-ratio data, but to be used with caution when the distribution is skewed.
Problems
- Exercises provided for calculating and interpreting measures of central tendency and implications of data skew.
Glossary
- Mean: Arithmetic average; $\bar{X}$ for a sample, $\mu$ for a population.
- Median: Value separating a distribution into halves.
- Mode: Most frequent value in a dataset.
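- Percentile: Point below which a specific percentage of the cases fall.
- Quartile: Points dividing a distribution into quarters (the 25th, 50th, and 75th percentiles).
- Deciles: Points dividing a distribution into tenths.
- Skew: The extent to which a distribution has a few scores that are extremely high (positive skew) or extremely low (negative skew).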
- $X_i$: Any specified score in a dataset.