Measures of Central Tendency Lecture Notes

Introduction

  • Lecturer: Nneka Ibekwe-Okafor, PhD

  • Affiliation: The University of Texas at Austin

  • Focus Topic: Measures of Central Tendency

Agenda

  • Review of Previous Chapter

  • Chapter Three: Measures of Central Tendency

    • The Mode

    • The Median

    • The Mean

    • The Shape of the Distribution

    • Selecting Measure of Central Tendency

  • Looking Ahead

Previous Chapters

  • Content Review:

    • Importance of understanding types of variables and their levels of measurement.

Describing Variables

  • Types of Variables:

    • Categorical: Variables divided into categories.

    • Numeric: Variables that are quantifiable.

  • Levels of Measurement:

    • Nominal: No order (e.g., race, gender, marital status).

    • Ordinal: Ordered categories (e.g., education level).

    • Interval: Numeric scales without a true zero (e.g., temperature in Celsius or Fahrenheit).

    • Ratio: Numeric scales with a true zero (e.g., income).

Categorical Variables

  • Nominal Examples: Race, gender, and marital status. Fixed categories without order.

  • Ordinal Examples: Education level, socioeconomic status, Likert scale responses (e.g., agreement level).

Numeric Variables

  • Interval vs. Ratio:

    • Interval: Does not have a true zero (e.g., temperature in Celsius or Fahrenheit).

    • Ratio: Has a true zero (e.g., income).

    • Both types allow calculation of the mean and standard deviation (SD).

  • Example Numeric Variables:

    • Age, height, weight, income, temperature, test scores.

Data Distribution

  • Definition: A summary of how often each value appears, expressed as a frequency, proportion, or percentage.

Individual Practice (Survey Example)

  • A survey of 2,000 individuals explored engagement with different racial groups, using various question types:

    • Categorical Types (Nominal and Ordinal).

    • Numeric Types (Interval and Ratio).

Hands-on Example

  • Calculating proportion and percentage for responses regarding engagement frequency with individuals of different races:

    • Response categories: Always, Often, Sometimes, Rarely, Never.

Excel Example for Data Analysis

  • Steps to calculate total, proportions, and percentages in Excel:

    • Add headers (total, proportions), and use formulas:

    • Total: =SUM(range)

    • Proportion: =f/N

    • Percentage: =f/N * 100
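
The same three formulas can be sketched outside Excel. The response counts below are hypothetical (the notes do not give the actual survey frequencies); they were chosen only so that they sum to N = 2,000. The function name `frequency_table` is also an assumption made for illustration.

```python
# Hypothetical frequencies for the five response categories (not from the lecture).
counts = {"Always": 300, "Often": 500, "Sometimes": 700, "Rarely": 400, "Never": 100}


def frequency_table(counts):
    """Return {category: (f, proportion, percentage)} for a dict of frequencies."""
    total = sum(counts.values())  # N, the equivalent of =SUM(range)
    return {
        category: (f, f / total, f / total * 100)  # f, f/N, f/N * 100
        for category, f in counts.items()
    }


table = frequency_table(counts)
for category, (f, proportion, percentage) in table.items():
    print(f"{category:<10} f={f:>4}  proportion={proportion:.3f}  percentage={percentage:.1f}%")
```

A quick check on any table like this: the proportions should sum to 1 and the percentages to 100.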

Measures of Central Tendency

  • Definition: Numbers that summarize what is average or typical in a distribution of data.

  • Types:

    • Mode: Most frequently occurring value in a dataset.

    • Median: Middle value that divides a distribution.

    • Mean: Arithmetic average of all values.

Purpose of Measuring Central Tendency

  • To describe and summarize larger datasets efficiently.

  • Factors influencing the choice of measure:

    • Level of measurement

    • Shape of distribution

    • Research objective.

The Mode

  • Definition: The mode is the category or score that occurs most frequently.

    • Can be used with any level of measurement, and is the only measure of central tendency appropriate for nominal data.

    • Identifies the most common category in a dataset.

  • Examples:

    • Numeric: In a dataset like 5, 23, 5, 6, 9, the mode is 5 because it occurs more often than any other value.

    • Categorical: Number of speakers of different languages.
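
A minimal sketch of finding the mode for both kinds of data, using Python's standard `statistics.multimode` (available in Python 3.8+). Both lists are illustrative assumptions, not data from the lecture; `multimode` is used because it also reports ties, which is how bimodal data show up.

```python
from statistics import multimode

# Illustrative numeric scores: 5 occurs twice, every other value once.
scores = [5, 23, 5, 6, 9]

# Illustrative categorical responses (hypothetical primary languages).
languages = ["Spanish", "English", "Spanish", "Mandarin", "English", "Spanish"]

score_modes = multimode(scores)        # list of the most frequent value(s)
language_modes = multimode(languages)  # works for categories as well as numbers
tied_modes = multimode([1, 1, 2, 2, 3])  # two values tie, so both are returned

print(score_modes, language_modes, tied_modes)
```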

Limitations of the Mode

  • Does not capture the overall distribution of data.

  • May be misleading when responses are spread across many categories, since the most common category may still describe only a small share of cases.

The Median

  • Definition: The median divides the dataset into two equal parts.

    • Suitable for ordered categories (ordinal, interval, ratio).

    • Calculation varies based on odd/even number of observations.

  • Finding the Median:

    • Sort data in ascending order.

    • If N is odd (e.g., N = 7): the median is the middle number.

    • If N is even (e.g., N = 8): the median is the average of the two middle numbers.

Median Calculation Example

  • Data: 7, 3, 8, 11, 5, 19, 4, 10 (N = 8); sorted: 3, 4, 5, 7, 8, 10, 11, 19 -> Median = (7+8)/2 = 7.5.
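
The odd/even rule can be written out as a short sketch. The function name `median_by_hand` is an assumption made for illustration; the even-N data are the sorted values from the example, and the odd-N case simply drops the largest value to get N = 7.

```python
from statistics import median


def median_by_hand(values):
    """Median via the sort-then-pick-the-middle rule."""
    ordered = sorted(values)  # step 1: sort in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd N: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even N: average the two middle values


even_data = [3, 4, 5, 7, 8, 10, 11, 19]  # N = 8
odd_data = even_data[:-1]                # N = 7 (largest value dropped)

even_median = median_by_hand(even_data)
odd_median = median_by_hand(odd_data)
print(even_median, odd_median)

# The standard library agrees with the hand-rolled version.
assert even_median == median(even_data)
assert odd_median == median(odd_data)
```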

Limitations of the Median

  • Ignores the magnitude of extreme values, so it does not reflect every score in the data.

  • Less reliable with small sample sizes.

The Mean

  • Definition: The mean is calculated by adding all values together and dividing by the number of observations.

  • Important Properties:

    • Uses every data point, so it works well for roughly normal distributions without outliers.

    • Sensitive to extreme values, which pull the mean toward them.

  • Formula for Mean:
    \bar{Y} = \frac{\sum Y}{N}

Mean Calculation Example

  • Data example: the mean of 2, 2, 2 is 2, while the mean of 1, 5, 10 is 16/3 ≈ 5.33; the single large value (10) pulls the mean above the middle value (5).
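
A small sketch of the mean as "sum of Y divided by N", applied to the two datasets in the example. The third dataset, with 100 in place of 10, is an added assumption to make the pull of an extreme value more visible; the median is shown alongside for contrast.

```python
from statistics import median


def mean(values):
    """Arithmetic mean: sum of all values divided by the number of observations."""
    return sum(values) / len(values)


identical = [2, 2, 2]
spread = [1, 5, 10]
with_outlier = [1, 5, 100]  # hypothetical: same as `spread`, but with an extreme value

for data in (identical, spread, with_outlier):
    print(data, "mean =", round(mean(data), 2), "median =", median(data))
```

Replacing 10 with 100 leaves the median at 5 but moves the mean from about 5.33 to about 35.33.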

Comparing Central Tendency Measures

  • The mean and median are equal for symmetrical distributions.

  • Median is robust against skew and outliers.

Shape of the Distribution

  • Modality and Skewness:

    • Modality: unimodal, bimodal, or multimodal.

    • Positive (right) skew typically pulls the mean above the median; negative (left) skew typically pulls it below.

  • Distributions are not always symmetrical.
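
One way to see how skew separates the mean from the median is to compare the two directly. This is a rule-of-thumb sketch, not a formal skewness statistic; the function name `skew_direction` and all three datasets are hypothetical and chosen only for illustration.

```python
from statistics import mean, median


def skew_direction(values):
    """Rule of thumb: compare the mean with the median to guess the skew direction."""
    m, md = mean(values), median(values)
    if m > md:
        return "positive"   # long right tail pulls the mean up
    if m < md:
        return "negative"   # long left tail pulls the mean down
    return "symmetric"      # mean and median coincide


right_skewed = [20, 25, 30, 35, 40, 45, 200]  # e.g., incomes with one very high earner
left_skewed = [10, 80, 85, 90, 92, 95, 98]    # e.g., test scores with one very low score
symmetric = [1, 2, 3, 4, 5]

for data in (right_skewed, left_skewed, symmetric):
    print(skew_direction(data), "mean =", round(mean(data), 2), "median =", median(data))
```

This comparison is only a quick heuristic; a histogram remains the more reliable way to judge the shape of a distribution.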

Choosing a Measure of Central Tendency

  • Skewed Distributions: Prefer the median (or mode) over the mean, because outliers pull the mean toward the tail.

  • Symmetrical Distributions: Any measure can be used, but mean is often preferred for its comprehensive information.

Controversies Around Measures of Central Tendency

  • Data Aggregation: Can mask disparities when aggregated across different groups.

  • Data Interpretation: Stereotypes may be perpetuated by presenting averages without context.

  • Cultural Differences: Different interpretations based on cultural contexts.

Key Takeaways

  • Mean: Average; sensitive to outliers.

  • Median: Middle value; robust to outliers.

  • Mode: Most frequent value; useful for categorical data.

  • Interpret measures cautiously, ensuring consideration of context and application.

Looking Ahead

  • Upcoming classes, labs, and assignments for next week.

  • Recommended readings from Chapter 4 about Measures of Variability.

Questions

  • Contact: Nneka Ibekwe-Okafor at niokafor@utexas.edu.