Measures of Central Tendency Lecture Notes
Introduction
Lecturer: Nneka Ibekwe-Okafor, PhD
Affiliation: The University of Texas at Austin
Focus Topic: Measures of Central Tendency
Agenda
Review of Previous Chapter
Chapter Three: Measures of Central Tendency
The Mode
The Median
The Mean
The Shape of the Distribution
Selecting Measure of Central Tendency
Looking Ahead
Previous Chapters
Content Review:
Importance of understanding types of variables and their levels of measurement.
Describing Variables
Types of Variables:
Categorical: Variables divided into categories.
Numeric: Variables that are quantifiable.
Levels of Measurement:
Nominal: No order (e.g., race, gender, marital status).
Ordinal: Ordered categories (e.g., education level).
Interval: Numeric scales without a true zero (e.g., temperature in Celsius or Fahrenheit).
Ratio: Numeric scales with a true zero (e.g., income).
Categorical Variables
Nominal Examples: Race, gender, and marital status. Fixed categories without order.
Ordinal Examples: Education level, socioeconomic status, Likert scale responses (e.g., agreement level).
Numeric Variables
Interval vs. Ratio:
Interval: Does not have a true zero (e.g., temperature in Celsius or Fahrenheit).
Ratio: Has a true zero (e.g., income).
Both types support calculating the mean and standard deviation (SD).
Example Numeric Variables:
Age, height, weight, income, temperature, test scores.
Data Distribution
Definition: A summary of how often each value of a variable occurs, reported as a frequency, proportion, or percentage.
Individual Practice (Survey Example)
Surveyed 2,000 individuals about their engagement with people of different racial groups, using several question types:
Categorical Types (Nominal and Ordinal).
Numeric Types (Interval and Ratio).
Hands-on Example
Calculating proportion and percentage for responses regarding engagement frequency with individuals of different races:
Response categories: Always, Often, Sometimes, Rarely, Never.
Excel Example for Data Analysis
Steps to calculate total, proportions, and percentages in Excel:
Add headers (total, proportion, percentage), then use these formulas, where f is a category's frequency and N is the total:
Total:
=SUM(range)
Proportion:
=f/N
Percentage:
=f/N * 100
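The same calculation can be sketched in Python. The counts below are hypothetical placeholders (these notes do not list the actual survey frequencies), chosen only so that they sum to N = 2,000.

```python
# Hypothetical response counts (f) for illustration; not the real survey data.
counts = {
    "Always": 300,
    "Often": 500,
    "Sometimes": 700,
    "Rarely": 400,
    "Never": 100,
}

# N is the total number of responses, like =SUM(range) in Excel.
total = sum(counts.values())

# Proportion is f / N; percentage is f / N * 100.
proportions = {category: f / total for category, f in counts.items()}
percentages = {category: p * 100 for category, p in proportions.items()}

for category, f in counts.items():
    print(f"{category:<10} f={f:>4}  "
          f"proportion={proportions[category]:.3f}  "
          f"percentage={percentages[category]:.1f}%")
```

As a quick check, the proportions should sum to 1 and the percentages to 100.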
Measures of Central Tendency
Definition: Numbers that summarize what is average or typical in a data distribution.
Types:
Mode: Most frequently occurring value in a dataset.
Median: Middle value that divides an ordered distribution into two equal halves.
Mean: Arithmetic average of all values.
Purpose of Measuring Central Tendency
To describe and summarize larger datasets efficiently.
Factors influencing the choice of measure:
Level of measurement
Shape of distribution
Research objective.
The Mode
Definition: The mode is the category or score that occurs most frequently.
Can be used at any level of measurement; it is the only measure of central tendency appropriate for nominal data.
Identifies the most common category in a dataset.
Examples:
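Numeric: A dataset like 5, 23, 5, 6, 9, where 5 occurs twice and every other value once, so mode = 5.
Categorical: Among counts of speakers of different languages, the mode is the language with the most speakers.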
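A minimal Python sketch of finding the mode for numeric and categorical data. Both datasets are hypothetical and exist only to illustrate the idea; `multimode` from the standard library returns every value tied for most frequent, which also covers bimodal and multimodal data.

```python
from collections import Counter
from statistics import multimode

# Hypothetical numeric scores: 5 appears twice, everything else once.
scores = [5, 23, 5, 6, 9]

# Hypothetical categorical responses (primary language spoken).
languages = ["Spanish", "English", "Spanish", "Mandarin", "English", "Spanish"]

# multimode returns a list of all values that share the highest frequency.
score_modes = multimode(scores)
language_modes = multimode(languages)

# Counter shows the full frequency table behind the mode.
language_counts = Counter(languages)

print("Mode of scores:", score_modes)
print("Mode of languages:", language_modes)
print("Frequencies:", dict(language_counts))
```

Note that if no value repeats, every value is tied at a frequency of one, so the mode is not informative.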
Limitations of the Mode
Does not capture the overall distribution of data.
May misrepresent the data when responses are spread across many categories, since the modal category can still hold only a small share of cases.
The Median
Definition: The median divides the dataset into two equal parts.
Suitable for variables whose values can be ordered (ordinal, interval, ratio).
Calculation varies based on odd/even number of observations.
Finding the Median:
Sort data in ascending order.
If N is odd (e.g., N = 7): the median is the middle number.
If N is even (e.g., N = 8): the median is the average of the two middle numbers.
Median Calculation Example
Data: 7, 3, 8, 11, 5, 19, 4, 10 (N = 8); sorted: 3, 4, 5, 7, 8, 10, 11, 19 -> Median = (7+8)/2 = 7.5.
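The procedure can be sketched in Python using the eight sorted values from the example. The helper name `find_median` is an assumption made for illustration; the standard library's `statistics.median` gives the same result. The odd-length list simply drops the largest value to show the N = 7 case.

```python
from statistics import median

def find_median(values):
    """Return the median: sort, then take the middle value(s)."""
    if not values:
        raise ValueError("median of an empty dataset is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd N: the single middle value.
        return ordered[mid]
    # Even N: average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

even_data = [3, 4, 5, 7, 8, 10, 11, 19]   # N = 8, from the example
odd_data = even_data[:-1]                  # N = 7, largest value dropped

print(find_median(even_data))  # 7.5
print(find_median(odd_data))   # 7
print(median(even_data))       # 7.5, standard-library check
```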
Limitations of the Median
Uses only the order of the values, so it ignores how extreme the outliers are.
Less reliable with small sample sizes.
The Mean
Definition: The mean is calculated by adding all values together and dividing by the number of observations.
Important Properties:
Uses every data point, so it is sensitive to extreme values (outliers).
Works well for normally distributed data without outliers.
Formula for Mean:
\bar{Y} = \frac{\sum Y}{N}
Mean Calculation Example
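Data example: 2, 2, 2 (mean = 6/3 = 2) vs. 1, 5, 10 (mean = 16/3 ≈ 5.33, median = 5). The large value 10 pulls the mean above the median, which shows how extreme values affect the mean.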
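A short Python sketch of the mean formula applied to the two datasets above. The third dataset, with 100 in place of 10, is a hypothetical addition that makes the pull of an outlier more visible.

```python
from statistics import median

def mean(values):
    """Arithmetic mean: sum of all values divided by the number of observations."""
    return sum(values) / len(values)

uniform = [2, 2, 2]
spread = [1, 5, 10]
with_outlier = [1, 5, 100]   # hypothetical: 10 replaced by an extreme value

for name, data in [("uniform", uniform), ("spread", spread), ("with_outlier", with_outlier)]:
    print(f"{name:<13} mean={mean(data):.2f}  median={median(data)}")
```

Changing 10 to 100 moves the mean a great deal, while the median stays at 5.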
Comparing Central Tendency Measures
The mean and median are equal in symmetrical distributions.
Median is robust against skew and outliers.
Shape of the Distribution
Modality and Skewness:
Modality: unimodal, bimodal, multimodal.
Positive (right) skew typically pulls the mean above the median; negative (left) skew typically pulls the mean below the median.
Real distributions are not always symmetrical.
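As a rough illustration, comparing the mean with the median hints at the direction of skew. This is a heuristic sketch, not a formal skewness statistic; the function name `skew_hint` and all three datasets are hypothetical.

```python
from math import isclose
from statistics import mean, median

def skew_hint(values):
    """Guess the skew direction by comparing the mean with the median."""
    m, md = mean(values), median(values)
    if isclose(m, md):
        return "roughly symmetric"
    return "positive skew" if m > md else "negative skew"

# Hypothetical datasets for illustration only.
right_tailed = [20, 25, 30, 35, 40, 200]   # one very large value
left_tailed = [10, 80, 85, 90, 95, 100]    # one very small value
balanced = [1, 2, 3, 4, 5]

for name, data in [("right_tailed", right_tailed),
                   ("left_tailed", left_tailed),
                   ("balanced", balanced)]:
    print(f"{name:<13} mean={mean(data):.2f}  median={median(data)}  -> {skew_hint(data)}")
```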
Choosing a Measure of Central Tendency
Skewed Distributions: Prefer median or mode over mean due to influence of outliers.
Symmetrical Distributions: Any measure can be used, but the mean is often preferred because it uses every value in the data.
Controversies Around Measures of Central Tendency
Data Aggregation: Can mask disparities when aggregated across different groups.
Data Interpretation: Stereotypes may be perpetuated by presenting averages without context.
Cultural Differences: Different interpretations based on cultural contexts.
Key Takeaways
Mean: Average; sensitive to outliers.
Median: Middle value; robust to outliers.
Mode: Most frequent value; useful for categorical data.
Interpret measures cautiously, ensuring consideration of context and application.
Looking Ahead
Upcoming classes, labs, and assignments for next week.
Recommended readings from Chapter 4 about Measures of Variability.
Questions
Contact: Nneka Ibekwe-Okafor at niokafor@utexas.edu.