Measures of Central Tendency
Definition: Measures of central tendency summarize data by identifying a single central or typical value that represents the entire dataset.
Mode
Definition: The mode is the most frequent score or observation in a dataset.
Applicability: Can be used for both numerical and categorical (nominal) data.
Example: In a grocery store produce section, the mode would be the type of fruit or vegetable that appears most often (e.g., grapes, if grapes outnumber potatoes and every other item).
Key Point: Mode is the only measure of central tendency applicable to nominal data.
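The produce example can be sketched in a few lines of Python. The item list below is made up for illustration; only the idea of counting categories comes from the notes above.

```python
from collections import Counter
from statistics import multimode

# Hypothetical produce items (nominal data: names, not numbers).
produce = ["grapes", "potatoes", "grapes", "apples", "grapes", "potatoes"]

# Count how often each category appears and take the most frequent one.
counts = Counter(produce)
mode_value, mode_count = counts.most_common(1)[0]

# multimode returns every value tied for most frequent, as a list.
modes = multimode(produce)

print(mode_value, mode_count)  # grapes 3
print(modes)                   # ['grapes']
```

A dataset can have more than one mode when categories tie, which is why the list-returning multimode is shown alongside the single most common value.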
Median
Definition: The median is the middle score in a dataset when the scores are arranged in order.
Calculation:
For an odd number of scores: the middle score is the median.
For an even number of scores: average the two middle scores.
Example: With the dataset {1, 2, 3, 4, 5, 6}, the two middle scores are 3 and 4, so the median is (3 + 4) / 2 = 3.5. If one score is removed (e.g., {1, 2, 3, 4, 5}), the count is odd and the median is simply the middle score, 3.
Properties: Depends only on the position of scores in the ordered data, not on how large or small the extreme scores are; this makes it resistant to outliers.
Use Cases: Best used in skewed datasets or when dealing with extreme values.
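The odd/even rule above can be written out as a short function. This is a minimal sketch; the name median_by_hand is an illustrative choice, and Python's built-in statistics.median applies the same rule.

```python
from statistics import median

def median_by_hand(scores):
    """Return the median using the odd/even rule from the notes."""
    if not scores:
        raise ValueError("median of an empty dataset is undefined")
    ordered = sorted(scores)      # the median requires ordered data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]       # odd count: the single middle score
    # even count: average the two middle scores
    return (ordered[mid - 1] + ordered[mid]) / 2

even_scores = [1, 2, 3, 4, 5, 6]
odd_scores = [1, 2, 3, 4, 5]

print(median_by_hand(even_scores))  # 3.5
print(median_by_hand(odd_scores))   # 3

# Replacing the largest score with an extreme value leaves the median unchanged.
print(median_by_hand([1, 2, 3, 4, 5, 1000]))  # 3.5
```

The last line illustrates the resistance to outliers: only the position of the middle scores matters, not the size of the largest one.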
Mean
Definition: The mean is the average of all scores, obtained by adding them together and dividing by the number of scores.
Calculation: Sum all scores; divide by the number of scores.
Example: If 13 scores sum to 73, the mean is 73 / 13 ≈ 5.6.
Problematic Cases: The mean is affected by extreme values (outliers).
Example: If one score is drastically high (e.g., 1000), the mean dramatically increases (e.g., from 5.6 to 81.8).
Use Cases: Generally best used when data is roughly symmetric (for example, normally distributed); it can misrepresent data with outliers or skewed distributions.
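The effect of a single outlier on the mean can be checked numerically. The notes give only the sum (73) and the count (13), so the score list below is hypothetical, chosen to match those totals. Treating the outlier as a replacement for the highest score is also an assumption; it happens to reproduce the 81.8 figure.

```python
from statistics import median

# Hypothetical scores: 13 values that sum to 73 (the raw data is not given).
scores = [2, 3, 3, 4, 4, 5, 5, 6, 7, 7, 8, 9, 10]

original_mean = sum(scores) / len(scores)

# Assumption: the extreme score of 1000 replaces the highest score (10).
with_outlier = scores[:-1] + [1000]
outlier_mean = sum(with_outlier) / len(with_outlier)

print(round(original_mean, 1), median(scores))        # 5.6 5
print(round(outlier_mean, 1), median(with_outlier))   # 81.8 5
```

One changed score moves the mean from about 5.6 to about 81.8, while the median of this hypothetical list stays at 5.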
Situational Use of Measures
Mean: Represents all data points but can be misleading in skewed distributions or with outliers.
Median: Ideal for income reporting or when large disparities exist, as it is largely unaffected by extreme scores.
Mode: Useful for categorical data and for identifying the most common or frequent outcome in a dataset.
Trimmed Mean
Definition: Calculated by removing a fixed percentage of the most extreme data points (the lowest and highest values) and then finding the mean of the remaining values.
Use Cases: Often used in contexts like Olympic scoring to reduce bias from unusually high or low scores.
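A trimmed mean can be sketched as follows. The function name, the choice to cut the same proportion from each end, and the judge scores are all assumptions made for illustration; real scoring rules differ by sport.

```python
def trimmed_mean(scores, proportion):
    """Mean after dropping `proportion` of the scores from EACH end.

    Assumption: the cut count is rounded down, so 7 scores at 0.15
    drops one score from the bottom and one from the top.
    """
    if not scores:
        raise ValueError("trimmed mean of an empty dataset is undefined")
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be at least 0 and below 0.5")
    ordered = sorted(scores)
    cut = int(len(ordered) * proportion)       # scores to drop per end
    kept = ordered[cut:len(ordered) - cut]     # middle portion that remains
    return sum(kept) / len(kept)

# Hypothetical judge scores with one unusually low and one unusually high value.
judge_scores = [9.0, 5.0, 8.5, 10.0, 9.5, 8.5, 9.0]

plain_mean = sum(judge_scores) / len(judge_scores)
trimmed = trimmed_mean(judge_scores, 0.15)

print(plain_mean)  # 8.5
print(trimmed)     # 8.9
```

With a proportion of 0 the function reduces to the ordinary mean, and as the proportion approaches 0.5 the result approaches the median.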
Conclusion
Limitations: No single measure of central tendency can describe a dataset accurately when the values vary widely.
Illustration of Limitations: A scenario involving five men with drastically varying incomes showcases these issues:
The mean income misrepresents the group because one individual earning significantly more pulls it upward.
The median may reflect the reality of only one individual (the one in the middle), not the group.
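The five-income scenario can be made concrete with invented numbers. The notes do not give the actual incomes, so the figures below are purely hypothetical and serve only to show the pattern.

```python
from statistics import mean, median

# Hypothetical annual incomes for five men (not from the original notes).
incomes = [20_000, 35_000, 50_000, 80_000, 1_000_000]

mean_income = mean(incomes)
median_income = median(incomes)

# How many of the five earn less than the mean?
below_mean = sum(1 for income in incomes if income < mean_income)

print(mean_income)    # 237000
print(median_income)  # 50000
print(below_mean)     # 4
```

In this made-up case the mean is higher than four of the five incomes, while the median matches exactly one person's income and says nothing about the gap between 20,000 and 1,000,000.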
Final Thought: Understand the context and data distribution to choose the most effective measure of central tendency.