Descriptive Statistics Notes
Descriptive Statistics Overview
- Descriptive Statistics: A branch of statistics that summarizes and describes the main features of a sample using numerical measures.
Key Terms to Understand
- Central Tendency: Indicates the average or center of a data set.
- Variation: Describes how much the data varies or spreads out.
- Skewness: Measures asymmetry of the probability distribution of a real-valued random variable.
- Kurtosis: Measures the tailedness of the probability distribution.
Learning Objectives
Know
- Definitions of central tendency, variation, skewness, and kurtosis.
- Three principal measures of central tendency (mean, median, mode) and their application.
- Types of mean (arithmetic, weighted, geometric, harmonic).
Understand
- Importance of calculating summary statistics.
- Computation of the following measurements:
  - Arithmetic mean
  - Weighted mean
  - Geometric mean
  - Harmonic mean
  - Range
  - Mean deviation
  - Variance
  - Standard deviation
  - Coefficient of Variation
- Impact of skewness on the relative positions of the three measures of central tendency.
Be Able to
- Compute various measures of central tendency and variation.
- Create frequency distributions and histograms.
- Assess skewness and kurtosis within data distributions.
Measures of Central Tendency
Mean
Arithmetic Mean: Sum of all observations divided by the number of observations.
Formula: $\bar{Y} = \frac{1}{n}\left(Y_1 + Y_2 + \dots + Y_n\right) = \frac{1}{n}\sum_{i=1}^{n} Y_i$
- Excel Command:
=AVERAGE(A1:An)
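As a rough Python counterpart to the AVERAGE command, the sketch below computes the arithmetic mean both by hand and with the standard library. The function name `arithmetic_mean` and the sample values in `scores` are made up for illustration and are not part of these notes.

```python
from statistics import fmean


def arithmetic_mean(values):
    """Sum of all observations divided by the number of observations."""
    if not values:
        raise ValueError("need at least one observation")
    return sum(values) / len(values)


# Hypothetical sample data, chosen only for illustration.
scores = [4, 8, 15, 16, 23, 42]

mean_by_hand = arithmetic_mean(scores)  # (4 + 8 + 15 + 16 + 23 + 42) / 6
mean_stdlib = fmean(scores)             # same calculation via the standard library
```

Both values come out to 18.0 for this sample, since the observations sum to 108 and there are 6 of them.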
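Weighted Mean: The mean calculated by giving different weights to different observations.
Formula: $\bar{Y}_w = \frac{\sum_{i=1}^{n} w_i Y_i}{\sum_{i=1}^{n} w_i}$
Geometric Mean: Used when values combine multiplicatively, such as growth factors or ratios.
Formula: $G = \sqrt[n]{Y_1 \times Y_2 \times \dots \times Y_n}$
- Excel Command:
=GEOMEAN(A1:An)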
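A minimal Python sketch of the weighted and geometric means follows. The helper `weighted_mean` and the sample values (`grades`, `credits`, `growth_factors`) are hypothetical names and data introduced only for this example; `geometric_mean` is the standard-library function (Python 3.8+).

```python
from statistics import geometric_mean


def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * y for y, w in zip(values, weights)) / total_weight


# Hypothetical course grades weighted by credit hours.
grades = [80, 90, 70]
credits = [3, 4, 2]
course_average = weighted_mean(grades, credits)  # (240 + 360 + 140) / 9

# Hypothetical yearly growth factors (+10%, +20%, -10%).
growth_factors = [1.10, 1.20, 0.90]
average_growth = geometric_mean(growth_factors)  # cube root of 1.10 * 1.20 * 0.90
```

With equal weights the weighted mean reduces to the arithmetic mean, which is a quick sanity check on any weighted calculation.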
Harmonic Mean: Useful for averaging rates, such as speeds over equal distances.
Formula: $H = \frac{n}{\sum_{i=1}^{n} \frac{1}{Y_i}}$
- Excel Command:
=HARMEAN(A1:An)
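To see why the harmonic mean suits rates, consider a trip driven at 60 km/h one way and 40 km/h back over the same distance. The sketch below uses made-up speeds and a hypothetical helper `harmonic_mean_by_hand`, alongside the standard-library `harmonic_mean`.

```python
from statistics import harmonic_mean


def harmonic_mean_by_hand(values):
    """Number of observations divided by the sum of their reciprocals."""
    return len(values) / sum(1 / y for y in values)


# Hypothetical speeds (km/h) over two legs of equal distance.
speeds = [60, 40]

average_speed = harmonic_mean_by_hand(speeds)  # 2 / (1/60 + 1/40)
average_speed_stdlib = harmonic_mean(speeds)   # same result via the standard library
```

The result is about 48 km/h, lower than the arithmetic mean of 50 km/h, because more time is spent travelling at the slower speed.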
Median: The middle value when data is arranged in order.
- Excel Command:
=MEDIAN(A1:An)
Mode: The most frequently occurring value in a distribution.
- Excel Command:
=MODE(A1:An)
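The median and mode can be sketched in Python with the standard library as well. The sample lists below are invented for illustration; `multimode` is shown because a data set can have more than one most frequent value.

```python
from statistics import median, mode, multimode

# Hypothetical samples, chosen only for illustration.
odd_sample = [7, 3, 9, 3, 5]       # sorted: 3, 3, 5, 7, 9
even_sample = [7, 3, 9, 3, 5, 11]  # sorted: 3, 3, 5, 7, 9, 11

# Odd count: the single middle value. Even count: average of the two middle values.
median_odd = median(odd_sample)
median_even = median(even_sample)

# Mode: the most frequently occurring value.
most_common = mode(odd_sample)

# multimode returns every value tied for the highest frequency.
tied_modes = multimode([1, 1, 2, 2, 3])
```

Here the medians are 5 and 6.0, the mode is 3, and the tied example yields both 1 and 2.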
Relationship Between Shape of Distribution and Averages
- In a symmetric, unimodal distribution:
  - Mean = Median = Mode
- In a negatively skewed (left-skewed) distribution, typically:
  - Mean < Median < Mode
- In a positively skewed (right-skewed) distribution, typically:
  - Mean > Median > Mode
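The ordering of the three averages can be checked on small data sets. The sketch below uses two invented samples, one with a long right tail and its mirror image with a long left tail; it illustrates the usual pattern rather than proving it for every distribution.

```python
from statistics import fmean, median, mode

# Hypothetical right-skewed sample: most values are small, a few are large.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

# Mirror image of the sample above, giving a long left tail instead.
left_skewed = [20 - y for y in right_skewed]

right_mean, right_median, right_mode = (
    fmean(right_skewed), median(right_skewed), mode(right_skewed)
)
left_mean, left_median, left_mode = (
    fmean(left_skewed), median(left_skewed), mode(left_skewed)
)

# The long tail pulls the mean furthest, the median less, the mode not at all.
print(right_mean, right_median, right_mode)  # mean > median > mode
print(left_mean, left_median, left_mode)     # mean < median < mode
```

For these samples the right-skewed averages are 5.0, 3.0 and 2, and the left-skewed ones are 15.0, 17.0 and 18.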
Measures of Dispersion
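- Range: The difference between the highest and lowest values.
Formula: $R = Y_{\max} - Y_{\min}$
- Interquartile Range (IQR): Difference between the 3rd quartile (Q3) and 1st quartile (Q1).
Formula: $IQR = Q_3 - Q_1$
- Mean Deviation: The average of the absolute deviations from the mean.
Formula: $MD = \frac{1}{n}\sum_{i=1}^{n} \left|Y_i - \bar{Y}\right|$
- Variance: The average squared deviation from the mean. For a sample, the sum of squared deviations is divided by n - 1, which is what Excel's VAR computes.
Formula: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(Y_i - \bar{Y}\right)^2$
- Excel Command:
=VAR(A1:An)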
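The measures of dispersion listed so far can be sketched in Python as follows. The sample `data` is invented for illustration, and the quartiles come from the standard library's default method; other tools may use a different quartile convention and report slightly different values.

```python
from statistics import fmean, pvariance, quantiles, variance

# Hypothetical sample, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest value minus lowest value.
data_range = max(data) - min(data)

# Interquartile range: Q3 - Q1. quantiles(..., n=4) returns [Q1, Q2, Q3].
q1, _, q3 = quantiles(data, n=4)
iqr = q3 - q1

# Mean deviation: average absolute distance from the mean.
mean = fmean(data)
mean_deviation = fmean([abs(y - mean) for y in data])

# Variance: the sample version divides by n - 1, the population version by n.
sample_variance = variance(data)
population_variance = pvariance(data)
```

For this sample the range is 7, the mean deviation is 1.5, the population variance is 4 and the sample variance is 32/7 (about 4.57).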
- Standard Deviation: The square root of variance.
Formula: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} \left(Y_i - \bar{Y}\right)^2}$
- Excel Command:
=STDEV(A1:An)
- Coefficient of Variation (CV): Ratio of the standard deviation to the mean, often reported as a percentage.
Formula: $CV = \frac{s}{\bar{Y}}$
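A short Python sketch of the standard deviation and coefficient of variation is given below, using an invented sample. It relies on the sample (n - 1) versions from the standard library.

```python
from math import sqrt
from statistics import fmean, stdev, variance

# Hypothetical sample, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Standard deviation: square root of the (sample) variance.
sample_sd = stdev(data)
sd_from_variance = sqrt(variance(data))

# Coefficient of variation: standard deviation relative to the mean.
cv = sample_sd / fmean(data)
cv_percent = 100 * cv
```

Because the CV is unit-free, it allows the spread of data sets measured on different scales to be compared; it is only meaningful when the mean is not zero.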
Assessing Shape of Distribution
- Skewness and Kurtosis: Skewness measures the asymmetry of the distribution; kurtosis measures its peakedness and tailedness.
- Skewness range: -∞ to +∞.
  - Positive: Right-skewed
  - Negative: Left-skewed
  - Zero: Symmetric
- Kurtosis range: 1 to ∞.
  - Leptokurtic: Peaked distribution with heavy tails (Kurtosis > 3)
  - Platykurtic: Flat distribution with light tails (Kurtosis < 3)
  - Mesokurtic: Same kurtosis as the normal distribution (Kurtosis = 3)
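One way to assess shape numerically is through moment-based skewness and kurtosis. The sketch below assumes the simple population-moment definitions (third and fourth central moments scaled by powers of the second); the helper names and sample data are hypothetical. Statistical software often applies sample-size corrections or reports excess kurtosis (kurtosis minus 3), so its output may differ from these values.

```python
from statistics import fmean


def central_moment(values, k):
    """Average of (value - mean) raised to the k-th power."""
    mean = fmean(values)
    return fmean([(y - mean) ** k for y in values])


def skewness(values):
    """Third central moment divided by the variance to the power 3/2."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Fourth central moment divided by the squared variance (normal = 3)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


# Hypothetical samples, chosen only for illustration.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

symmetric_skew = skewness(symmetric)     # zero for a symmetric sample
right_skew = skewness(right_skewed)      # positive for a long right tail
symmetric_kurt = kurtosis(symmetric)     # below 3, so platykurtic
right_kurt = kurtosis(right_skewed)      # never falls below 1
```

The evenly spread sample has kurtosis 1.7, well under the mesokurtic value of 3, which matches its flat shape.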