ISL2441 Statistics for Management I - Describing Data
Page 2: Overview of Numeric Descriptive Statistics
Measures of Central Tendency and Location:
Mean (Arithmetic Mean)
Median
Mode
Geometric Mean
Weighted Mean
Measures of Dispersion (Variability):
Quartiles
Percentiles
Range
Interquartile Range
Variance
Standard Deviation
Coefficient of Variation
Measures of Shape:
Skewness
Kurtosis
Page 3: Measures of Central Tendency
Population Mean (µ):
Formula: (\mu = \frac{\sum_{i=1}^N x_i}{N})
(N) = Population size
(x_i) = ith value of variable X
Sample Mean ((\bar{X})):
Formula: (\bar{X} = \frac{\sum_{i=1}^n x_i}{n})
(n) = Sample size
Page 4: Example of Mean Calculation
Example (Lind et al., 2021):
Verizon study on mobile phone usage.
Data: Daily usage in hours from 12 customers
Calculation:
Total Usage = 4.1 + 3.7 + 4.3 + 4.2 + 5.5 + 5.1 + 4.2 + 5.1 + 4.2 + 4.6 + 5.2 + 3.8 = 54.0 hours
Mean = (\frac{54.0}{12} = 4.5) hours
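The calculation above can be reproduced in a few lines of Python. This is a minimal sketch; the names `usage_hours` and `sample_mean` are illustrative and do not come from the slides.

```python
# Daily usage in hours for the 12 customers listed above.
usage_hours = [4.1, 3.7, 4.3, 4.2, 5.5, 5.1, 4.2, 5.1, 4.2, 4.6, 5.2, 3.8]


def sample_mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


mean_usage = sample_mean(usage_hours)
print(round(mean_usage, 2))  # 4.5
```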
Page 5: Understanding the Mean
Characteristics of Mean:
Only usable for interval or ratio data
Sensitive to outliers
Uses every value in the dataset in its calculation
Unique: There is only one mean for a given dataset
Deviations from the mean sum to zero: (\sum (x_i - \bar{X}) = 0)
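Two of the characteristics listed above are easy to check numerically. The sketch below reuses the 12 usage values from the Verizon example; the extra value of 24.0 hours is a hypothetical outlier added only to illustrate sensitivity.

```python
usage_hours = [4.1, 3.7, 4.3, 4.2, 5.5, 5.1, 4.2, 5.1, 4.2, 4.6, 5.2, 3.8]
mean_usage = sum(usage_hours) / len(usage_hours)

# Deviations from the mean sum to zero (up to floating-point rounding).
deviations = [x - mean_usage for x in usage_hours]
total_deviation = sum(deviations)
print(abs(total_deviation) < 1e-9)  # True

# Sensitivity to outliers: one extreme (hypothetical) value pulls the mean upward.
with_outlier = usage_hours + [24.0]
mean_with_outlier = sum(with_outlier) / len(with_outlier)
print(round(mean_with_outlier, 2))  # 6.0
```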
Page 6: Median Calculation
Definition: The median is the middle value in an ordered dataset.
Calculation Rule:
For odd observations: Median = middle value
For even observations: Median = average of the two middle values
Example: Order values: 3.7, 4.1, 4.2, 4.2, 4.3, 5.1, 5.5
Median = 4.2
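The calculation rule can be sketched as a small function. The function name `median_value` and the four-value list used for the even case are illustrative assumptions, not part of the slides.

```python
def median_value(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_example = [3.7, 4.1, 4.2, 4.2, 4.3, 5.1, 5.5]  # seven values from the example
even_example = [1, 2, 3, 4]                         # hypothetical data

print(median_value(odd_example))   # 4.2
print(median_value(even_example))  # 2.5
```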
Page 7: Mode Calculation
Definition: The mode is the most frequently appearing value in a dataset.
Applications: The only measure of central tendency that can be used with nominal data; identifies the most common value or category.
Example:
Server failures: 1, 3, 0, 3, 26, 2, 7, 4, 0, 2, 3, 3, 6, 3
Mode = 3
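As a quick check of the example, the frequencies can be tallied with `collections.Counter` from the Python standard library. The variable names are illustrative.

```python
from collections import Counter

server_failures = [1, 3, 0, 3, 26, 2, 7, 4, 0, 2, 3, 3, 6, 3]

# Count how often each value occurs, then take the most frequent one.
frequencies = Counter(server_failures)
mode_value, mode_count = frequencies.most_common(1)[0]
print(mode_value, mode_count)  # 3 5
```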
Page 8: Additional Mode Examples
Modes can be absent (no value repeats), or multiple (bimodal or multimodal).
Example data of system failures showed frequencies of different failure rates.
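The cases of no mode and multiple modes can be handled by returning a list instead of a single value. This is a sketch under one common convention (no mode when no value repeats); the function name `all_modes` and the sample lists are hypothetical.

```python
from collections import Counter


def all_modes(values):
    """Return every value tied for the highest frequency, or [] if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # no value repeats, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)


print(all_modes([1, 2, 3, 4]))        # []
print(all_modes([1, 1, 2, 2, 3]))     # [1, 2]  (bimodal)
print(all_modes([0, 3, 3, 3, 2, 2]))  # [3]
```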
Page 9: Frequency Distribution
Example: Delay complaints and their frequency.
Mean calculated as weighted average based on frequencies.
Mean: (\bar{X} = \frac{\sum (f_i \cdot x_i)}{n}), where (f_i) is the frequency of value (x_i) and (n = \sum f_i)
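A minimal sketch of the frequency-weighted mean follows. The complaint counts and frequencies are hypothetical values chosen for illustration; the slide's actual table is not reproduced here.

```python
# Hypothetical frequency table: number of delay complaints per day (x_i)
# and the number of days on which that count occurred (f_i).
complaints = [0, 1, 2, 3, 4]
frequency = [5, 10, 8, 5, 2]

n = sum(frequency)  # total number of observations
weighted_sum = sum(f * x for f, x in zip(frequency, complaints))
mean_complaints = weighted_sum / n

print(n, weighted_sum)            # 30 49
print(round(mean_complaints, 2))  # 1.63
```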
Page 10: Numerical Measures in Frequency Distribution
Formulas:
Mean = (\bar{X} = \frac{\sum f_i \times m_i}{\sum f_i})
Median = Calculation involves interval values and cumulative frequencies
Mode calculation based on frequency intervals.
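For grouped data, the mean formula can be sketched as below, assuming (m_i) denotes the midpoint of class i, which is the usual convention. The class limits and frequencies are hypothetical.

```python
# Hypothetical grouped data: ((lower limit, upper limit), frequency).
classes = [((0, 10), 4), ((10, 20), 10), ((20, 30), 5), ((30, 40), 1)]

# Each class is represented by its midpoint m_i.
midpoints = [(lower + upper) / 2 for (lower, upper), _ in classes]
frequencies = [f for _, f in classes]

grouped_mean = sum(f * m for f, m in zip(frequencies, midpoints)) / sum(frequencies)
print(midpoints)     # [5.0, 15.0, 25.0, 35.0]
print(grouped_mean)  # 16.5
```

Because every observation in a class is replaced by the class midpoint, the result is an approximation of the mean of the raw data.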
Page 11: Class Limits
Example of Data Classes with Frequency Distribution:
...
Page 12: Geometric Mean
Definition: The nth root of the product of n positive values; used to average growth rates, percentage changes, and ratios.
Formula: (\bar{X}_G = \sqrt[n]{x_1 \cdot x_2 \cdots x_n})
Example Calculation: Growth rates leading to the geometric mean calculation.
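A short sketch of the geometric mean applied to growth rates. The two growth rates (5% and 15%) are hypothetical; each rate is first converted to a growth factor (1 + rate) so that all inputs are positive.

```python
import math

# Hypothetical annual growth rates: +5% and +15%.
growth_rates = [0.05, 0.15]
growth_factors = [1 + r for r in growth_rates]

# Geometric mean: nth root of the product of the n growth factors.
geometric_mean = math.prod(growth_factors) ** (1 / len(growth_factors))
average_growth_rate = geometric_mean - 1

print(round(average_growth_rate, 4))  # 0.0989
```

The arithmetic mean of the two rates would be 10%, which slightly overstates the compound growth actually achieved.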
Page 13: Weighted Mean
General Formula: (\bar{X} = \frac{\sum w_i x_i}{\sum w_i})
Example: Weighted grades from different assessments.
Conclusion: The weighted mean reflects the outcome based on the importance (weight) of each assessment.
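The general formula can be sketched as follows. The assessment scores and weights are hypothetical and only illustrate how the weights shift the result.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


# Hypothetical grades: midterm, project, final exam.
scores = [80, 70, 90]
weights = [0.3, 0.3, 0.4]

final_grade = weighted_mean(scores, weights)
print(round(final_grade, 2))  # 81.0
```

With equal weights the same scores would average 80; the heavier weight on the final exam raises the result to 81.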
Page 14: Example of Weighted Mean Calculation
Scenario: Total return from different mutual funds.
Calculate average total return using weighted mean approach.
Page 15: Calculating Mean Cost
Example: Raw material purchases over time to calculate average cost per pound using weighted mean.
Page 16: Understanding Quartiles
Definition: Quartiles are measures that divide a dataset into quarters.
Q1: Lower quartile
Q2: Median
Q3: Upper quartile
Page 17: Quartile Positions
Calculation of quartile positions based on ordered data values.
Formulas for Q1, Q2, Q3.
Example: Interpreting values for Q1 and Q3.
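The slide's position formulas are not reproduced in these notes, so the sketch below relies on an assumption: it uses `statistics.quantiles` with `method="exclusive"`, which places quartiles at positions based on (n + 1) in the ordered data. Other conventions exist and can give slightly different values.

```python
import statistics

# The seven ordered values used in the median example.
values = [3.7, 4.1, 4.2, 4.2, 4.3, 5.1, 5.5]

# n=4 splits the data into quarters and returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(values, n=4, method="exclusive")
print(q1, q2, q3)  # 4.1 4.2 5.1
```

About 25% of the observations lie at or below Q1 and about 75% lie at or below Q3.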
Page 18: Example of Quartile Calculation
Daily electricity consumption example to calculate Q1, Q2, Q3 for the data set.
Page 19: Understanding Percentiles
Definition: Percentiles divide data into 100 equal parts.
Procedure for Calculation:
Find percentile position for desired percentile rank.
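The procedure can be sketched under an assumed position rule, (L_p = (n + 1) \cdot P / 100), with linear interpolation between neighbouring ordered values. This rule is common in introductory textbooks but is an assumption here; the function name `percentile` is illustrative.

```python
def percentile(values, p):
    """Value at percentile rank p (0 < p < 100) using position (n + 1) * p / 100."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) * p / 100  # 1-based position in the ordered data
    lower = int(position)         # whole part of the position
    fraction = position - lower   # how far to move toward the next value
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


usage_hours = [4.1, 3.7, 4.3, 4.2, 5.5, 5.1, 4.2, 5.1, 4.2, 4.6, 5.2, 3.8]
p25 = percentile(usage_hours, 25)  # position 3.25: between the 3rd and 4th values
p50 = percentile(usage_hours, 50)  # position 6.5: halfway between the 6th and 7th values
print(round(p25, 3), round(p50, 3))  # 4.125 4.25
```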
Page 20: Five-number Summary
A compact description of a dataset using five values: Min, Q1, Median (Q2), Q3, Max.
Page 21: Dispersion Measures
Definitions and importance of measures in analyzing data dispersion:
Range
Interquartile Range (IQR)
Variance
Standard Deviation
Coefficient of Variation
Page 22: Example of Dispersion Analysis
Case study of supplier reliability compared using dispersion analysis.
Page 23: Dispersion Visualization
Figures representing varying spreads with identical means.
Page 24: Understanding Range and IQR
Formula for Range: (R = X_{max} - X_{min})
Purpose of IQR: Measure spread of the central 50% of data.
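Both measures can be computed for the 12 usage values from the Verizon example. The quartile convention (`method="exclusive"` in `statistics.quantiles`) is an assumption, since the notes do not fix one.

```python
import statistics

usage_hours = [4.1, 3.7, 4.3, 4.2, 5.5, 5.1, 4.2, 5.1, 4.2, 4.6, 5.2, 3.8]

# Range: distance between the largest and smallest value.
data_range = max(usage_hours) - min(usage_hours)

# IQR: distance between Q3 and Q1, i.e. the spread of the central 50% of the data.
q1, _, q3 = statistics.quantiles(usage_hours, n=4, method="exclusive")
iqr = q3 - q1

print(round(data_range, 3), round(iqr, 3))  # 1.8 0.975
```

Unlike the range, the IQR ignores the extreme values, so a single outlier does not change it much.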
Page 25: Variance
Variance definition and its calculation methods.
Population Variance Formula: (\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N})
Sample Variance Formula: (s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1})
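The two formulas differ only in the divisor, which the sketch below makes explicit. The data values are hypothetical and chosen so the arithmetic is easy to follow by hand.

```python
def population_variance(values):
    """Mean squared deviation from the mean, dividing by N."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the sample mean, dividing by n - 1."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical; mean 5, squared deviations sum to 32

print(population_variance(data))        # 4.0
print(round(sample_variance(data), 3))  # 4.571
```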
Page 26: Standard Deviation
Definition: The square root of the variance; it measures the typical distance of values from the mean, in the same units as the data.
Population and sample standard deviation formulas presented.
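Python's standard `statistics` module provides both versions directly: `pstdev` divides by N and `stdev` divides by n - 1. The data below are hypothetical.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values with mean 5

population_sd = statistics.pstdev(data)  # square root of the population variance
sample_sd = statistics.stdev(data)       # square root of the sample variance

print(round(population_sd, 3))  # 2.0
print(round(sample_sd, 3))      # 2.138
```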
Page 27: Variance and Standard Deviation Calculation Example
Perform variance and standard deviation calculation with data values.
Page 28: Continued Calculation of Variance and Standard Deviation
Detailed example demonstrating the process.
Page 29: Coefficient of Variation (CV)
Definition and importance of CV in expressing variability.
Formulas for Population and Sample CV:
Population: (CV = \frac{\sigma}{\mu} \times 100\%)
Sample: (CV = \frac{s}{\bar{X}} \times 100\%)
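The sample formula can be sketched as below. The two price series are hypothetical; they have the same standard deviation but very different means, which is the situation where the CV is more informative than the standard deviation alone.

```python
import statistics


def coefficient_of_variation(values):
    """Sample CV in percent: standard deviation relative to the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Hypothetical stock prices for two companies.
stock_a = [10, 12, 11, 13, 14]
stock_b = [100, 102, 101, 103, 104]

cv_a = coefficient_of_variation(stock_a)
cv_b = coefficient_of_variation(stock_b)
print(round(cv_a, 2), round(cv_b, 2))  # 13.18 1.55
```

Relative to its price level, stock A varies far more than stock B even though the absolute spread is identical.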
Page 30: CV Example Analysis
Example of comparing stock price variability between two companies using CV.
Page 31: Performance Comparison
Case study comparing assembly line performances using mean and standard deviation.
Page 32: Supplier Reliability Exercise
Analysis of delivery times for two suppliers to ascertain reliability.
Page 33: Exploring Data Shape
Importance of understanding data shape in descriptive statistics.
Measures of shape: skewness (asymmetry) and kurtosis (peakedness) of a distribution.
Page 34: Skewness
Definition and importance in understanding distribution symmetry.
Page 35: Kurtosis
Describes the 'peakedness' or flatness of a distribution relative to a normal distribution.
Page 36: Skewness and Kurtosis in Normal Distribution
In a normally distributed dataset, skewness equals 0 and excess kurtosis equals 0 (equivalently, kurtosis equals 3), indicating symmetry and the standard bell shape.
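The notes do not give formulas for skewness or kurtosis, so the sketch below assumes the moment-based population definitions: skewness as (m_3 / m_2^{3/2}) and excess kurtosis as (m_4 / m_2^2 - 3), where (m_k) is the kth central moment. Statistical software often applies sample corrections and may report slightly different values. The data lists are hypothetical.

```python
def central_moment(values, k):
    """Average of (x - mean)^k over all values."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)


def skewness(values):
    """Moment-based skewness: 0 for symmetric data, positive for a long right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

print(skewness(symmetric) == 0)               # True
print(skewness(right_skewed) > 0)             # True
print(round(excess_kurtosis(symmetric), 2))   # -1.3
```

The negative excess kurtosis of the symmetric list reflects a flatter shape than the normal curve.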
Page 37: Properties of Normal Distribution
Discusses characteristics of the normal distribution: bell shape, symmetric, and asymptotic behavior.
Page 38: Measures of Central Tendency in Normal Distribution
Relations between mean, median, and mode in symmetric distributions.
Page 39: Recap of Numeric Descriptive Statistics
Summary of key statistical measures and their categories.
Page 40: References
Textbook References:
Various sources cited, all supporting foundational statistical concepts studied in the course.