ISL2441 Statistics for Management I - Describing Data

Page 2: Overview of Numeric Descriptive Statistics

Measures of Central Tendency and Location:
- Mean (Arithmetic Mean)
- Median
- Mode
- Geometric Mean
- Weighted Mean
- Quartiles
- Percentiles
Measures of Dispersion (Variability):
- Range
- Interquartile Range
- Variance
- Standard Deviation
- Coefficient of Variation
Measures of Shape:
- Skewness
- Kurtosis

Page 3: Measures of Central Tendency

Population Mean (µ):
- Formula: (µ = \frac{\sum_{i=1}^N x_i}{N})
- (N) = Population size
- (x_i) = i-th value of variable X
Sample Mean ((\bar{X})):
- Formula: (\bar{X} = \frac{\sum_{i=1}^n x_i}{n})
- (n) = Sample size

Page 4: Example of Mean Calculation

Example (Lind et al., 2021):
- Verizon study on mobile phone usage.
- Data: Daily usage in hours from 12 customers
- Calculation:
  - Total Usage = 4.1 + 3.7 + 4.3 + 4.2 + 5.5 + 5.1 + 4.2 + 5.1 + 4.2 + 4.6 + 5.2 + 3.8 = 54.0 hours
  - Mean = (\frac{54.0}{12} = 4.5) hours
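
The calculation above can be reproduced with a short Python sketch. The variable names are illustrative; the data are the 12 values listed in the example.

```python
# Daily usage in hours for the 12 customers in the example above.
usage_hours = [4.1, 3.7, 4.3, 4.2, 5.5, 5.1, 4.2, 5.1, 4.2, 4.6, 5.2, 3.8]

total_usage = sum(usage_hours)                 # sum of all x_i
sample_mean = total_usage / len(usage_hours)   # divide by n = 12

print(f"Total usage: {total_usage:.1f} hours")   # Total usage: 54.0 hours
print(f"Sample mean: {sample_mean:.1f} hours")   # Sample mean: 4.5 hours
```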

Page 5: Understanding the Mean

Characteristics of Mean:
- Only usable for interval or ratio data
- Sensitive to outliers
- Uses every value in the dataset in its calculation
- Unique: There is only one mean for a given dataset
- Deviations from the mean sum to zero: (\sum (x_i - \bar{X}) = 0)
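
Two of these properties can be checked numerically. The sketch below reuses the 12 usage values from the mean example; the extra 20-hour observation is a hypothetical outlier added only for illustration.

```python
import math

usage_hours = [4.1, 3.7, 4.3, 4.2, 5.5, 5.1, 4.2, 5.1, 4.2, 4.6, 5.2, 3.8]
mean = sum(usage_hours) / len(usage_hours)

# Deviations from the mean sum to zero (up to floating-point rounding).
deviations = [x - mean for x in usage_hours]
deviation_sum = math.fsum(deviations)
print(math.isclose(deviation_sum, 0.0, abs_tol=1e-9))  # True

# Sensitivity to outliers: one hypothetical 20-hour customer pulls the mean up.
with_outlier = usage_hours + [20.0]
mean_with_outlier = sum(with_outlier) / len(with_outlier)
print(f"{mean:.2f} -> {mean_with_outlier:.2f}")
```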

Page 6: Median Calculation

Definition: The median is the middle value in an ordered dataset.
Calculation Rule:
- For an odd number of observations: Median = middle value
- For an even number of observations: Median = average of the two middle values
- Example: The ordered values 3.7, 4.1, 4.2, 4.2, 4.3, 5.1, 5.5 have n = 7 (odd), so the median is the 4th value: 4.2
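
A minimal sketch of the odd/even rule follows. The function name is illustrative; the even case reuses the 12 usage values from the mean example.

```python
def median(values):
    """Return the middle value of the sorted data (odd n),
    or the average of the two middle values (even n)."""
    if not values:
        raise ValueError("median of an empty dataset is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_values = [3.7, 4.1, 4.2, 4.2, 4.3, 5.1, 5.5]
print(median(odd_values))   # 4.2 (the 4th of 7 ordered values)

usage_hours = [4.1, 3.7, 4.3, 4.2, 5.5, 5.1, 4.2, 5.1, 4.2, 4.6, 5.2, 3.8]
even_median = median(usage_hours)   # average of the 6th and 7th ordered values
```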

Page 7: Mode Calculation

Definition: The mode is the most frequently appearing value in a dataset.
Applications: The only measure of central tendency that can be used with nominal data; it identifies the most common value or category.
Example:
- Server failures: 1, 3, 0, 3, 26, 2, 7, 4, 0, 2, 3, 3, 6, 3
- Mode = 3
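
The mode of the server-failure data can be found by counting occurrences, sketched here with the standard-library `collections.Counter`.

```python
from collections import Counter

failures = [1, 3, 0, 3, 26, 2, 7, 4, 0, 2, 3, 3, 6, 3]

counts = Counter(failures)                         # value -> frequency
mode_value, mode_count = counts.most_common(1)[0]  # most frequent value

print(f"Mode = {mode_value} (appears {mode_count} times)")  # Mode = 3 (appears 5 times)
```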

Page 8: Additional Mode Examples

A dataset can have no mode (no value repeats) or several modes (bimodal or multimodal).
Example data of system failures showed frequencies of different failure rates.
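
The three cases (no mode, one mode, several modes) can be distinguished as sketched below. The helper `find_modes` and all three datasets are hypothetical, chosen only to show each case.

```python
from statistics import multimode

def find_modes(values):
    """Return all modes; an empty list means no value repeats."""
    if len(set(values)) == len(values):
        return []                 # every value is unique: no mode
    return multimode(values)      # all values tied for the highest frequency

no_mode = find_modes([1, 2, 3, 4])          # []
one_mode = find_modes([0, 1, 1, 2])         # [1]
two_modes = find_modes([1, 2, 2, 3, 3])     # [2, 3] (bimodal)
print(no_mode, one_mode, two_modes)
```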

Page 9: Frequency Distribution

Example: Delay complaints and their frequency.
Mean calculated as weighted average based on frequencies.
Mean: (\bar{X} = \frac{\sum (f_i \cdot x_i)}{n}), where (f_i) is the frequency of value (x_i) and (n = \sum f_i)

Page 10: Numerical Measures in Frequency Distribution

Formulas:
- Mean: (\bar{X} = \frac{\sum f_i \times m_i}{\sum f_i}), where (m_i) is the midpoint of class i and (f_i) is its frequency
- Median: located using the class intervals and cumulative frequencies
- Mode: taken from the class interval with the highest frequency
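
A sketch of these grouped-data measures follows. The class intervals and frequencies are hypothetical. The median uses the common interpolation rule (lower limit + (n/2 - cumulative frequency before the class) / class frequency × class width), which is an assumption here because the notes do not state the exact formula.

```python
# Hypothetical frequency distribution: ((lower limit, upper limit), frequency)
classes = [((0, 10), 5), ((10, 20), 8), ((20, 30), 12), ((30, 40), 5)]

n = sum(f for _, f in classes)

# Grouped mean: each class is represented by its midpoint m_i.
grouped_mean = sum(f * (lo + hi) / 2 for (lo, hi), f in classes) / n

def grouped_median(classes):
    """Interpolate inside the class that contains the n/2-th observation."""
    n = sum(f for _, f in classes)
    if n == 0:
        raise ValueError("no observations")
    cumulative = 0
    for (lo, hi), f in classes:
        if f > 0 and cumulative + f >= n / 2:
            return lo + (n / 2 - cumulative) / f * (hi - lo)
        cumulative += f

# Modal class: the interval with the highest frequency.
modal_class = max(classes, key=lambda c: c[1])[0]

print(round(grouped_mean, 2), round(grouped_median(classes), 2), modal_class)
```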

Page 11: Class Limits

Example of Data Classes with Frequency Distribution:
- ...

Page 12: Geometric Mean

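Definition: The n-th root of the product of n positive values. It is used to compute average growth rates, with each (x_i) expressed as a growth factor (1 + growth rate).
Formula: (\bar{X}_G = \sqrt[n]{x_1 \cdot x_2 \cdots x_n})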
Example Calculation: Growth rates leading to the geometric mean calculation.
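
As an illustration, the sketch below averages three hypothetical yearly growth rates (+10%, -5%, +20%). Each rate is first converted to a growth factor, since the formula needs positive values.

```python
import math

growth_rates = [0.10, -0.05, 0.20]              # hypothetical yearly rates
growth_factors = [1 + r for r in growth_rates]  # 1.10, 0.95, 1.20

n = len(growth_factors)
geometric_mean = math.prod(growth_factors) ** (1 / n)   # n-th root of the product
average_growth_rate = geometric_mean - 1

arithmetic_rate = sum(growth_rates) / n   # for comparison only

print(f"Geometric average growth:  {average_growth_rate:.2%}")
print(f"Arithmetic average growth: {arithmetic_rate:.2%}")
```

Compounding the geometric average for n periods reproduces the actual overall growth, whereas the arithmetic average overstates it whenever the rates differ.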

Page 13: Weighted Mean

General Formula: (\bar{X} = \frac{\sum w_i x_i}{\sum w_i})
Example: Weighted grades from different assessments.
Conclusion: The weighted mean reflects the outcome based on the importance (weight) of each assessment.
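
A minimal sketch of the general formula, using hypothetical assessment scores and weights (quiz 20%, midterm 30%, final 50%):

```python
def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

scores = [80, 70, 90]        # hypothetical: quiz, midterm, final
weights = [0.2, 0.3, 0.5]    # hypothetical importance of each assessment

final_grade = weighted_mean(scores, weights)
simple_average = sum(scores) / len(scores)

print(f"Weighted: {final_grade:.1f}, unweighted: {simple_average:.1f}")  # Weighted: 82.0, unweighted: 80.0
```

The same function applies when the weights are quantities rather than percentages, for example pounds purchased when averaging a cost per pound.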

Page 14: Example of Weighted Mean Calculation

Scenario: Total return from different mutual funds.
Calculate average total return using weighted mean approach.

Page 15: Calculating Mean Cost

Example: Raw material purchases over time to calculate average cost per pound using weighted mean.

Page 16: Understanding Quartiles

Definition: Quartiles are measures that divide a dataset into quarters.
- Q1: Lower quartile
- Q2: Median
- Q3: Upper quartile

Page 17: Quartile Positions

Calculation of quartile positions based on ordered data values.
- Formulas for Q1, Q2, Q3.
Example: Interpreting values for Q1 and Q3.
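
The notes do not spell out the position formulas, so the sketch below assumes the common textbook rule L_p = (n + 1) × P / 100, with linear interpolation between neighboring ordered values. Python's `statistics.quantiles` with `method="exclusive"` follows the same rule. The data are the 12 usage values from the mean example.

```python
from statistics import quantiles

usage_hours = [4.1, 3.7, 4.3, 4.2, 5.5, 5.1, 4.2, 5.1, 4.2, 4.6, 5.2, 3.8]
n = len(usage_hours)

# Assumed position rule (not stated in the notes): L_p = (n + 1) * P / 100
positions = {name: (n + 1) * p / 100
             for name, p in [("Q1", 25), ("Q2", 50), ("Q3", 75)]}
print(positions)   # {'Q1': 3.25, 'Q2': 6.5, 'Q3': 9.75}

# quantiles() sorts the data and interpolates at those positions.
q1, q2, q3 = quantiles(usage_hours, n=4, method="exclusive")
print(round(q1, 3), round(q2, 3), round(q3, 3))
```

Read this way, about 25% of customers use their phone less than Q1 hours per day and about 25% use it more than Q3 hours.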

Page 18: Example of Quartile Calculation

Daily electricity consumption example to calculate Q1, Q2, Q3 for the data set.

Page 19: Understanding Percentiles

Definition: Percentiles divide data into 100 equal parts.
Procedure for Calculation:
- Find percentile position for desired percentile rank.
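
The procedure can be sketched as a small function. The position rule L_p = (n + 1) × P / 100 and the linear interpolation step are assumptions, since the notes only say to find the percentile position. The function name is illustrative.

```python
def percentile(values, p):
    """Value at the p-th percentile, using position (n + 1) * p / 100."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) * p / 100    # 1-based position in the ordered data
    lower = int(position)           # whole part of the position
    fraction = position - lower     # fractional part, used to interpolate
    if lower < 1:
        return ordered[0]           # position falls before the first value
    if lower >= n:
        return ordered[-1]          # position falls at or after the last value
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

usage_hours = [4.1, 3.7, 4.3, 4.2, 5.5, 5.1, 4.2, 5.1, 4.2, 4.6, 5.2, 3.8]
p80 = percentile(usage_hours, 80)   # position 10.4: between the 10th and 11th values
print(round(p80, 2))
```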

Page 20: Five-number Summary

A fundamental way of describing a dataset with five values: Min, Q1, Median (Q2), Q3, and Max.
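
A sketch of the summary for the 12 usage values from the mean example. The function name is illustrative, and the quartiles assume the (n + 1) position rule that `statistics.quantiles` applies with `method="exclusive"`.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return Min, Q1, Median, Q3, Max as a dictionary."""
    ordered = sorted(values)
    q1, q2, q3 = quantiles(ordered, n=4, method="exclusive")
    return {"min": ordered[0], "q1": q1, "median": q2, "q3": q3, "max": ordered[-1]}

usage_hours = [4.1, 3.7, 4.3, 4.2, 5.5, 5.1, 4.2, 5.1, 4.2, 4.6, 5.2, 3.8]
summary = five_number_summary(usage_hours)

for name, value in summary.items():
    print(f"{name:>6}: {value:.3f}")
```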

Page 21: Dispersion Measures

Definitions and importance of measures in analyzing data dispersion:
- Range
- Interquartile Range (IQR)
- Variance
- Standard Deviation
- Coefficient of Variation

Page 22: Example of Dispersion Analysis

Case study of supplier reliability compared using dispersion analysis.

Page 23: Dispersion Visualization

Figures representing varying spreads with identical means.

Page 24: Understanding Range and IQR

Formula for Range: (R = X_{max} - X_{min})
Formula for IQR: (IQR = Q_3 - Q_1), which measures the spread of the central 50% of the data.
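
The sketch below computes both measures for the server-failure data from the mode example, with and without the extreme value 26, to show how differently the two react to an outlier. The quartiles assume the (n + 1) position rule (`method="exclusive"`); the helper name is illustrative.

```python
from statistics import quantiles

def range_and_iqr(values):
    """Return (range, interquartile range) for a dataset."""
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    return max(values) - min(values), q3 - q1

failures = [1, 3, 0, 3, 26, 2, 7, 4, 0, 2, 3, 3, 6, 3]
without_outlier = [x for x in failures if x != 26]

full_range, full_iqr = range_and_iqr(failures)
trimmed_range, trimmed_iqr = range_and_iqr(without_outlier)

print(f"With 26:    range = {full_range}, IQR = {full_iqr:.2f}")
print(f"Without 26: range = {trimmed_range}, IQR = {trimmed_iqr:.2f}")
```

The range depends only on the two extreme values, so a single outlier changes it sharply, while the IQR moves much less.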

Page 25: Variance

Definition: The variance is the average squared deviation from the mean.
Population Variance Formula: (σ^2 = \frac{\sum_{i=1}^{N} (x_i - µ)^2}{N})
Sample Variance Formula: (s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1})
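
Both formulas can be applied step by step, sketched here with the 12 usage values from the mean example. The only difference between them is the divisor (N versus n - 1). The results are cross-checked against the standard-library `statistics` module.

```python
from statistics import pvariance, variance

usage_hours = [4.1, 3.7, 4.3, 4.2, 5.5, 5.1, 4.2, 5.1, 4.2, 4.6, 5.2, 3.8]
n = len(usage_hours)
mean = sum(usage_hours) / n

# Sum of squared deviations from the mean (the shared numerator).
sum_squared_dev = sum((x - mean) ** 2 for x in usage_hours)

population_variance = sum_squared_dev / n      # treat the data as a full population
sample_variance = sum_squared_dev / (n - 1)    # treat the data as a sample

print(f"Sum of squared deviations: {sum_squared_dev:.2f}")   # 3.82
print(f"Population variance: {population_variance:.4f}")
print(f"Sample variance:     {sample_variance:.4f}")
```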

Page 26: Standard Deviation

Definition: The square root of the variance. It measures the typical deviation from the mean, expressed in the same units as the data.
Population formula: (σ = \sqrt{σ^2}). Sample formula: (s = \sqrt{s^2}).
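
As a brief sketch, the standard-library functions `pstdev` and `stdev` return the population and sample standard deviations directly. The data are the 12 usage values from the mean example, so the results are in hours.

```python
import math
from statistics import pstdev, stdev, variance

usage_hours = [4.1, 3.7, 4.3, 4.2, 5.5, 5.1, 4.2, 5.1, 4.2, 4.6, 5.2, 3.8]

population_sd = pstdev(usage_hours)   # divides by N before taking the root
sample_sd = stdev(usage_hours)        # divides by n - 1 before taking the root

# The sample standard deviation is the square root of the sample variance.
print(math.isclose(sample_sd, math.sqrt(variance(usage_hours))))  # True
print(f"sigma = {population_sd:.3f} hours, s = {sample_sd:.3f} hours")
```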

Page 27: Variance and Standard Deviation Calculation Example

Perform variance and standard deviation calculation with data values.

Page 28: Continued Calculation of Variance and Standard Deviation

Detailed example demonstrating the process.

Page 29: Coefficient of Variation (CV)

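Definition: The standard deviation expressed as a percentage of the mean. Because it is unit-free, it allows variability to be compared across datasets with different means or units.
Formulas for Population and Sample CV:
- Population: (CV = \frac{σ}{µ} \times 100)
- Sample: (CV = \frac{s}{\bar{X}} \times 100)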
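
The sample formula can be sketched as follows. The two price series are hypothetical and were chosen so that the standard deviations differ widely while the relative variability is similar.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample CV in percent: s / x-bar * 100."""
    return stdev(values) / mean(values) * 100

company_a = [50, 52, 48, 51, 49]        # hypothetical prices, low-priced stock
company_b = [200, 210, 190, 205, 195]   # hypothetical prices, high-priced stock

cv_a = coefficient_of_variation(company_a)
cv_b = coefficient_of_variation(company_b)

print(f"A: s = {stdev(company_a):.2f}, CV = {cv_a:.2f}%")
print(f"B: s = {stdev(company_b):.2f}, CV = {cv_b:.2f}%")
```

Company B's standard deviation is about five times larger in absolute terms, yet relative to its mean price its variability is only moderately higher than Company A's.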

Page 30: CV Example Analysis

Example of comparing stock price variability between two companies using CV.

Page 31: Performance Comparison

Case study comparing assembly line performances using mean and standard deviation.

Page 32: Supplier Reliability Exercise

Analysis of delivery times for two suppliers to ascertain reliability.

Page 33: Exploring Data Shape

Importance of understanding data shape in descriptive statistics.
Measures of shape: skewness (asymmetry) and kurtosis (peakedness) of a distribution.

Page 34: Skewness

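Definition: Skewness measures the asymmetry of a distribution. Positive skewness indicates a longer right tail, negative skewness a longer left tail, and zero skewness a symmetric distribution.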

Page 35: Kurtosis

Definition: Kurtosis measures the 'peakedness' or flatness of a distribution relative to the normal distribution.

Page 36: Skewness and Kurtosis in Normal Distribution

In a normally distributed dataset, skewness and excess kurtosis both equal 0 (the raw kurtosis equals 3), indicating symmetry and the standard bell shape.
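
The notes give no formulas for these measures, so the sketch below assumes the simple moment-based (population) definitions: skewness = m3 / m2^1.5 and excess kurtosis = m4 / m2^2 - 3, where m_k is the k-th central moment. Textbooks and software often apply sample-size adjustments, so their values may differ slightly. The function names are illustrative.

```python
def central_moment(values, k):
    """k-th central moment: the average of (x - mean)^k."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n

def skewness(values):
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    # Subtracting 3 makes the normal distribution the zero reference point.
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]                                 # hypothetical symmetric data
failures = [1, 3, 0, 3, 26, 2, 7, 4, 0, 2, 3, 3, 6, 3]      # server-failure data from the mode example

print(skewness(symmetric))             # 0.0 (symmetric)
print(round(skewness(failures), 2))    # positive: long right tail caused by the value 26
print(round(excess_kurtosis(symmetric), 2))
```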

Page 37: Properties of Normal Distribution

Characteristics of the normal distribution: bell-shaped, symmetric about the mean, and asymptotic (the tails approach the horizontal axis but never touch it).

Page 38: Measures of Central Tendency in Normal Distribution

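Relations between mean, median, and mode in symmetric distributions: in a symmetric, unimodal distribution such as the normal distribution, the mean, median, and mode are equal.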

Page 39: Recap of Numeric Descriptive Statistics

Summary of key statistical measures and their categories.

Page 40: References

Textbook References:
- Various sources cited, all supporting foundational statistical concepts studied in the course.