Descriptive vs. Inferential Statistics

Introduction

  • The science of statistics is divided into two main categories:
    • Descriptive statistics
    • Inferential statistics

Descriptive Statistics

  • Descriptive methods are used to describe and summarize data.
  • Emphasis is on analyzing observed measurements, typically from a sample.
  • Key questions addressed:
    • What is the typical value for the measurements?
    • How much variation exists within the measurements?
    • What is the shape or distribution of the measurements?
    • Are there any extreme values, and what do they indicate?
    • If two variables are present, what kind of relationship exists and how strong is it?
  • Importance:
    • With large datasets, inspecting all data values directly is impractical for gaining useful knowledge.
    • Data must be summarized to be comprehended.
  • Definition: Descriptive statistics involves the collection, organization, analysis, and presentation of data.
  • Examples:
    • Frequency distribution.
    • Measures of central tendency:
      • Mean.
      • Median.
      • Mode.
    • Measures of dispersion (variation):
      • Range.
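      • Variance. Denoted \sigma^2 for a population (s^2 for a sample). It measures the spread of data points around the mean, as an average of the squared deviations from the mean.
      • Standard deviation. Denoted \sigma for a population (s for a sample). It is the square root of the variance and indicates the typical distance of data points from the mean, in the same units as the data.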
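
The measures listed above can be computed with Python's standard library. The following is a minimal sketch, not a prescribed method: the list named measurements holds made-up values chosen only for illustration, and all variable names are assumptions.

```python
import statistics
from collections import Counter

# Hypothetical sample of observed measurements (made-up values).
measurements = [4, 8, 6, 5, 3, 8, 9, 5, 8, 4]

# Frequency distribution: how often each value occurs.
frequency = Counter(measurements)

# Measures of central tendency.
mean = statistics.mean(measurements)
median = statistics.median(measurements)
mode = statistics.mode(measurements)

# Measures of dispersion.
data_range = max(measurements) - min(measurements)
population_variance = statistics.pvariance(measurements)  # divides by n
population_sd = statistics.pstdev(measurements)           # square root of the above
sample_variance = statistics.variance(measurements)       # divides by n - 1
sample_sd = statistics.stdev(measurements)

print("frequency:", sorted(frequency.items()))
print("mean:", mean, "median:", median, "mode:", mode)
print("range:", data_range)
print("population variance:", population_variance, "sd:", population_sd)
print("sample variance:", round(sample_variance, 3), "sd:", round(sample_sd, 3))
```

The population versions (pvariance, pstdev) treat the data as the entire population, while the sample versions (variance, stdev) divide by n - 1, which is the usual choice when the data are a sample.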

Inferential Statistics

  • It would be preferable to have measurements from the entire population, but these are often unobtainable or too costly to collect.
  • Example: Safety data collection via crash testing does not involve testing every car produced.
  • Definition: Inferential statistics aims to make reasonable estimates about population characteristics using sample data.
  • Process:
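    • A large population has unknown characteristics, called parameters, that we want to learn about.
    • Because obtaining data from the entire population is infeasible or too costly, a sample (a smaller subset) is examined instead.
    • Values computed from the sample, called statistics, are used to estimate the population parameters.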
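
The process above can be sketched in Python. This is an illustration under stated assumptions, not a definitive procedure: the population is simulated (in practice it would not be fully observable), the sample size of 30 and all variable names are arbitrary choices, and the interval uses the normal critical value 1.96 as an approximation (a t critical value would give a slightly wider interval for a sample this small).

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population of 100,000 simulated measurements.
# Its mean is the parameter, which would normally be unknown.
population = [random.gauss(50, 10) for _ in range(100_000)]
mu = statistics.fmean(population)

# Draw a simple random sample without replacement.
sample = random.sample(population, 30)

# Sample statistics used as estimates of the population parameters.
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)

# Approximate 95% confidence interval for the population mean.
margin = 1.96 * s / len(sample) ** 0.5
interval = (x_bar - margin, x_bar + margin)

print("population mean (parameter):", round(mu, 2))
print("sample mean (statistic):", round(x_bar, 2))
print("approximate 95% interval:", tuple(round(v, 2) for v in interval))
```

The sample mean will generally differ from the population mean; the interval expresses how much uncertainty the estimate carries given only 30 observations.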

Visual Representation

  • Population: Represented as a large blue oval with unknown characteristics (parameters).
  • Sample: A small green oval, which is a subset of the population, used to infer population characteristics.

Example: Michelin Tire Company

  • Context: Michelin uses a feature called Track Connect to help race car drivers maximize tire performance.
  • Method: The Track Connect app provides personalized advice on optimal tire pressure and temperature.
  • Experiment: Michelin collected data from 30 drivers on various tracks and conditions to evaluate the app.
  • Data collected: Air pressure, temperature, and gas mileage.
  • Conclusion: Tires lasted longer and cars achieved better gas mileage.
  • Type of statistics: Inferential statistics.
  • Explanation: Michelin used data from a sample (30 drivers) to draw conclusions about all race car drivers.