QMSS Day 5 Shitty notes: Introduction Notes

Chapter 1: Introduction

  • Data as columns in a dataset: gender, age, hobby, etc. Each is a separate feature/column.
  • Example notion: percent of a subset within a whole (e.g., "I took 25% of the apple" implies a part/whole relationship). Note the idea of converting a portion into a percent.
  • Ratio vs. rates vs. percent:
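    • Ratio: compares two quantities directly (e.g., a juniors-to-seniors ratio of 1.5 in a class means 1.5 juniors for every senior).
    • Rates/percent: express a quantity relative to a standard base (per 100 for a percent, per 100,000 people for a rate), which makes them useful for comparing across populations or over time.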
    • Rates are used when you want to standardize across different population sizes to compare differences between populations.
  • When to use ratios vs. rates:
    • Use ratio when comparing groups within a dataset or categories across similar populations.
    • Use rates when comparing populations of different sizes or when expressing a change per standard population unit (e.g., per 100,000 people).
  • Example illustrating rate calculation and interpretation:
    • Consider two cities with different populations. To understand the likelihood of a theft occurring, calculate a rate per population unit (e.g., per 100,000 people):
    • If City A has 237 thefts and a population of 237,000, the rate per 100,000 people can be computed as:
    • $$ \text{Rate} = \frac{\text{thefts}}{\text{population}} \times 100{,}000 $$
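
A minimal Python sketch of the ratio and rate calculations described above. The function names `ratio` and `rate_per` are made up for illustration. The City A figures come from the notes; the class sizes (30 juniors, 20 seniors) and City B are hypothetical values added only to show a comparison.

```python
def ratio(a: float, b: float) -> float:
    """Compare two quantities directly: how many of a for every one of b."""
    if b == 0:
        raise ValueError("cannot form a ratio against zero")
    return a / b


def rate_per(events: int, population: int, per: int = 100_000) -> float:
    """Standardize a count to events per `per` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to keep the arithmetic exact for whole numbers.
    return events * per / population


# Ratio: hypothetical class with 30 juniors and 20 seniors.
juniors_to_seniors = ratio(30, 20)  # 1.5 juniors per senior

# Rate: City A from the notes (237 thefts, population 237,000).
city_a_rate = rate_per(237, 237_000)  # 100.0 thefts per 100,000 people

# Hypothetical City B: more thefts in total, but a much larger population.
city_b_rate = rate_per(500, 1_000_000)

print(f"Juniors to seniors: {juniors_to_seniors}")
print(f"City A: {city_a_rate} thefts per 100,000 people")
print(f"City B: {city_b_rate} thefts per 100,000 people")
```

In this made-up comparison City B has more thefts in raw counts but a lower rate, which is why rates rather than raw counts are used when population sizes differ.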