QMSS Day 5 Shitty notes: Introduction Notes
Chapter 1: Introduction
- Data as columns in a dataset: gender, age, hobby, etc. Each is a separate feature/column.
- Percent as a part of a whole: "I took 25% of the apple" expresses a part/whole relationship. To convert a portion into a percent, divide the part by the whole and multiply by 100.
- Ratio vs. rates vs. percent:
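- Ratio: compares two quantities directly by dividing one by the other (e.g., a juniors-to-seniors ratio of 1.5 means 1.5 juniors for every senior in the class).
- Rates/percent: express a quantity relative to a standard base (per 100 for a percent, or per 1,000 or 100,000 people for a rate), which makes them useful for comparing across populations or over time.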
- Rates are used when you want to standardize across different population sizes to compare differences between populations.
- When to use ratios vs. rates:
- Use a ratio when comparing groups within a dataset or categories across similar populations.
- Use a rate when comparing populations of different sizes or when expressing a count per standard population unit.
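
A minimal Python sketch of the three measures above, just to make the arithmetic concrete. The function names (`ratio`, `percent`, `rate_per`) and all of the numbers are made up for illustration and are not from the lecture. The two hypothetical cities show why a rate is the fairer comparison when population sizes differ.

```python
def ratio(a, b):
    """Ratio of a to b: units of a for every one unit of b."""
    return a / b


def percent(part, whole):
    """Part of a whole, expressed per 100."""
    return part * 100 / whole


def rate_per(events, population, per=100_000):
    """Events per `per` people (default base: 100,000)."""
    return events * per / population


# Hypothetical class: 30 juniors and 20 seniors.
print(ratio(30, 20))    # 1.5 juniors per senior

# Hypothetical part/whole: 25 out of 100.
print(percent(25, 100))  # 25.0 percent

# Hypothetical cities X and Y with different population sizes.
city_x = rate_per(events=50, population=200_000)
city_y = rate_per(events=30, population=60_000)
print(city_x, city_y)    # 25.0 50.0
```

City Y has fewer thefts in raw count (30 vs. 50), but its rate per 100,000 people is twice as high, which is the kind of difference a raw count hides.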
- Example illustrating rate calculation and interpretation:
- Consider two cities with different populations. To understand the likelihood of a theft occurring, calculate a rate per population unit (e.g., per 100,000 people):
- If City A has 237 thefts and a population of 237,000, the rate per 100,000 people can be computed as:
- $$ \text{Rate} = \frac{\text{thefts}}{\text{population}} \times 100{,}000 \\