UNIT 1 STATS PLUS

Statistics Basics

  • Statistics: Science of data → turns data into knowledge.

  • Population: Entire group being studied.

  • Sample: Subset of the population that is actually observed.

  • Variable: The characteristic being measured.


Types of Data

  • Quantitative (Numerical): Numbers (height, weight, test scores).

  • Qualitative (Categorical): Groups/categories (gender, colors, brands).


Parameters vs. Statistics

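  • Parameter: A number that describes a population (e.g., the population mean μ).

  • Statistic: A number that describes a sample (e.g., the sample mean x̄); used to estimate a parameter.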


Sampling Methods

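  1. Simple Random Sample (SRS) – Every individual, and every possible group of the chosen size, has an equal chance of being selected.

  2. Stratified Sample – Divide the population into groups (strata) of similar individuals, then randomly sample some from each group.

  3. Cluster Sample – Divide the population into groups (clusters), randomly select some groups, and sample everyone in them.

  4. Systematic Sample – Pick a random starting point, then take every nth person from a list.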
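The four methods above can be sketched in Python with the standard random module. This is a minimal illustration, not a standard recipe: the function names, the 40 student IDs, and the grade groups are all made up for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """SRS: every individual (and every group of n) is equally likely."""
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Stratified: take a random sample from every group."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Cluster: randomly pick whole groups and keep everyone in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

def systematic_sample(population, k, rng):
    """Systematic: random starting point, then every k-th individual."""
    start = rng.randrange(k)
    return population[start::k]

rng = random.Random(1)            # fixed seed so the run is repeatable
students = list(range(1, 41))     # 40 hypothetical student IDs
by_grade = {"9th": students[:20], "10th": students[20:]}

srs = simple_random_sample(students, 5, rng)    # 5 students
strat = stratified_sample(by_grade, 3, rng)     # 3 students from each grade
clus = cluster_sample(by_grade, 1, rng)         # one whole grade (20 students)
syst = systematic_sample(students, 10, rng)     # every 10th student (4 total)
```

Note the difference between 2 and 3: the stratified sample takes a few people from every group, while the cluster sample takes everyone from a few groups.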


Types of Studies

  • Anecdote – Personal story (unreliable).

  • Observational Study – Researcher observes without assigning treatments; can show association, not causation.

  • Controlled Experiment – Researcher assigns treatments; a well-designed experiment can support cause-and-effect conclusions.


Good Experiment Features

  • Large sample size (a common rule of thumb is at least 30).

  • Random assignment (reduces bias).

  • Blinding (subjects, and ideally researchers, do not know who gets which treatment; prevents bias).

  • Placebo (fake treatment given to the control group for comparison).
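Random assignment can be sketched as "shuffle, then split." The example below is only an illustration: the function name, the group labels, and the 10 subject IDs are assumptions.

```python
import random

def randomly_assign(subjects, rng):
    """Shuffle the subjects, then split them into two equal-sized groups."""
    shuffled = list(subjects)     # copy so the original list is unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "placebo": shuffled[half:]}

rng = random.Random(7)            # fixed seed so the run is repeatable
subjects = list(range(1, 11))     # 10 hypothetical subject IDs
groups = randomly_assign(subjects, rng)
```

Because chance alone decides the groups, neither the researcher nor the subjects can steer particular people toward the treatment.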


Two-Way Tables

  • Compare two variables (e.g., commuters vs. breakfast habits).

  • Percentage calculations: part/total × 100%.
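A short worked example of part/total × 100%, using a made-up two-way table (the counts and category names are hypothetical). The key choice is which total to divide by: the grand total or a single row total.

```python
# Hypothetical counts: rows = student type, columns = breakfast habit.
table = {
    "commuter": {"eats breakfast": 30, "skips breakfast": 20},
    "resident": {"eats breakfast": 40, "skips breakfast": 10},
}

def percent(part, total):
    """part/total × 100%."""
    return 100 * part / total

grand_total = sum(sum(row.values()) for row in table.values())   # 100 students
commuter_total = sum(table["commuter"].values())                  # 50 commuters

# Out of ALL students, what percent are commuters?
pct_commuters = percent(commuter_total, grand_total)              # 50.0

# Out of ALL students, what percent are commuters who eat breakfast?
pct_commuter_and_eats = percent(table["commuter"]["eats breakfast"], grand_total)   # 30.0

# Out of COMMUTERS only, what percent eat breakfast?
pct_eats_given_commuter = percent(table["commuter"]["eats breakfast"], commuter_total)  # 60.0
```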


Graphs for Numerical Data

  • Dotplot – Each data point is a dot.

  • Histogram – Data grouped into intervals (bars).

  • Stemplot – Breaks data into stems & leaves.
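As an illustration of how a stemplot is built, here is a minimal sketch that assumes non-negative whole numbers, with the tens as the stem and the ones digit as the leaf. The function name and the scores are hypothetical.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot lines: stem = tens, leaf = ones digit."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    # Include every stem from smallest to largest, even if it has no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        leaf_text = "".join(str(d) for d in leaves[stem])
        lines.append(f"{stem} | {leaf_text}".rstrip())
    return lines

scores = [94, 81, 62, 78, 89, 75, 81]   # hypothetical test scores
plot = stemplot(scores)
# plot is:
# 6 | 2
# 7 | 58
# 8 | 119
# 9 | 4
```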

Describe Distributions

  • Shape: Symmetric or skewed?

  • Center: Middle value (mean or median).

  • Spread: How much the values vary (range or standard deviation).


Graphs for Categorical Data

  • Bar Graph – Separate bars for each category.

  • Pie Chart – Shows percentage of each category.


Measures of Center (Averages)

  • Mean (x̄) = sum of data ÷ number of values.

  • Median = Middle value after sorting (with an even number of values, the average of the two middle values).

  • Mode = Most frequent value.
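A quick check of the three measures using Python's built-in statistics module; the scores are made up for the example.

```python
from statistics import mean, median, mode

scores = [70, 85, 85, 90, 100]           # hypothetical test scores, already sorted

mean_by_hand = sum(scores) / len(scores)  # 430 ÷ 5 = 86.0
center_mean = mean(scores)                # 86
center_median = median(scores)            # middle (3rd) value: 85
center_mode = mode(scores)                # 85 appears twice: 85

even_median = median([70, 85, 90, 100])   # average of 85 and 90: 87.5
```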


Measures of Dispersion (Spread of Data)

  • Range = Highest value - Lowest value.

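  • Standard Deviation (σ for a population, s for a sample) = Roughly the typical distance of values from the mean; calculated as the square root of the average squared deviation from the mean (s divides by n − 1 instead of n).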
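The sketch below works out the range and the standard deviation step by step on a small made-up data set. The only difference between the population version (σ) and the sample version (s) is dividing by n versus n − 1.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical values

data_range = max(data) - min(data)                 # 9 - 2 = 7

center = mean(data)                                # 40 ÷ 8 = 5
squared_deviations = [(x - center) ** 2 for x in data]   # these sum to 32

sigma_by_hand = sqrt(sum(squared_deviations) / len(data))       # divide by n
s_by_hand = sqrt(sum(squared_deviations) / (len(data) - 1))     # divide by n - 1

# The statistics module provides both versions directly.
sigma = pstdev(data)   # population standard deviation
s = stdev(data)        # sample standard deviation
```

For this data σ = √(32 ÷ 8) = 2, while s = √(32 ÷ 7) is slightly larger.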


Empirical Rule (68-95-99.7 Rule)

  • For bell-shaped (approximately normal) distributions:

    • About 68% of data fall within 1 standard deviation of the mean.

    • About 95% of data fall within 2 standard deviations of the mean.

    • About 99.7% of data fall within 3 standard deviations of the mean.
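To see where the three percentages come from, the sketch below uses NormalDist from Python's statistics module. The helper names and the example mean of 100 and standard deviation of 15 are assumptions for illustration.

```python
from statistics import NormalDist

def empirical_intervals(mean, sd):
    """Intervals mean ± 1, 2, and 3 standard deviations."""
    return {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}

def normal_share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    z = NormalDist()              # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

intervals = empirical_intervals(100, 15)
# 1 SD: (85, 115), 2 SD: (70, 130), 3 SD: (55, 145)

shares = {k: normal_share_within(k) for k in (1, 2, 3)}
# roughly 0.68, 0.95, and 0.997
```

So for a bell-shaped distribution with mean 100 and standard deviation 15, about 95% of values would fall between 70 and 130.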