Unit 1 - Exploring One-Variable Data

Data Visualization

  • Categorical Data: Bar Graphs
  • Quantitative Data: Histograms (discrete or continuous), Stem & Leaf plots (discrete; can be split), Back-to-Back Stem & Leaf plots (discrete), Dotplots (discrete)
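
A stem & leaf plot can be built by hand or with a few lines of code. The sketch below assumes non-negative whole numbers, with the tens as the stem and the ones digit as the leaf; the function name `stem_and_leaf` and the sample values are made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a stem & leaf plot (stem = tens, leaf = ones digit).

    Assumes non-negative integers. Stems with no leaves are still listed,
    so gaps in the distribution stay visible.
    """
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)

    lines = []
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

# Hypothetical data set
ages = [12, 15, 21, 21, 34, 38, 39, 50]
for line in stem_and_leaf(ages):
    print(line)
```

A real plot also needs a key (for example, "3 | 4 means 34") so the reader knows how to read the stems and leaves.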

Describing Distributions (SOCS)

  • S: Shape (symmetric, skewed right, skewed left, bimodal, uniform, unimodal)
  • O: Outliers
  • C: Center (Mean: $\bar{x}$, Median: Med)
  • S: Spread (Range: max - min, IQR: $Q_3 - Q_1$, Standard deviation: $S_x$)
  • Compare Distributions: Use comparative words (less than, greater than, similar to).
  • Resistant Measures: Median, IQR
  • Nonresistant Measures: Mean, Standard Deviation, Range
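
To see what "resistant" means, the sketch below computes the center and spread measures for a small made-up data set, then replaces the largest value with an extreme one. It assumes quartiles are found as the medians of the lower and upper halves (the median itself is left out when n is odd); other software may use a different quartile rule. The names `quartiles` and `summarize` are illustrative only.

```python
from statistics import mean, median, stdev

def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves of the sorted data.

    Assumption: when n is odd, the middle value belongs to neither half.
    Requires at least two values.
    """
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]
    upper = data[(n + 1) // 2 :]
    return median(lower), median(upper)

def summarize(values):
    q1, q3 = quartiles(values)
    return {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "sd": stdev(values),  # sample standard deviation
    }

scores = [2, 4, 4, 5, 6, 7, 8]       # hypothetical data
with_outlier = scores[:-1] + [80]    # same data, largest value made extreme

before = summarize(scores)
after = summarize(with_outlier)
for name in before:
    print(name, before[name], after[name])
```

In this example the median and IQR are identical before and after the change, while the mean, range, and standard deviation all increase.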

Boxplots

  • Based on the five-number summary: min, $Q_1$, Med, $Q_3$, max.
  • Modified boxplots show outliers as dots beyond the outlier test boundaries.
  • $Q_3 + 1.5(\text{IQR}) = \text{UB}$ (Upper Boundary)
  • $Q_1 - 1.5(\text{IQR}) = \text{LB}$ (Lower Boundary)
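
The five-number summary and the outlier test can be sketched as follows. This assumes quartiles are the medians of the lower and upper halves of the sorted data (median excluded when n is odd); the function names and the data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, Med, Q3, max). Requires at least two values."""
    data = sorted(values)
    n = len(data)
    q1 = median(data[: n // 2])          # median of the lower half
    q3 = median(data[(n + 1) // 2 :])    # median of the upper half
    return data[0], q1, median(data), q3, data[-1]

def outlier_boundaries(values):
    """Return (LB, UB) using the 1.5 * IQR rule."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [1, 10, 12, 13, 14, 15, 16, 18, 40]   # hypothetical data
summary = five_number_summary(data)
lb, ub = outlier_boundaries(data)

# Any value below LB or above UB is flagged as an outlier.
outliers = [x for x in data if x < lb or x > ub]
print(summary, lb, ub, outliers)
```

Here Q1 = 11 and Q3 = 17, so IQR = 6, LB = 2, and UB = 26; the values 1 and 40 fall outside the boundaries and would be drawn as dots on a modified boxplot.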

Empirical Rule

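  • 68-95-99.7 rule: in an approximately normal distribution, about 68% of values lie within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3.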
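
The three percentages can be checked against a normal model. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `proportion_within` is made up for this example.

```python
from statistics import NormalDist

def proportion_within(k, mu=0.0, sigma=1.0):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    print(f"within {k} SD: {proportion_within(k):.4f}")
```

The result does not depend on the particular mean or standard deviation, which is why the rule applies to any approximately normal distribution.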

Z-Scores and Percentiles

  • $z = \frac{x - \mu}{\sigma}$
  • Percentile: the percent of values in the distribution that fall below a given value.
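
A short sketch of both ideas: the z-score formula applied directly, and a percentile computed two ways (by counting values in a data set, and from a z-score under the assumption that the distribution is approximately normal). The function names and numbers are illustrative, not part of the original notes.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

def percentile_in_data(x, values):
    """Percent of the values that are strictly below x."""
    return 100 * sum(v < x for v in values) / len(values)

def percentile_from_z(z):
    """Percent below z, ASSUMING an approximately normal distribution."""
    return 100 * NormalDist().cdf(z)

z = z_score(130, mu=100, sigma=15)               # hypothetical score, mean, and SD
pct_data = percentile_in_data(7, [1, 3, 5, 7, 9])
pct_normal = percentile_from_z(z)
print(z, pct_data, round(pct_normal, 1))
```

A z-score of 2 means the value is 2 standard deviations above the mean; under a normal model that puts it at roughly the 97.7th percentile.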

Linear Transformations

  • $ax + b$: adding $b$ changes measures of center but not spread; multiplying by $a$ changes both center and spread (spread is multiplied by $|a|$).
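
The effect of a linear transformation can be checked numerically. The sketch below converts made-up Celsius temperatures to Fahrenheit (a = 1.8, b = 32) and, separately, applies a shift alone (a = 1, b = 5); the helper `transform` is hypothetical.

```python
from statistics import mean, median, stdev

def transform(values, a, b):
    """Apply the linear transformation a*x + b to every value."""
    return [a * x + b for x in values]

celsius = [10, 15, 20, 25, 30]               # hypothetical data
fahrenheit = transform(celsius, a=1.8, b=32) # scale and shift
shifted = transform(celsius, a=1, b=5)       # shift only

for label, data in [("C", celsius), ("F", fahrenheit), ("C+5", shifted)]:
    print(label, mean(data), median(data), round(stdev(data), 3))
```

The shift-only version moves the mean and median up by 5 and leaves the standard deviation alone. The Fahrenheit version changes the mean and median to 1.8 times the original plus 32, and multiplies the standard deviation by 1.8.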