Unit 1 Summary: One Variable Data

Overview of Unit 1: One Variable Data

  • Focuses on analyzing a single variable, including comparing it across samples or groups, in preparation for tests.

Types of Data

  • Categorical Data: Values are category labels rather than numbers (e.g., eye color). Easier and quicker to analyze; generally only a small portion of the unit.

  • Quantitative Data: Takes up a larger portion of the unit; involves numerical values that can be measured or counted.

Definitions

  • Statistic: Summary information from sample data.

  • Parameter: Summary information from an entire population.

  • Variable: A characteristic that can change from one individual to another (e.g., eye color, height).

Categorical Data

  • Organized into frequency tables and relative frequency tables (proportions).

  • Can be graphed using pie charts or bar graphs.

  • Important to describe the distribution: identify the most and least frequent categories.
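
As a rough illustration of the frequency and relative frequency tables described above, the sketch below counts categories with Python's standard `collections.Counter`. The eye-color sample and the helper name `frequency_table` are made up for this example; they are not part of the unit.

```python
from collections import Counter

# Hypothetical sample: eye colors of 10 individuals (made-up data).
eye_colors = ["brown", "blue", "brown", "green", "brown",
              "blue", "brown", "hazel", "blue", "brown"]

def frequency_table(values):
    """Return {category: (count, proportion)}, most frequent category first."""
    counts = Counter(values)
    n = len(values)
    return {cat: (count, count / n) for cat, count in counts.most_common()}

table = frequency_table(eye_colors)
for category, (count, proportion) in table.items():
    # Columns: category, frequency, relative frequency
    print(f"{category:<6} {count:>2}  {proportion:.2f}")
```

Reading the first and last rows gives the most and least frequent categories, which is the core of describing a categorical distribution.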

Quantitative Data

  • Types: Discrete (countable values) and Continuous (can take any value within an interval).

  • Analyzed using frequency tables, which require grouping the data into bins (intervals).
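
The binning step can be sketched as follows. This is a minimal example that assumes equal-width bins and whole-number data; the quiz scores, the bin width of 10, and the function name `bin_counts` are hypothetical choices for illustration.

```python
# Hypothetical quiz scores (made-up data).
scores = [52, 67, 71, 74, 78, 81, 83, 85, 88, 94]

START, WIDTH, N_BINS = 50, 10, 5  # assumed bins: 50-59, 60-69, ..., 90-99

def bin_counts(values, start, width, n_bins):
    """Count how many values fall in each equal-width bin [low, low + width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:  # values outside all bins are ignored
            counts[index] += 1
    return counts

counts = bin_counts(scores, START, WIDTH, N_BINS)
for i, count in enumerate(counts):
    low = START + WIDTH * i
    # One star per observation gives a rough text histogram.
    print(f"{low}-{low + WIDTH - 1}: {'*' * count}")
```

Choosing a different bin width changes how the distribution looks, so it is worth trying more than one.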

Graph Types

  1. Dot Plot: Each individual value is represented by a dot above a number line.

  2. Stem-and-Leaf Plot: Displays individual values while showing distribution.

  3. Histogram: Preferred graph for quantitative data; the height of each bar represents the frequency of a bin.

  4. Cumulative Graph: Shows the proportion of data at or below each value.
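
To make the cumulative graph concrete, the sketch below turns binned counts into cumulative proportions using the standard `itertools.accumulate`. The bin boundaries and counts are hypothetical values chosen for illustration.

```python
from itertools import accumulate

# Hypothetical binned data: counts of values below each upper bound (made-up).
upper_bounds = [60, 70, 80, 90, 100]
bin_counts = [1, 1, 3, 4, 1]

n = sum(bin_counts)
# Running total of counts, divided by n, gives the cumulative proportion.
cumulative = [running / n for running in accumulate(bin_counts)]

for bound, proportion in zip(upper_bounds, cumulative):
    print(f"below {bound}: {proportion:.0%}")
```

Plotting these points (upper bound on the horizontal axis, cumulative proportion on the vertical axis) gives the cumulative graph; the value where the curve crosses 50% approximates the median.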

Important Distributions

  • Describe distributions using: shape, center, spread, and outliers.

  • Common shapes include symmetric, skewed, unimodal, bimodal, and uniform.

Summary Statistics

  • Mean: Average, sensitive to outliers.

  • Median: Middle value, not sensitive to outliers.

  • Percentiles and Quartiles: Describe data position (1st quartile = 25th percentile, median = 50th percentile, 3rd quartile = 75th percentile).

  • Measures of Spread: Range (max - min), Interquartile Range (IQR = Q3 - Q1), Standard Deviation (roughly the typical distance of values from the mean).
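
The following sketch computes these summary statistics with Python's standard `statistics` module and shows how one extreme value affects them. The data set is made up, and `summarize` is a hypothetical helper. Note that `statistics.quantiles` uses one particular quartile convention (its default "exclusive" method), so its results can differ slightly from a textbook or calculator.

```python
import statistics

# Hypothetical data (made-up), with and without one extreme value.
data = [2, 4, 4, 5, 7, 8, 9]
with_outlier = data + [50]

def summarize(values):
    """Return center and spread measures for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

before = summarize(data)
after = summarize(with_outlier)
for name in before:
    print(f"{name:<6} {before[name]:>7.2f} -> {after[name]:>7.2f}")
```

Adding the value 50 pulls the mean, range, and standard deviation up sharply, while the median and IQR change only a little, which is why the median and IQR are usually preferred for skewed data or data with outliers.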

Outliers

  • Identified using the fence method: Upper fence = Q3 + 1.5(IQR), Lower fence = Q1 - 1.5(IQR). Values above the upper fence or below the lower fence are outliers.

  • Alternative method: Flag any value that lies more than 2 standard deviations from the mean.
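
Both outlier rules can be sketched in a few lines. The data set and the function names `fence_outliers` and `sd_outliers` are hypothetical; the quartiles come from `statistics.quantiles` with its default method, which is only one of several quartile conventions.

```python
import statistics

# Hypothetical data (made-up) with one suspiciously large value.
data = [2, 4, 4, 5, 7, 8, 9, 50]

def fence_outliers(values):
    """Fence method: flag values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

def sd_outliers(values, threshold=2):
    """Flag values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return [v for v in values if abs(v - mean) > threshold * sd]

print("fence method:", fence_outliers(data))
print("2-SD method: ", sd_outliers(data))
```

For this data both rules flag only 50. The two methods do not always agree, because an outlier inflates the mean and standard deviation used by the second rule.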

Box Plots

  • Visual representation of the five-number summary (min, Q1, median, Q3, max).

  • Modified box plots indicate outliers.
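
The numbers behind a box plot can be sketched as below. The data set and helper names are hypothetical, and the whisker rule shown (whiskers end at the most extreme values inside the 1.5(IQR) fences, with outliers drawn as separate points) is the usual convention for a modified box plot rather than something spelled out in these notes.

```python
import statistics

# Hypothetical data (made-up).
data = [2, 4, 4, 5, 7, 8, 9, 50]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using statistics.quantiles defaults."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)

def modified_whiskers(values):
    """Whisker ends for a modified box plot: extremes that lie inside the fences."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    inside = [v for v in values if q1 - 1.5 * iqr <= v <= q3 + 1.5 * iqr]
    return min(inside), max(inside)

print("five-number summary:", five_number_summary(data))
print("whisker ends:       ", modified_whiskers(data))
```

In a regular box plot the upper whisker would stretch all the way to 50; in the modified version it stops at 9 and 50 is plotted on its own.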

Normal Distribution

  • Approximately normal distributions are unimodal, symmetric, and bell-shaped, and are described by their mean and standard deviation.

  • Empirical Rule: About 68% of values fall within 1 SD, 95% within 2 SDs, and 99.7% within 3 SDs of the mean.

  • Convert values to standardized scores (Z-scores), z = (value - mean) / SD, to compare values from different distributions.
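
As a quick check of the ideas above, the sketch below computes Z-scores for two scores from different scales and uses the standard library's `statistics.NormalDist` to reproduce the Empirical Rule percentages. The test scores, means, and standard deviations are made-up values, and `z_score` and `proportion_within` are hypothetical helper names.

```python
from statistics import NormalDist

def z_score(value, mean, sd):
    """Number of standard deviations that `value` lies above the mean."""
    return (value - mean) / sd

# Hypothetical scores on two tests with different scales (made-up numbers).
z_a = z_score(85, mean=75, sd=5)
z_b = z_score(620, mean=500, sd=100)
print(f"test A: z = {z_a:.2f}, test B: z = {z_b:.2f}")

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {proportion_within(k):.1%}")
```

The larger Z-score identifies the score that is more unusual relative to its own distribution, even though the raw scores are on different scales.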

Conclusion

  • Mastering data analysis in Unit 1 provides a strong foundation for subsequent units in AP Statistics.