Unit 1 Summary: One Variable Data
Overview of Unit 1: One Variable Data
Focuses on analyzing the distribution of one variable, within a single sample or across several samples or groups, in preparation for tests.
Types of Data
Categorical Data: Values are labels or group names rather than numbers. Easier and quicker to analyze, and generally only a small percentage of the unit.
Quantitative Data: Takes up a larger portion of the unit; involves numerical values that can be measured or counted.
Definitions
Statistic: Summary information from sample data.
Parameter: Summary information from an entire population.
Variable: A characteristic that can change from one individual to another (e.g., eye color, height).
Categorical Data
Organized into frequency tables and relative frequency tables (proportions).
Can be graphed using pie charts or bar graphs.
Important to describe the distribution: identify the most and least frequent categories.
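As a minimal sketch of the frequency and relative frequency tables described above, the following Python counts each category and converts the counts to proportions. The eye-color sample and the helper name frequency_table are made up for illustration.

```python
from collections import Counter

# Hypothetical sample: eye colors recorded for 10 individuals.
eye_colors = ["brown", "blue", "brown", "green", "brown",
              "blue", "brown", "hazel", "blue", "brown"]

def frequency_table(values):
    """Return {category: (count, proportion)}, most frequent category first."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, n / total) for cat, n in counts.most_common()}

table = frequency_table(eye_colors)
for category, (count, proportion) in table.items():
    print(f"{category:<6} {count:>3} {proportion:.2f}")

# Describing the distribution: which category is most frequent?
most_frequent = max(table, key=lambda cat: table[cat][0])
print("Most frequent:", most_frequent)
```

The proportions in a relative frequency table always sum to 1, which is a quick check on the arithmetic.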
Quantitative Data
Types: Discrete (countable values, such as number of siblings) and Continuous (measured values that can fall anywhere in an interval, such as height).
Analyzed using frequency tables, which require creating bins (intervals) to group the data.
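Grouping quantitative data into bins could look like the sketch below. The quiz scores, the starting value of 50, and the bin width of 10 are all arbitrary choices made for illustration; each bin includes its left endpoint and excludes its right endpoint.

```python
# Hypothetical quiz scores.
scores = [52, 67, 71, 74, 78, 80, 83, 85, 88, 95]

def bin_counts(values, start, width, n_bins):
    """Count values in equal-width bins [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        k = int((v - start) // width)
        if 0 <= k < n_bins:  # values outside the binned range are ignored
            counts[k] += 1
    return counts

counts = bin_counts(scores, start=50, width=10, n_bins=5)
for k, count in enumerate(counts):
    low = 50 + 10 * k
    print(f"[{low}, {low + 10}): {count}")
```

These counts are the bar heights a histogram of the same data would show.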
Graph Types
Dot Plot: Individual values represented by dots.
Stem-and-Leaf Plot: Displays individual values while showing distribution.
Histogram: Preferred graph for quantitative data; each bar's height represents the frequency of values in that bin.
Cumulative Graph: Shows the proportion of data at or below each value.
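Two of these displays can be sketched in plain text. The example below builds a stem-and-leaf plot and computes a single point on a cumulative graph. It assumes non-negative two-digit whole numbers, and the scores and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical quiz scores (two-digit whole numbers).
scores = [52, 67, 71, 74, 78, 80, 83, 85, 88, 95]

def stem_and_leaf(values):
    """Map each tens digit (stem) to its sorted ones digits (leaves).

    Stems with no data are omitted here; a hand-drawn plot would
    normally still list them so that gaps stay visible.
    """
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)

def cumulative_proportion(values, x):
    """Proportion of values less than or equal to x."""
    return sum(v <= x for v in values) / len(values)

plot = stem_and_leaf(scores)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")

print(cumulative_proportion(scores, 80))  # 6 of the 10 scores are <= 80
```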
Describing Distributions
Describe distributions using: shape, center, spread, and outliers.
Common shapes include symmetric, skewed, unimodal, bimodal, and uniform.
Summary Statistics
Mean: Arithmetic average (sum of the values divided by the number of values); sensitive to outliers.
Median: Middle value of the ordered data; resistant to outliers.
Percentiles and Quartiles: Describe data position (1st quartile = 25th percentile, median = 50th percentile, 3rd quartile = 75th percentile).
Measures of Spread: Range (max - min), Interquartile Range (IQR = Q3 - Q1), Standard Deviation (roughly the typical distance of values from the mean).
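The summary statistics above can be computed with Python's standard statistics module, as in this sketch. The data set is made up. The quartiles helper uses one common classroom convention, the median of each half with the overall median excluded when the count is odd; software packages often use other conventions and may give slightly different quartiles.

```python
import statistics

# Hypothetical data set.
data = [2, 4, 4, 5, 7, 9, 11]

def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves.

    Assumes at least two values; the overall median is excluded
    from both halves when the number of values is odd.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    return statistics.median(ordered[:half]), statistics.median(ordered[-half:])

mean = statistics.mean(data)
median = statistics.median(data)
q1, q3 = quartiles(data)
data_range = max(data) - min(data)
iqr = q3 - q1
sd = statistics.stdev(data)  # sample standard deviation (divides by n - 1)

print(mean, median, q1, q3, data_range, iqr, round(sd, 3))

# Replacing the largest value with an extreme one moves the mean, not the median.
with_outlier = data[:-1] + [110]
print(statistics.mean(with_outlier), statistics.median(with_outlier))
```

The last two lines illustrate why the median is the safer measure of center when a distribution has outliers or strong skew.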
Outliers
Identified using the Fence Method: Upper fence = Q3 + 1.5(IQR), Lower fence = Q1 - 1.5(IQR). Any value above the upper fence or below the lower fence is an outlier.
Alternative method: Flag any value that lies more than 2 standard deviations from the mean.
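Both outlier rules can be sketched as follows. The data set is hypothetical, the quartiles use the median-of-halves convention (an assumption; other conventions exist), and the standard deviation is the sample version.

```python
import statistics

# Hypothetical data set with one unusually large value.
data = [2, 4, 4, 5, 7, 9, 30]

def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    return statistics.median(ordered[:half]), statistics.median(ordered[-half:])

def fence_outliers(values):
    """Values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

def sd_outliers(values, threshold=2):
    """Values more than `threshold` sample SDs away from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > threshold * sd]

print(fence_outliers(data))  # fences are -3.5 and 16.5
print(sd_outliers(data))
```

The two rules agree on this data set, but they need not in general: an extreme value inflates the mean and standard deviation, which can hide it from the 2-SD rule, while the fences depend only on the quartiles.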
Box Plots
Visual representation of the five-number summary (min, Q1, median, Q3, max).
Modified box plots mark outliers as individual points, with whiskers extending only to the most extreme values that are not outliers.
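The numbers needed to draw either kind of box plot could be gathered as in this sketch. The data set and function names are hypothetical, and the quartiles again assume the median-of-halves convention.

```python
import statistics

# Hypothetical data set with one unusually large value.
data = [1, 3, 4, 5, 6, 7, 8, 9, 25]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using median-of-halves quartiles."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = statistics.median(ordered[:half])
    q3 = statistics.median(ordered[-half:])
    return ordered[0], q1, statistics.median(ordered), q3, ordered[-1]

def modified_box_plot_parts(values):
    """Box, whisker ends, and outliers for a modified box plot."""
    _, q1, median, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "box": (q1, median, q3),
        "whiskers": (min(inside), max(inside)),  # most extreme non-outliers
        "outliers": outliers,                    # drawn as individual points
    }

summary = five_number_summary(data)
parts = modified_box_plot_parts(data)
print(summary)
print(parts)
```

In a regular box plot the upper whisker would reach the maximum of 25; in the modified version it stops at 9 and the value 25 is drawn as a separate point.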
Normal Distribution
Approximately normal distributions are unimodal, symmetric, and bell-shaped, and are fully described by their mean and standard deviation.
Empirical Rule: About 68% of values fall within 1 SD, 95% within 2 SDs, and 99.7% within 3 SDs of the mean.
Convert values to standardized scores (z-scores), z = (value - mean) / SD, to compare values from different distributions.
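As a rough check on the Empirical Rule and an illustration of z-scores, the sketch below uses NormalDist from Python's standard statistics module. The two test scores and their means and standard deviations are invented for the example.

```python
from statistics import NormalDist

def z_score(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd

def proportion_within(k):
    """Proportion of a normal distribution within k SDs of its mean."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

# Empirical Rule: approximately 68%, 95%, and 99.7%.
for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))

# Hypothetical comparison of two scores on differently scaled tests.
z_a = z_score(82, mean=70, sd=8)     # test A
z_b = z_score(620, mean=500, sd=100)  # test B
print(z_a, z_b)
```

Although 620 is the larger raw number, the score of 82 sits further above its own mean in standard deviation units (1.5 versus 1.2), so it is the stronger result relative to its distribution.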
Conclusion
Mastering data analysis in Unit 1 provides a strong foundation for subsequent units in AP Statistics.