Quantitative Data part 2
Standard Deviation and Variation
Population parameter:
Mean: µ (mew)
Standard deviation: σ (sigma)
Variance: σ^2
Sample statistic
Mean: x̄ (x bar)
Standard deviation: s
Variance: s^2
Standard deviation: typical distance between mean and values in a distribution
If a value is less than the mean is added to the data set, the distribution decreases
Once observations become more spread out, SD (standard deviation) increases
Either 0 or any positive number
Not resistant to outliers
Standard deviation: units
Variation: units^2
Interquartile Range (IQR)
Percentile
- The pth percentile is defined as the value that P percent of values are less than or equal to it
Quartiles
- First quartile (Q1) (lower quartile) (25th percentile)
- Median (Q2) (50th percentile)
- Third quartile (Q3) (75th percentile)
- Order values in increasing order, split at median
- Everything to the left is data for Q1, and to the right is Q3
IQR
The difference of Q1 and Q3
Resistant to outliers, usable to show center when data isn’t symmetrical
Identifying outliers
- Value is an outlier if less than Q1-(1.5 x IQR) or greater than Q2-(1.5 x IQR)
Five Number Summary
- Minimum
- Lower Quartile (Q1)
- Median (Q2)
- Higher Quartile (Q3)
- Maximum
Additive Transformation
- When a constant is added or subtracted from the original values
- Shifts center by that amount (mean/median)
- Doesn’t affect spread or shape (IQR, Range, SD)
Multiplicative Transformation
- When a constant multiplies or divides all original values
- Affects center (mean/median)
- Affects spread (IQR, Range, SD)
- Doesn’t affect spread