Learning Objectives:
Determine and interpret z-scores
Interpret percentiles
Determine and interpret quartiles
Determine and interpret the interquartile range
Check a set of data for outliers
Z-Score Definition:
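A z-score represents the distance of a data value from the mean, measured in standard deviations.
Calculated as:
z = (Data value - Mean) / Standard Deviation
Two types of z-scores: the population z-score uses the population mean and standard deviation; the sample z-score uses the sample mean and standard deviation.
A positive z-score means the value is above the mean; a negative z-score means it is below the mean.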
Characteristics of the z-scores computed from a data set:
Unitless
Mean = 0
Standard Deviation = 1
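The characteristics above can be checked numerically. The following is a minimal sketch, assuming sample z-scores (sample mean and sample standard deviation); the function name `z_scores` and the data are hypothetical and used only for illustration.

```python
from statistics import mean, stdev

def z_scores(values):
    """Return the sample z-score of each value: (x - mean) / sample SD."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return [(x - m) / s for x in values]

# Hypothetical scores, for illustration only (not from the notes)
scores = [62, 70, 75, 81, 88, 94]
z = z_scores(scores)

# Up to floating-point rounding, the z-scores have mean 0 and SD 1
print(abs(mean(z)) < 1e-9)
print(abs(stdev(z) - 1) < 1e-9)
```

Because the units cancel in the division, the same z-scores result whether the original data are recorded in points, miles, or any other unit.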
Example:
Imene:
Score: 88
Mean (m) = 73.2, Standard Deviation (s) = 8.5
Akito:
Score: 91
Mean (m) = 75.8, Standard Deviation (s) = 9.2
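Z-scores:
Imene: (88 - 73.2) / 8.5 = 1.74
Akito: (91 - 75.8) / 9.2 = 1.65
Conclusion: Both scores are above their respective means, but Imene's score is more standard deviations above its mean (1.74 vs. 1.65), so Imene performed better in relative terms.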
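The comparison above can be reproduced with a few lines of code. This is a small sketch using the values given in the example; the helper name `z_score` is an assumption for illustration.

```python
def z_score(value, mean, sd):
    """Distance of a value from the mean, in standard deviations."""
    return (value - mean) / sd

# Values taken from the example above
imene_z = z_score(88, 73.2, 8.5)
akito_z = z_score(91, 75.8, 9.2)

print(f"Imene: {imene_z:.2f}")  # Imene: 1.74
print(f"Akito: {akito_z:.2f}")  # Akito: 1.65

# The larger z-score indicates the better relative performance
print("Imene" if imene_z > akito_z else "Akito")  # Imene
```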
Percentile Definition:
The kth percentile (Pk) is a value where k percent of observations are less than or equal to that value.
Example:
Antonia's SAT Mathematics score of 600 is at the 74th percentile, meaning about 74% of scores are less than or equal to 600 and about 26% are higher.
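To make the definition concrete, the percent of observations at or below a value can be counted directly. This is a minimal sketch; the function name `percent_at_or_below` and the list of scores are hypothetical and do not come from actual SAT data.

```python
def percent_at_or_below(data, value):
    """Percent of observations that are less than or equal to value."""
    count = sum(1 for x in data if x <= value)
    return 100 * count / len(data)

# Hypothetical scores, for illustration only
scores = [450, 480, 510, 520, 540, 560, 580, 600, 640, 700]

# 8 of the 10 scores are at or below 600
print(percent_at_or_below(scores, 600))  # 80.0
```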
Quartile Definition:
Quartiles split data into four equal parts:
Q1: 25th percentile
Q2: 50th percentile (median)
Q3: 75th percentile
Finding Quartiles:
Arrange data in ascending order.
Determine Q2 (median).
Divide the data into a lower half and an upper half; Q1 is the median of the lower half and Q3 is the median of the upper half.
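The steps above can be sketched as follows. This sketch assumes that when the number of observations is odd, the median itself is left out of both halves; other conventions exist and software packages may give slightly different quartiles. The function name `quartiles` and the data are hypothetical.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumption: for an odd number of observations, the median is
    excluded from both halves. Requires at least two observations.
    """
    ordered = sorted(data)           # step 1: ascending order
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]           # values below the median position
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

# Hypothetical data, for illustration only
values = [7, 2, 9, 4, 1, 6, 3, 5]
q1, q2, q3 = quartiles(values)
print(q1, q2, q3)  # 2.5 4.5 6.5
```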
Example:
Using Chicago ride-share data:
Q1 = 2.2 miles
Q2 (Median) = 4.85 miles
Q3 = 9.4 miles
Interquartile Range (IQR) Definition:
IQR measures the range of the middle 50% of data: IQR = Q3 - Q1.
For the ride-share data, IQR = 9.4 - 2.2 = 7.2 miles.
Outlier Definition:
Outliers are extreme observations that can skew analysis and may result from errors.
Steps to Identify Outliers:
Calculate Q1 and Q3.
Compute IQR.
Determine fences:
Lower Fence = Q1 - 1.5(IQR)
Upper Fence = Q3 + 1.5(IQR)
Identify outliers as values below the lower fence or above the upper fence.
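Example:
For the Chicago ride-share data, Q1 = 2.2 miles and Q3 = 9.4 miles.
IQR = 9.4 - 2.2 = 7.2 miles, so 1.5(IQR) = 10.8 miles.
Fences:
Lower Fence = 2.2 - 10.8 = -8.6 miles (no rides fall below this, so there are no low outliers)
Upper Fence = 9.4 + 10.8 = 20.2 miles (the rides of 21.3, 21.6, 23.2, and 42.3 miles exceed this and are outliers)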
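The fence arithmetic can be verified in code. This is a small sketch using the quartiles from the ride-share example; only the four rides named in the notes are checked, since the full data set is not reproduced here. The function name `fences` is an assumption.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) using the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Quartiles from the ride-share example (miles)
lower, upper = fences(2.2, 9.4)
print(f"{lower:.1f}")  # -8.6
print(f"{upper:.1f}")  # 20.2

# The four rides flagged in the notes all lie above the upper fence
flagged = [21.3, 21.6, 23.2, 42.3]
print(all(x > upper for x in flagged))  # True
```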
Learning Objectives:
Compute the five-number summary
Draw and interpret boxplots
Five-Number Summary Includes:
Minimum
Q1
Median (Q2)
Q3
Maximum
Example:
For the Chicago ride-share data:
Summary (in miles): Minimum = 0.4, Q1 = 2.2, Median = 4.85, Q3 = 9.4, Maximum = 42.3
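A five-number summary can be computed from raw data as sketched below. The sketch assumes the median-of-halves method for quartiles, with the median excluded from both halves when the number of observations is odd; the function name `five_number_summary` and the data are hypothetical, since the full ride-share data set is not listed in these notes.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max).

    Assumption: quartiles are medians of the lower and upper halves,
    excluding the median itself when the count is odd.
    """
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])

# Hypothetical data, for illustration only
values = [1, 2, 3, 4, 5, 6, 7, 9, 30]
print(five_number_summary(values))  # (1, 2.5, 5, 8.0, 30)
```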
Steps to Draw a Boxplot:
Calculate lower and upper fences.
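Draw a number line and mark Q1, Q2, and Q3 above it.
Draw a box from Q1 to Q3 with a line at Q2, then draw "whiskers" from the box to the smallest and largest data values that lie within the fences.
Mark each value outside the fences (an outlier) with an asterisk.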
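The steps above determine every element of a boxplot before anything is drawn. The following is a minimal sketch that computes those elements without plotting; the function name `boxplot_parts` and the data are hypothetical, and Q1 and Q3 are passed in rather than computed so that any quartile method can be used.

```python
def boxplot_parts(data, q1, q3):
    """Return whisker ends and outliers for a boxplot.

    Whiskers extend to the most extreme data values that are still
    within the fences; values outside the fences are outliers.
    """
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = sorted(x for x in data if x < lower_fence or x > upper_fence)
    return {
        "lower_whisker": min(inside),
        "upper_whisker": max(inside),
        "outliers": outliers,  # each would be drawn as an asterisk
    }

# Hypothetical data with Q1 = 2.5 and Q3 = 8.0 (median-of-halves method)
values = [1, 2, 3, 4, 5, 6, 7, 9, 30]
parts = boxplot_parts(values, 2.5, 8.0)
print(parts)
# {'lower_whisker': 1, 'upper_whisker': 9, 'outliers': [30]}
```

Note that the whiskers stop at actual data values (1 and 9 here), not at the fences themselves (-5.75 and 16.25).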
Example:
Comparing the distances of shared and non-shared rides showed differences in both median distance and spread, suggesting that riders prefer to share rides on shorter trips.