STAT/MATH 3379 Elementary Statistics: Chapter 3 - Averages and Variation
Z-Score
Let x be a value from a population with mean \mu and standard deviation \sigma.
The z-score (z) for x represents how many standard deviations x is from its population mean.
Formula: z = \frac{x - \mu}{\sigma}.
A positive z-score means the value is above the mean; a negative z-score means it's below the mean.
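As a quick check of the formula, here is a minimal Python sketch. The function name z_score and the sample numbers (mean 70, standard deviation 10) are made up for illustration and do not come from the course material.

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical population: mean 70, standard deviation 10.
print(z_score(85, 70, 10))  # 1.5  (above the mean)
print(z_score(55, 70, 10))  # -1.5 (below the mean)
```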
Percentiles
Divide a data set into hundredths.
For a number p between 1 and 99, the p-th percentile separates the lowest p\% of the data from the highest (100 - p)\%.
A raw score is not the same as a percentile: a score of 80 is not necessarily the 80th percentile. A percentile indicates the relative position of a value within the data set.
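The difference between a raw score and its relative position can be sketched in Python. This assumes one simple convention, the percent of values strictly below the given value; textbooks and software use slightly different rules. The name percent_below and the scores are hypothetical.

```python
def percent_below(data, value):
    """Percent of values in data that are strictly less than value."""
    count = sum(1 for v in data if v < value)
    return 100 * count / len(data)

# Hypothetical exam scores.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

# A raw score of 80 sits above only half of these scores.
print(percent_below(scores, 80))  # 50.0
```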
Quartiles
Divide a data set into four approximately equal pieces.
First Quartile (Q_1): Separates the lowest 25\% of data from the highest 75\%.
Second Quartile (Q_2): Separates the lowest 50\% of data from the highest 50\%. Q_2 is the same as the median.
Third Quartile (Q_3): Separates the lowest 75\% of data from the highest 25\%.
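The notes do not say how the quartiles are computed. The sketch below assumes one common classroom convention: Q_2 is the median, Q_1 is the median of the lower half, and Q_3 is the median of the upper half, with the middle value left out of both halves when the number of values is odd. Other conventions can give slightly different answers. The function name and the data are hypothetical.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    s = sorted(data)
    n = len(s)
    if n < 2:
        raise ValueError("need at least two data values")
    half = n // 2
    lower = s[:half]
    # Skip the middle value when n is odd.
    upper = s[half:] if n % 2 == 0 else s[half + 1:]
    return median(lower), median(s), median(upper)

data = [2, 4, 4, 5, 7, 8, 10, 12, 15]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)  # 4.0 7 11.0
```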
Interquartile Range (IQR)
A measure of variation that gives the range of the middle half of the data.
Formula: IQR = Q_3 - Q_1.
Detecting Outliers
Step 1: Find Q_1 and Q_3.
Step 2: Compute IQR = Q_3 - Q_1.
Step 3: Compute outlier boundaries:
Lower Outlier Boundary = Q_1 - 1.5(IQR)
Upper Outlier Boundary = Q_3 + 1.5(IQR)
Step 4: Any data value less than the lower outlier boundary or greater than the upper outlier boundary is an outlier.
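Steps 2 through 4 can be sketched in Python as follows. The quartiles are passed in as arguments, so the sketch works with whichever quartile convention is used in Step 1. The function names, the data, and the quartile values Q_1 = 4 and Q_3 = 11 are assumptions chosen for illustration.

```python
def outlier_boundaries(q1, q3):
    """Return (lower, upper) outlier boundaries from the quartiles."""
    iqr = q3 - q1                               # Step 2
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr       # Step 3

def find_outliers(data, q1, q3):
    """Return the values that fall outside the outlier boundaries."""
    lower, upper = outlier_boundaries(q1, q3)
    return [x for x in data if x < lower or x > upper]  # Step 4

# Hypothetical data with assumed quartiles Q1 = 4 and Q3 = 11.
data = [2, 4, 4, 5, 7, 8, 10, 12, 40]
print(outlier_boundaries(4, 11))   # (-6.5, 21.5)
print(find_outliers(data, 4, 11))  # [40]
```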
Box-and-Whisker Plot
Highlights important features of a data set using a five-number summary:
Minimum value
First Quartile (Q_1)
Median (Q_2)
Third Quartile (Q_3)
Maximum value
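The five numbers above can be collected with Python's standard library, as in the rough sketch below. It relies on statistics.quantiles with its default method, which may not match the quartile rule taught in this course, so results can differ slightly from hand calculations. The function name and the data are hypothetical.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a data set."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    q1, q2, q3 = quantiles(data, n=4)
    return min(data), q1, q2, q3, max(data)

data = [2, 4, 4, 5, 7, 8, 10, 12, 15]
print(five_number_summary(data))
```

In a box-and-whisker plot, the box runs from Q_1 to Q_3 with a line at the median, and the whiskers extend toward the minimum and maximum values.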