Module 7 – Z-Scores
Z-Scores
Introduction
The lecture introduces the concept of Z-scores and how they are used to describe individual scores within a sample in terms of standard deviations from the mean.
The initial question involves comparing two students, Peter and Mary, from different statistics courses to determine who performed better relative to their class.
Peter scored 95 in professor 1's class where the mean was 85.
Mary scored 90 in professor 2's class where the mean was 80.
Who performed better cannot be determined from the scores and means alone; it also requires the variability (standard deviation) of scores in each class.
Understanding Variability
If the standard deviation on professor 1’s test was 10, Peter’s score of 95 is exactly 1 standard deviation above the mean of 85, so it isn't exceptionally high.
If the standard deviation on professor 1’s test was 1, the same score is 10 standard deviations above the mean, which makes it an extreme outlier.
A score more than 2.5 standard deviations from the mean is generally considered an outlier.
Review of Distribution Concepts
The lecture reviews concepts related to describing a distribution of numbers:
Creating histograms to show score distribution.
Describing the shape of the distribution (unimodal, bimodal, multimodal, symmetrical, skewed).
Measuring central tendency (mean, median, mode).
Describing variability (variance or standard deviation).
These methods describe the entire sample, whereas Z-scores describe individual scores.
Individual Scores and Standard Deviations
An individual score is described by how many standard deviations it lies from the mean.
This builds on the earlier discussion of outliers, which were also identified by their distance from the mean in standard deviations.
Outliers Revisited
The process for determining outliers involves finding the standard deviation and computing the cutoff points 2.5 standard deviations below and above the mean.
Example: mean = 50, standard deviation = 10.
10 * 2.5 = 25
50 - 25 = 25
50 + 25 = 75
Anything lower than 25 or higher than 75 is an outlier.
25 is 2.5 standard deviations below the mean, and 75 is 2.5 standard deviations above the mean.
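The cutoff arithmetic above can be sketched in Python. This is a minimal illustration, not part of the lecture; the function name outlier_cutoffs and the parameter k are hypothetical, with k defaulting to the 2.5 rule used in these notes.

```python
def outlier_cutoffs(mean, sd, k=2.5):
    """Return the (lower, upper) raw-score cutoffs k standard deviations from the mean."""
    distance = k * sd              # 10 * 2.5 = 25 in the example
    return mean - distance, mean + distance


lower, upper = outlier_cutoffs(50, 10)
print(lower, upper)  # 25.0 75.0
```

Scores below the lower value or above the upper value would be treated as outliers under the 2.5 rule.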
Calculating Standard Deviations from the Mean
To calculate how many standard deviations a number is from the mean, follow these steps:
Subtract the mean of the sample from the raw score.
Divide the result by the standard deviation.
Example: Given a distribution with a mean of 50 and a standard deviation of 10, calculate how many standard deviations 25 is from the mean.
25 - 50 = -25
-25 / 10 = -2.5
25 is 2.5 standard deviations below the mean.
The negative sign indicates the number is below the mean.
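The two steps (subtract the mean, then divide by the standard deviation) can be written as a small Python function. This is a sketch for illustration only; the name z_score and the check for a non-positive standard deviation are assumptions, not something the lecture specifies.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations that raw lies from mean (negative = below)."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    deviation = raw - mean     # step 1: subtract the mean from the raw score
    return deviation / sd      # step 2: divide by the standard deviation


print(z_score(25, 50, 10))  # -2.5
```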
Examples of Calculating Standard Deviations
Calculating for 75 in the same distribution (mean of 50 and an SD of 10):
75 - 50 = 25
25 / 10 = 2.5
75 is 2.5 standard deviations above the mean.
Calculating for 70 in the same distribution:
70 - 50 = 20
20 / 10 = 2
70 is 2 standard deviations above the mean.
In other words, 70 lies above the mean by twice the typical distance (one standard deviation) of scores from the mean.
Z-Scores Explained
A Z-score represents how many standard deviations above or below the mean a score is located.
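Written as a formula, the definition matches the subtract-then-divide procedure used in the examples. The symbol X for the raw score is an assumption here; M (mean) and s (standard deviation) follow the notation used later in these notes.

```latex
z = \frac{X - M}{s}
```

For instance, with X = 70, M = 50, and s = 10, this gives z = (70 - 50) / 10 = 2.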
Applying Z-Scores to the Initial Question
Revisiting the initial question with additional information:
Peter scored 95 in professor 1’s course, where the mean was 85 and the standard deviation was 5.6.
Mary scored 90 in professor 2’s course, where the mean was 80 and the standard deviation was 6.2.
To determine who has bragging rights, convert both scores to Z-scores.
Calculating Peter's Z-Score
Peter’s data: score = 95, mean = 85, standard deviation = 5.6.
Subtract mean from the raw score: 95 - 85 = 10
Divide by the standard deviation: 10 / 5.6 ≈ 1.79
Peter’s Z-score is 1.79, meaning he is 1.79 standard deviations above the mean in his distribution.
Calculating Mary's Z-Score
Mary’s data: score = 90, mean = 80, standard deviation = 6.2.
Subtract mean from the raw score: 90 - 80 = 10
Divide by the standard deviation: 10 / 6.2 ≈ 1.61
Mary’s Z-score is 1.61, meaning she is 1.61 standard deviations above the mean in her distribution.
Comparing Z-Scores
Peter has bragging rights because his Z-score (1.79) is higher than Mary’s (1.61).
Peter’s score was more “extreme” than Mary’s score.
Converting to Z-scores is called “standardizing scores” because it places scores from different distributions on a common scale (standard deviations from the mean), which allows direct comparison.
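The Peter and Mary comparison can be reproduced with a short Python sketch. The data values come from the notes; the names z_score, students, and winner are hypothetical and chosen only for this illustration.

```python
def z_score(raw, mean, sd):
    """Standard deviations between a raw score and its class mean."""
    return (raw - mean) / sd


# (score, class mean, class standard deviation) as given in the notes
students = {
    "Peter": (95, 85, 5.6),
    "Mary": (90, 80, 6.2),
}

z_by_student = {name: z_score(*data) for name, data in students.items()}
winner = max(z_by_student, key=z_by_student.get)

for name, z in z_by_student.items():
    print(f"{name}: z = {z:.2f}")
print("Higher relative standing:", winner)
```

Both students scored 10 points above their class mean, so the difference in Z-scores comes entirely from the smaller standard deviation in Peter's class.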
Key Facts about Z-Scores
The absolute value of a Z-score determines how extreme it is, regardless of the sign.
For example, -5.2 is more extreme than +2.5.
The sign indicates whether the score is below or above the mean.
Interpreting Z-Scores
A Z-score is positive if the raw score is above the mean and negative if it is below the mean.
The larger the absolute value of the Z-score, the more unusual the raw score.
Example:
A Z-score of +4 indicates the raw score is unusually far above the mean.
A Z-score of -4 indicates the raw score is unusually far below the mean.
Both scores are equally unusual, but in opposite directions.
Z-Scores and Standard Deviations
Z-scores are measured in units of standard deviations: a Z-score of 1 means the raw score is 1 standard deviation above the mean.
A raw score with a Z-score of 0 is 0 standard deviations from the mean, meaning it equals the mean.
Raw scores with Z-scores more extreme than 2.5 (< -2.5 or > +2.5) are considered outliers.
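The outlier rule can be expressed as a check on the absolute value of the Z-score. The sketch below is illustrative; the function name is_outlier and the cutoff parameter are hypothetical, and the strict comparison reflects the "more extreme than 2.5" wording, so a score exactly 2.5 standard deviations away is not flagged.

```python
def is_outlier(raw, mean, sd, cutoff=2.5):
    """True if raw is more than cutoff standard deviations from the mean, in either direction."""
    z = (raw - mean) / sd
    return abs(z) > cutoff     # the sign is ignored; only the distance matters


# Using the earlier distribution: mean = 50, standard deviation = 10
print(is_outlier(80, 50, 10))  # True  (z = 3.0)
print(is_outlier(75, 50, 10))  # False (z = 2.5, exactly at the cutoff)
print(is_outlier(20, 50, 10))  # True  (z = -3.0)
```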
Example with Variance
Given a set of data with mean (M) = 100 and variance (s^2) = 25, convert scores of 75, 90, and 105 to Z-scores.
First, convert variance to standard deviation:
s = \sqrt{25} = 5
More Examples with Z-Scores
Using M = 100 and s = 5
For 105:
105 - 100 = 5
5 / 5 = 1
Z-score = 1 (1 standard deviation above the mean).
For 90:
90 - 100 = -10
-10 / 5 = -2
Z-score = -2 (2 standard deviations below the mean).
For 75:
75 - 100 = -25
-25 / 5 = -5
Z-score = -5 (5 standard deviations below the mean).
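When only the variance is given, the square root has to be taken before dividing. A minimal Python sketch of that sequence follows; the function name z_scores_from_variance is hypothetical, and math.sqrt is from the Python standard library.

```python
import math


def z_scores_from_variance(scores, mean, variance):
    """Convert raw scores to Z-scores when the variance (s^2) is known."""
    sd = math.sqrt(variance)   # s = sqrt(s^2), e.g. sqrt(25) = 5
    return [(x - mean) / sd for x in scores]


print(z_scores_from_variance([75, 90, 105], 100, 25))  # [-5.0, -2.0, 1.0]
```

Dividing by the variance instead of the standard deviation is an easy mistake to make here; it would give -1.0 for the score of 75 rather than -5.0.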
Additional Examples
Given M = 120 and s^2 = 49, so s = \sqrt{49} = 7
For 130:
130 - 120 = 10
10 / 7 ≈ 1.43
Z-score = 1.43 (1.43 standard deviations above the mean).
For 95:
95 - 120 = -25
-25 / 7 ≈ -3.57
Z-score = -3.57 (3.57 standard deviations below the mean).
For 121:
121 - 120 = 1
1 / 7 ≈ 0.14
Z-score = 0.14 (0.14 standard deviations above the mean).
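The verbal interpretations in these examples (magnitude plus "above" or "below") follow directly from the size and sign of the Z-score. The sketch below generates the same kind of sentence; the function name describe and the two-decimal rounding are assumptions made for illustration.

```python
def describe(raw, mean, sd):
    """Describe a raw score's position in words, rounded to two decimals."""
    z = (raw - mean) / sd
    if z == 0:
        return f"{raw} is at the mean (z = 0.00)"
    direction = "above" if z > 0 else "below"   # the sign gives the direction
    return f"{raw} is {abs(z):.2f} standard deviations {direction} the mean (z = {z:+.2f})"


for score in (130, 95, 121):
    print(describe(score, 120, 7))
# 130 is 1.43 standard deviations above the mean (z = +1.43)
# 95 is 3.57 standard deviations below the mean (z = -3.57)
# 121 is 0.14 standard deviations above the mean (z = +0.14)
```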
Normal Distributions and Z-Scores
The lecture notes that it is important for the distributions being analyzed to be mostly “normal.”
The next module will cover further applications of Z-scores in the context of normal distributions.