Module 7 – Z-Scores

The lecture introduces the concept of Z-scores and how they are used to describe individual scores within a sample in terms of standard deviations from the mean.
The initial question involves comparing two students, Peter and Mary, from different statistics courses to determine who performed better relative to their class.
Peter scored 95 in professor 1's class where the mean was 85.
Mary scored 90 in professor 2's class where the mean was 80.
The answer to who performed better cannot be determined without knowing the variability (standard deviation) of scores in each class.

If the standard deviation on professor 1’s test was 10, Peter’s score is exactly 1 standard deviation above the mean, so it is not exceptionally high.
If the standard deviation on professor 1’s test was 1, Peter’s score is 10 standard deviations above the mean, which makes it an outlier.
A score more than 2.5 standard deviations from the mean is generally considered an outlier.
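
To illustrate why the variability matters, here is a minimal Python sketch (not part of the lecture; the helper name `sds_from_mean` is hypothetical) that measures Peter’s distance from the mean under the two standard deviations considered above.

```python
def sds_from_mean(score, mean, sd):
    """Return how many standard deviations `score` lies from `mean`."""
    return (score - mean) / sd

peter_score, class_mean = 95, 85

# Same raw score and mean, two different amounts of variability.
for sd in (10, 1):
    distance = sds_from_mean(peter_score, class_mean, sd)
    print(f"SD = {sd}: Peter is {distance:g} standard deviations above the mean")
```

The same 10-point gap counts as ordinary when scores are spread widely and as extreme when they are tightly clustered.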

The lecture reviews concepts related to describing a distribution of numbers:
- Creating histograms to show score distribution.
- Describing the shape of the distribution (unimodal, bimodal, multimodal, symmetrical, skewed).
- Measuring central tendency (mean, median, mode).
- Describing variability (variance or standard deviation).
These methods describe the entire sample, whereas Z-scores describe individual scores.

To identify outliers, find the standard deviation, then compute the cutoff points that lie 2.5 standard deviations below and above the mean.
Example: mean = 50, standard deviation = 10.
- 10 * 2.5 = 25
- 50 - 25 = 25
- 50 + 25 = 75
- Anything lower than 25 or higher than 75 is an outlier.
25 is 2.5 standard deviations below the mean, and 75 is 2.5 standard deviations above the mean.
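
The cutoff calculation above can be sketched in Python. This is an illustrative sketch only; the function name `outlier_cutoffs` and its `threshold` parameter are assumptions, with the default of 2.5 taken from the lecture’s rule.

```python
def outlier_cutoffs(mean, sd, threshold=2.5):
    """Return the (lower, upper) cutoffs that lie `threshold` SDs from the mean."""
    margin = threshold * sd          # 10 * 2.5 = 25 in the lecture example
    return mean - margin, mean + margin

lower, upper = outlier_cutoffs(mean=50, sd=10)
print(lower, upper)  # 25.0 75.0

def is_outlier(score, mean, sd, threshold=2.5):
    """A score counts as an outlier when it falls outside the two cutoffs."""
    low, high = outlier_cutoffs(mean, sd, threshold)
    return score < low or score > high
```

Under this sketch, scores equal to a cutoff (exactly 25 or 75) are not flagged, which matches the "lower than 25 or higher than 75" wording.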

To calculate how many standard deviations a number is from the mean, follow these steps:
1. Subtract the mean of the sample from the raw score.
2. Divide the result by the standard deviation.
Example: Given a distribution with a mean of 50 and a standard deviation of 10, calculate how many standard deviations 25 is from the mean.
- 25 - 50 = -25
- -25 / 10 = -2.5
- 25 is 2.5 standard deviations below the mean.
The negative sign indicates the number is below the mean.
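
The two steps can be written as a single formula. The symbols M (mean) and s (standard deviation) follow the notation used later in these notes; X for the raw score and z for the result are assumed labels.

```latex
z = \frac{X - M}{s}
\qquad\text{for example}\qquad
z = \frac{25 - 50}{10} = -2.5
```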

Calculating for 75 in the same distribution (mean of 50 and a SD of 10):
- 75 - 50 = 25
- 25 / 10 = 2.5
- 75 is 2.5 standard deviations above the mean.
Calculating for 70 in the same distribution:
- 70 - 50 = 20
- 20 / 10 = 2
- 70 is 2 standard deviations above the mean.
In other words, 70 lies above the mean by twice the standard deviation, which is the typical distance of scores from the mean.

A Z-score represents how many standard deviations above or below the mean a score is located.
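
As a minimal sketch of the two-step procedure (subtract the mean, then divide by the standard deviation), the hypothetical Python function `z_score` below reproduces the three worked examples for a distribution with mean 50 and standard deviation 10.

```python
def z_score(raw, mean, sd):
    """Return how many standard deviations `raw` lies above (+) or below (-) `mean`."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    deviation = raw - mean   # step 1: subtract the mean from the raw score
    return deviation / sd    # step 2: divide by the standard deviation

for raw in (25, 75, 70):
    print(raw, z_score(raw, mean=50, sd=10))
# 25 -2.5
# 75 2.5
# 70 2.0
```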

Revisiting the initial question with additional information:
- Peter scored 95 in professor 1’s course, where the mean was 85 and the standard deviation was 5.6.
- Mary scored 90 in professor 2’s course, where the mean was 80 and the standard deviation was 6.2.
To determine who has bragging rights, convert both scores to Z-scores.

Peter’s data: score = 95, mean = 85, standard deviation = 5.6.
1. Subtract mean from the raw score: 95 - 85 = 10
2. Divide by the standard deviation: 10 / 5.6 = 1.79
Peter’s Z-score is 1.79, meaning he is 1.79 standard deviations above the mean in his distribution.

Mary’s data: score = 90, mean = 80, standard deviation = 6.2.
1. Subtract mean from the raw score: 90 - 80 = 10
2. Divide by the standard deviation: 10 / 6.2 = 1.61
Mary’s Z-score is 1.61, meaning she is 1.61 standard deviations above the mean in her distribution.

Peter has bragging rights because his Z-score (1.79) is higher than Mary’s (1.61).
Peter’s score was more “extreme” than Mary’s score.
Converting to Z-scores is called “standardizing scores” because it transforms scores in a way that allows comparison across different distributions.
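
The comparison between Peter and Mary can be checked with a short Python sketch. The function name `z_score` is a hypothetical helper, and the numbers are the ones given in the lecture.

```python
def z_score(raw, mean, sd):
    """Standardize a raw score: (raw - mean) / sd."""
    return (raw - mean) / sd

peter_z = z_score(95, mean=85, sd=5.6)
mary_z = z_score(90, mean=80, sd=6.2)

print(f"Peter: z = {peter_z:.2f}")  # Peter: z = 1.79
print(f"Mary:  z = {mary_z:.2f}")   # Mary:  z = 1.61

# Both are 10 points above their class mean, but Peter's class had less
# spread, so the same 10 points places him further out in his distribution.
better = "Peter" if peter_z > mary_z else "Mary"
print(better)  # Peter
```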

The absolute value of a Z-score determines how extreme it is, regardless of the sign.
- For example, -5.2 is more extreme than +2.5.
The sign indicates whether the score is below or above the mean.

A Z-score is positive if the raw score is above the mean and negative if it is below the mean.
The larger the absolute value of the Z-score, the more unusual the raw score.
Example:
- A Z-score of +4 indicates the raw score is unusually far above the mean.
- A Z-score of -4 indicates the raw score is unusually far below the mean.
- Both scores are equally unusual but in different directions.
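
To illustrate that extremeness depends on the absolute value and not the sign, this small Python sketch ranks the Z-scores mentioned above from most to least extreme. The list and variable names are illustrative only.

```python
z_scores = [2.5, -5.2, 4, -4]

# Sort by distance from zero, largest first; the sign is ignored for ranking.
by_extremeness = sorted(z_scores, key=abs, reverse=True)
print(by_extremeness)  # [-5.2, 4, -4, 2.5]

# +4 and -4 are equally extreme; only their direction differs.
print(abs(4) == abs(-4))  # True
```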

Z-scores are expressed in units of standard deviations: a Z-score of 1 means the raw score is 1 standard deviation above the mean.
A raw score with a Z-score of 0 is 0 standard deviations from the mean, meaning it is the mean.
Raw scores with Z-scores more extreme than 2.5 (< -2.5 or > +2.5) are considered outliers.

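Given a set of data with mean (M) = 100 and variance (s^2) = 25, convert scores of 75, 90, and 105 to Z-scores.
First, convert variance to standard deviation:
- s = \sqrt{25} = 5
Then subtract the mean from each score and divide by the standard deviation:
- (75 - 100) / 5 = -5
- (90 - 100) / 5 = -2
- (105 - 100) / 5 = 1
By the 2.5 rule, the score of 75 (Z = -5) is an outlier; 90 and 105 are not.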
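
This exercise can also be worked through in Python. The sketch below is illustrative only: the variable names are assumptions, and the 2.5 threshold is the outlier rule stated earlier in these notes.

```python
import math

mean = 100
variance = 25
sd = math.sqrt(variance)  # standard deviation is the square root of the variance
OUTLIER_THRESHOLD = 2.5   # cutoff used in this lecture

# Standardize each raw score: (raw - mean) / sd
z_scores = {raw: (raw - mean) / sd for raw in (75, 90, 105)}

for raw, z in z_scores.items():
    label = "outlier" if abs(z) > OUTLIER_THRESHOLD else "not an outlier"
    print(f"{raw}: z = {z:+.1f} ({label})")
# 75: z = -5.0 (outlier)
# 90: z = -2.0 (not an outlier)
# 105: z = +1.0 (not an outlier)
```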

The lecture notes that it is important for the distributions involved to be mostly “normal.”
The next module will cover further applications of Z-scores in the context of normal distributions.