Notes on Transformations, Z-Scores, and Percentiles

Transforming Data: Adding/Subtracting and Multiplying/Dividing

  • Adding/Subtracting a constant 'a':
    • Measures of center/location (mean, median, percentiles) change by $+a$ (adding) or $-a$ (subtracting).
    • Measures of variability (range, IQR, standard deviation) do not change.
    • Shape of the distribution does not change. Shifts distribution left/right.
  • Multiplying/Dividing by a positive constant 'b':
    • Measures of center/location and variability are multiplied/divided by $b$.
    • Shape of the distribution does not change. Scales the distribution.
  • Combined Transformations (linear transformation $Y = bX + a$):
    • Mean: $\mu_Y = b\mu_X + a$
    • Standard Deviation: $\sigma_Y = |b|\sigma_X$
    • Shape remains unchanged.
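
A minimal Python sketch of the rules above, using only the standard library. The data set, the constants b = 3 and a = 10, and the helper names `linear_transform` and `summarize` are made up for illustration; the sketch uses the population standard deviation and Python's default quartile method, which is only one of several conventions.

```python
import statistics

def linear_transform(values, b, a):
    """Apply Y = b*X + a to every value (hypothetical helper)."""
    return [b * x + a for x in values]

def summarize(values):
    """Return measures of center and variability for a list of numbers."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.fmean(values),
        "median": median,
        "sd": statistics.pstdev(values),      # population standard deviation
        "range": max(values) - min(values),
        "iqr": q3 - q1,
    }

x = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]  # made-up data
b, a = 3.0, 10.0                    # made-up positive scale and shift

before = summarize(x)
after = summarize(linear_transform(x, b, a))

for name in before:
    print(f"{name:>6}: before = {before[name]:.3f}, after = {after[name]:.3f}")
```

In the printed comparison, the mean and median should follow b·(old value) + a, while the SD, range, and IQR should simply be multiplied by b, because the shift a cancels out of any difference between two values.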

Standardizing Distributions: Z-scores

  • Definition: A z-score measures how many standard deviations a data value $X$ is from the mean $\mu$.
  • Formula: $Z = \frac{X - \mu}{\sigma}$
  • Effects on Distribution:
    • Shape: Unchanged.
    • Center: Mean of z-scores is 0.
    • Variability: Standard deviation of z-scores is 1.
  • Purpose: Allows comparison of values from different distributions on a common scale.
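
As a rough illustration of standardizing, the Python sketch below converts a made-up list of scores to z-scores and then compares two values from different distributions. The function names `z_score` and `z_scores`, the data, and the two test distributions are assumptions for illustration, and the population standard deviation is used so that the standardized values have SD exactly 1.

```python
import statistics

def z_score(x, mu, sigma):
    """Number of standard deviations x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

def z_scores(values):
    """Standardize a whole data set using its own mean and population SD."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [z_score(x, mu, sigma) for x in values]

scores = [62.0, 70.0, 75.0, 81.0, 92.0]  # made-up data
z = z_scores(scores)
print("mean of z-scores:", round(statistics.fmean(z), 6))
print("SD of z-scores:  ", round(statistics.pstdev(z), 6))

# Comparing values from two hypothetical tests on a common scale:
z_test_a = z_score(80, mu=70, sigma=5)    # 2 SDs above its mean
z_test_b = z_score(85, mu=80, sigma=10)   # 0.5 SDs above its mean
print("higher relative standing:", "Test A" if z_test_a > z_test_b else "Test B")
```

Although 85 is the larger raw score, the 80 on the hypothetical Test A sits farther above its own mean in standard deviation units, which is the kind of comparison z-scores make possible.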

Percentiles and Cumulative Relative Frequency

  • p-th percentile: The value $x_p$ at or below which $p$ percent of the observations fall.
    • Formula: $P(X \le x_p) = \frac{p}{100}$
  • Cumulative Relative Frequency Graphs: Used to estimate percentiles for individual values and vice versa.
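
The two directions of reading a cumulative relative frequency graph (value to percentile, and percentile to value) can be sketched in Python as follows. The data and both function names are made up for illustration, and textbooks and software use several slightly different percentile conventions; this sketch uses the "proportion at or below" convention from the formula above.

```python
def cumulative_relative_frequency(values, x):
    """Proportion of observations less than or equal to x."""
    return sum(1 for v in values if v <= x) / len(values)

def value_at_percentile(values, p):
    """Smallest observed value with at least p percent of the data at or below it."""
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    for count, v in enumerate(ordered, start=1):
        # Integer comparison of count/n >= p/100 avoids floating-point surprises.
        if count * 100 >= p * n:
            return v
    return ordered[-1]

data = [12, 15, 15, 18, 20, 22, 25, 28, 30, 35]  # made-up data

# Value -> percentile: what proportion of observations are at or below 20?
print(cumulative_relative_frequency(data, 20))

# Percentile -> value: which value sits at the 50th percentile?
print(value_at_percentile(data, 50))
```

Plotting `cumulative_relative_frequency` against each value would give the points of the graph itself; the two functions correspond to reading it vertically and horizontally.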

Check Your Understanding (Knoebels Amusement Park Example)

  • Initial: Mean $\mu = 1.705$ dollars, Std Dev $\sigma = 0.447$ dollars.
  1. Convert to Cents (Multiply by 100):
    • Shape: Unchanged.
    • Mean: $1.705 \times 100 = 170.5$ cents.
    • Std Dev: $0.447 \times 100 = 44.7$ cents.
  2. Increase by 25 Cents (Add 25):
    • Shape: Unchanged.
    • Mean: $170.5 + 25 = 195.5$ cents.
    • Std Dev: $44.7$ cents (unchanged by addition).
  3. Convert to Z-scores:
    • Shape: Unchanged.
    • Mean: 0.
    • Std Dev: 1.
    • The new mean (195.5 cents) and SD (44.7 cents) would be used for the z-score calculation: $Z' = \frac{X' - 195.5}{44.7}$
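
The three steps can be checked numerically with a short Python sketch that works only from the summary statistics given (mean 1.705 dollars, SD 0.447 dollars). The helper names `transform_summary` and `z_prime` are hypothetical, introduced only for this illustration.

```python
def transform_summary(mean, sd, b=1.0, a=0.0):
    """Mean and SD of Y = b*X + a, given the mean and SD of X."""
    return b * mean + a, abs(b) * sd

mean_dollars, sd_dollars = 1.705, 0.447

# Step 1: convert dollars to cents (multiply by 100).
mean_cents, sd_cents = transform_summary(mean_dollars, sd_dollars, b=100)

# Step 2: increase every price by 25 cents (add 25).
mean_raised, sd_raised = transform_summary(mean_cents, sd_cents, a=25)

# Step 3: standardize using the new mean and SD.
def z_prime(x_cents):
    """z-score of a price (in cents) after the 25-cent increase."""
    return (x_cents - mean_raised) / sd_raised

print(round(mean_cents, 1), round(sd_cents, 1))
print(round(mean_raised, 1), round(sd_raised, 1))
print(round(z_prime(mean_raised), 3), round(z_prime(mean_raised + sd_raised), 3))
```

A price equal to the new mean standardizes to 0 and a price one SD above it standardizes to 1, matching the mean and standard deviation stated for the z-score distribution.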

Summary: Key Concepts and Formulas

  • Transformations:
    • Add/Subtract 'a': Location measures change by 'a'. Variability (range, IQR, SD) and shape unchanged.
    • Multiply/Divide by 'b': Location measures and variability scale by 'b'. Shape unchanged.
    • Linear Transformation $Y = bX + a$:
      • Mean: $\mu_Y = b\mu_X + a$
      • Std Dev: $\sigma_Y = |b|\sigma_X$
  • Z-scores: $Z = \frac{X - \mu}{\sigma}$
    • Standardized distribution has mean 0, standard deviation 1, and original shape.
  • Percentiles: The $p$-th percentile is the value $x_p$ such that $P(X \le x_p) = \frac{p}{100}$.
    • Cumulative relative frequency graphs help visualize and estimate percentiles.