Variance & Z-scores

GY1421 Working With Geographic Information: Week 3 Summary

Transformations & Distributions

Data Transformations & Z-scores

  • Importance of Normal Distribution: Many statistical techniques assume that the data are approximately normally distributed. If the data are skewed, a transformation may be needed before those techniques are applied.

  • Types of Distributions:

    • Symmetrical Distributions: Single peak, bell-shaped curve.

    • Multi-modal Distributions: Multiple peaks, which may indicate that the data contain different groups.

    • Skewed Distributions: Asymmetrical with long tails.

      • Positive skew: Tail to the right.

      • Negative skew: Tail to the left.
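
The direction of skew described above can also be checked numerically. The following is a minimal sketch, assuming the moment-based skewness coefficient (third central moment divided by the 1.5 power of the second); the function name `sample_skewness` and the three small data sets are hypothetical and made up purely for illustration.

```python
import statistics


def sample_skewness(values):
    """Moment-based skewness: m3 / m2**1.5, using population moments.

    Assumption: this particular coefficient is chosen for illustration;
    other skewness definitions exist.
    """
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical data, invented for this example only.
symmetric = [1, 2, 3, 4, 5]          # mirror image around the mean
right_tailed = [1, 1, 2, 2, 3, 10]   # long tail to the right
left_tailed = [1, 8, 9, 9, 10, 10]   # long tail to the left

print(sample_skewness(symmetric))     # zero: no skew
print(sample_skewness(right_tailed))  # greater than zero: positive skew
print(sample_skewness(left_tailed))   # less than zero: negative skew
```

A value near zero suggests a roughly symmetrical distribution; the sign shows which side the long tail is on.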

Transformation Techniques:

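  • For Moderately Positive Skew:

    • Square-Root: NEWX = SQRT(X)

  • For Substantially Positive Skew:

    • Logarithmic: NEWX = LG10(X)

    • Logarithmic, where the data include zero: NEWX = LG10(X + C)

  • For Moderately Negative Skew:

    • Reflect & Square-Root: NEWX = SQRT(K - X)

  • For Substantially Negative Skew:

    • Reflect & Logarithmic: NEWX = LG10(K - X)

    • Reflect & Inverse (for more severe skew): NEWX = 1/(K - X)

  • Constants: K is the value from which each score is subtracted to reflect the distribution, commonly the largest score plus 1, so that the smallest reflected score is 1. C is a constant added so that no score is zero, commonly chosen so that the smallest score becomes 1.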
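
The transformations listed above can be sketched in a few lines of Python. This is an illustrative sketch, not a prescribed implementation: the function names are hypothetical, K is assumed to be the largest value plus 1, C is assumed to be 1, and the example data are invented.

```python
import math


def sqrt_transform(xs):
    """NEWX = SQRT(X): for moderately positive skew."""
    return [math.sqrt(x) for x in xs]


def log10_transform(xs, c=0.0):
    """NEWX = LG10(X + C): for substantially positive skew.

    Pass c > 0 when the data contain zeros (c=1 is an assumed choice).
    """
    return [math.log10(x + c) for x in xs]


def reflect(xs):
    """K - X, with K assumed to be max(X) + 1 so the smallest result is 1."""
    k = max(xs) + 1
    return [k - x for x in xs]


def reflect_sqrt(xs):
    """NEWX = SQRT(K - X): for moderately negative skew."""
    return sqrt_transform(reflect(xs))


def reflect_log10(xs):
    """NEWX = LG10(K - X): for substantially negative skew."""
    return log10_transform(reflect(xs))


# Hypothetical data, invented for this example only.
positive_skew = [0, 1, 4, 9, 99]   # contains a zero, long right tail
negative_skew = [1, 6, 9, 10]      # long left tail

print(log10_transform(positive_skew, c=1))
print(reflect(negative_skew))       # [10, 5, 2, 1]
print(reflect_sqrt(negative_skew))
print(reflect_log10(negative_skew))
```

Reflection reverses the order of the values (the largest original value becomes the smallest), so results from a reflected variable need to be interpreted in the opposite direction.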

Z-scores:

  • Indicate the number of standard deviations from the mean.

    • Negative z-score: below mean.

    • Positive z-score: above mean.

  • The Z-score distribution has mean = 0 and standard deviation = 1.

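    • About 34.1% of values lie between the mean and 1 SD on either side (68.2% within ±1 SD); a further 13.6% lie between 1 and 2 SDs on either side (95.4% within ±2 SDs).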

  • Z-scores can be used with a standard normal probability table to find the proportion of the distribution that lies below, above or between particular values.
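
The points above can be illustrated with a short sketch. It computes z = (x - mean) / SD for each value and uses the error function in place of a printed standard normal table. The function names and the data are hypothetical, and the population standard deviation is an assumption made here; the sample standard deviation could be used instead.

```python
import math
import statistics


def z_scores(values):
    """z = (x - mean) / SD for each value.

    Assumption: SD is the population standard deviation (pstdev).
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def standard_normal_cdf(z):
    """Proportion of the standard normal distribution lying below z.

    This is the value a standard normal probability table would give.
    """
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical data with mean 5 and population SD 2.
values = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_scores(values))  # negative below the mean, positive above it

# Share of the distribution between the mean and 1 SD, and between 1 and 2 SDs.
between_0_and_1 = standard_normal_cdf(1) - standard_normal_cdf(0)
between_1_and_2 = standard_normal_cdf(2) - standard_normal_cdf(1)
print(round(between_0_and_1, 3))  # 0.341
print(round(between_1_and_2, 3))  # 0.136
```

For example, a value with z = 1 has roughly 84% of the distribution below it, since 50% lies below the mean and about 34.1% lies between the mean and 1 SD above it.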