Variance & Z-scores
GY1421 Working With Geographic Information: Week 3 Summary
Transformations & Distributions
Data Transformations & Z-scores
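Importance of Normal Distribution: Many statistical techniques assume that data are (approximately) normally distributed, and the mean and standard deviation describe strongly skewed data poorly. If data are skewed, a transformation may be needed before analysis.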
Types of Distributions:
Symmetrical Distributions: Single peak, bell-shaped curve.
Multi-modal Distributions: Multiple peaks, which may indicate distinct groups within the data.
Skewed Distributions: Asymmetrical with long tails.
Positive skew: Tail to the right.
Negative skew: Tail to the left.
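The direction of skew can also be checked numerically rather than by eye. The sketch below is illustrative only: it assumes the moment coefficient of skewness (the notes do not name a particular measure), and the function name and sample data are hypothetical. A positive result suggests a tail to the right, a negative result a tail to the left, and a value near zero a roughly symmetrical distribution.

```python
import statistics

def skewness(values):
    """Moment coefficient of skewness: mean cubed deviation divided by SD cubed.

    Uses the population standard deviation (an assumption; other
    definitions apply a small-sample correction).
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    n = len(values)
    return sum((v - mean) ** 3 for v in values) / (n * sd ** 3)

# Hypothetical data: most values are small, one is very large (long right tail).
right_tailed = [1, 2, 2, 3, 3, 3, 4, 15]
# Mirror image of the same data: long left tail.
left_tailed = [16 - v for v in right_tailed]
symmetrical = [1, 2, 3, 4, 5]

print(skewness(right_tailed))  # positive value: positive skew
print(skewness(left_tailed))   # negative value: negative skew
print(skewness(symmetrical))   # zero or very close to it
```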
Transformation Techniques:
For Moderately Positive Skew:
Square-Root: NEWX = SQRT(X)
For Substantially Positive Skew: Logarithmic: NEWX = LG10(X). If the data include zeros: NEWX = LG10(X + C), where C is a constant added to every value so that the smallest value is 1.
For Moderately Negative Skew:
Reflect & Square-Root: NEWX = SQRT(K - X), where K is a constant (usually the largest value + 1) chosen so that the smallest reflected value is 1.
For Substantially Negative Skew: Reflect & Logarithmic: NEWX = LG10(K - X). Reflect & Inverse (for severe skew): NEWX = 1/(K - X).
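The transformations above can be sketched in Python as a rough illustration. The function names and sample scores are hypothetical, and the defaults are assumptions: K is taken as the largest value + 1, and C is passed in by the caller (C = 1 when the smallest value is 0). SQRT and LG10 in the notes correspond here to math.sqrt and math.log10.

```python
import math

def reflect(xs, k=None):
    """Reflect scores as K - X. K defaults to max(xs) + 1 (assumed convention)."""
    if k is None:
        k = max(xs) + 1
    return [k - x for x in xs]

def sqrt_transform(xs):
    """NEWX = SQRT(X); values must be non-negative."""
    return [math.sqrt(x) for x in xs]

def log10_transform(xs, c=0):
    """NEWX = LG10(X + C); X + C must be greater than zero."""
    return [math.log10(x + c) for x in xs]

def inverse_transform(xs):
    """NEWX = 1 / X; values must be non-zero."""
    return [1 / x for x in xs]

scores = [0, 3, 8, 15, 99]  # hypothetical scores that include a zero

moderate_positive = sqrt_transform(scores)                 # SQRT(X)
substantial_positive = log10_transform(scores, c=1)        # LG10(X + C), C = 1
moderate_negative = sqrt_transform(reflect(scores))        # SQRT(K - X)
substantial_negative = log10_transform(reflect(scores))    # LG10(K - X)
```

Reflection reverses the order of the scores, so after a reflected transformation the originally highest values become the lowest. This needs to be kept in mind when interpreting the transformed variable.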
Z-scores:
Indicate the number of standard deviations a value lies from the mean: z = (value - mean) / standard deviation.
Negative z-score: below mean.
Positive z-score: above mean.
The Z-score distribution has mean = 0 and standard deviation = 1.
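34.1% of values lie between the mean and 1 SD on either side (about 68.2% within ±1 SD); a further 13.6% lie between 1 and 2 SDs on either side (about 95.4% within ±2 SDs).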
Z-scores can be used to find the proportion of the data lying above or below a specific value: convert the value to a z-score, then look it up in a standard normal probability table.
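As a rough sketch, the z-score calculation and the table lookup can be reproduced in Python. NormalDist from the standard library statistics module (Python 3.8+) stands in for the printed probability table. The function names and sample values are hypothetical, and the population standard deviation is assumed (a sample standard deviation would give slightly different z-scores).

```python
from statistics import NormalDist, fmean, pstdev

def z_scores(values):
    """Convert each value to z = (value - mean) / SD, using the population SD."""
    mean = fmean(values)
    sd = pstdev(values)
    return [(v - mean) / sd for v in values]

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

def proportion_below(z):
    """Proportion of a normal distribution lying below a given z-score."""
    return std_normal.cdf(z)

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data: mean 5, SD 2
zs = z_scores(values)              # first value is 1.5 SDs below the mean

# Areas quoted in the notes, computed instead of read from a table.
between_mean_and_1sd = proportion_below(1) - proportion_below(0)  # about 0.341
between_1sd_and_2sd = proportion_below(2) - proportion_below(1)   # about 0.136
```

For example, proportion_below(zs[-1]) gives the share of a normal distribution expected to fall below the largest value in the hypothetical data.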