Normal Distribution Notes

Probability Density Curve

  • A probability density curve describes the distribution of a continuous variable.
  • It indicates the proportion of the population within a given interval.
  • The area under the curve between two values, a and b, represents:
    • The proportion of the population with values between a and b.
    • The probability that a randomly selected value from the population will be between a and b.

Properties of Probability Density Curves

  • The area above a single point is zero, meaning for a continuous random variable x, the probability that x = a is 0 for any number a.
  • P(a < x < b) = P(a \leq x \leq b), meaning including or excluding endpoints doesn't change the probability.
  • The total area under the curve is 1, representing the entire population.
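A minimal sketch of these properties, assuming Python 3.8+ and using a standard normal curve as a stand-in for a generic density curve. The helper name area_between is hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

# Assumption: a standard normal curve stands in for a generic density curve.
curve = NormalDist(mu=0.0, sigma=1.0)


def area_between(a: float, b: float) -> float:
    """Area under the curve between a and b, i.e. P(a <= x <= b)."""
    return curve.cdf(b) - curve.cdf(a)


# The area above a single point is zero, so endpoints do not matter.
point_area = area_between(1.0, 1.0)

# Over a very wide interval the area approaches the total area of 1.
total_area = area_between(-10.0, 10.0)

print(point_area)
print(total_area)
```

Because the area is computed as a difference of cumulative areas, there is no separate case for open versus closed intervals, which mirrors the endpoint property above.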

Normal Distribution

  • Many statistical procedures rely on the normal curve.
  • A population represented by a normal curve is normally distributed.
  • The population mean determines the location of the peak of the normal curve.
  • The population standard deviation measures the spread:
    • Large standard deviation: wide, flat curve.
    • Small standard deviation: tall, narrow curve.
  • In a normal distribution:
    • Mean = Median = Mode
  • Empirical Rule:
    • Approximately 68% of the data lie within one standard deviation of the mean.
    • Approximately 95% of the data lie within two standard deviations of the mean.
    • Approximately 99.7% of the data lie within three standard deviations of the mean.
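The Empirical Rule percentages can be checked numerically. This is a small sketch, assuming Python 3.8+ (for statistics.NormalDist); the helper name share_within is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def share_within(k: float) -> float:
    """Proportion of a normal population within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.4f}")
```

The results are close to 0.68, 0.95, and 0.997. The same proportions hold for any mean and standard deviation, since only the distance from the mean in standard deviations matters.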

Standardization (Z-score)

  • The z-score indicates the number of standard deviations a data value is above or below the mean.
  • Standardization formula: z = \frac{x - \mu}{\sigma}, where:
    • x is a value from a normal distribution.
    • \mu is the mean of the distribution.
    • \sigma is the standard deviation of the distribution.
  • Example: For a woman with height x = 67 inches from a normal population with \mu = 64 inches and \sigma = 3 inches, the z-score is z = \frac{67 - 64}{3} = 1.
    • A value of 67 from a normal distribution with mean 64 and standard deviation 3 is equivalent to a z-score of 1 from the standard normal distribution N(0,1).
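The height example can be reproduced with a short sketch, assuming Python 3.8+. The function name z_score is hypothetical; the final two lines illustrate the equivalence by comparing the area to the left of 67 in the height distribution with the area to the left of z = 1 in the standard normal distribution.

```python
from statistics import NormalDist


def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


z = z_score(67, 64, 3)
print(z)  # 1.0

# Areas to the left agree before and after standardizing.
area_original = NormalDist(mu=64, sigma=3).cdf(67)
area_standard = NormalDist().cdf(z)
print(area_original, area_standard)
```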

Standard Normal Curve

  • A normal distribution with a mean of 0 and a standard deviation of 1 is the standard normal distribution.
  • Z-scores converted from a normal distribution follow a standard normal distribution.

Finding Area Between Z-scores in Excel

  • Use the NORM.S.DIST function to find the area to the left of a z-score under the standard normal curve.
  • Syntax: NORM.S.DIST(z_score, cumulative)
    • z_score: The z-score.
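    • cumulative: TRUE returns the cumulative probability (the area to the left of the z-score); FALSE returns the height of the density curve at the z-score.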
  • To find the area between two z-scores, subtract the smaller area from the larger area.
  • For example, to find the area between z = -1.45 and z = 0.42:
    • Area to the left of -1.45: NORM.S.DIST(-1.45, TRUE) = 0.0735
    • Area to the left of 0.42: NORM.S.DIST(0.42, TRUE) = 0.6628
    • Area between -1.45 and 0.42: 0.6628 - 0.0735 = 0.5893
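For comparison outside Excel, the same calculation can be sketched in Python 3.8+, where NormalDist().cdf(z) plays the role of NORM.S.DIST(z, TRUE). The variable names are illustrative only. Because this version subtracts unrounded areas, the last decimal place may differ slightly from the result of subtracting the two rounded values above.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Areas to the left of each z-score.
left_of_low = std_normal.cdf(-1.45)
left_of_high = std_normal.cdf(0.42)

# Area between the two z-scores: larger area minus smaller area.
area_between = left_of_high - left_of_low

print(f"area left of -1.45: {left_of_low:.4f}")
print(f"area left of  0.42: {left_of_high:.4f}")
print(f"area between:       {area_between:.4f}")
```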

Finding Z-score for a Given Area in Excel

  • Use the NORM.S.INV function to find the z-score corresponding to a given area to the left under the standard normal curve.
  • Syntax: NORM.S.INV(probability)
    • probability: The area to the left of the desired z-score.
  • Example: To find the z-score with an area of 0.26 to its left:
    • NORM.S.INV(0.26) = -0.6433
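A comparable lookup in Python 3.8+ uses NormalDist().inv_cdf, which takes the area to the left and returns the z-score, much as NORM.S.INV does. This is a sketch for cross-checking the Excel result, not a replacement for it; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# z-score with an area of 0.26 to its left.
z = std_normal.inv_cdf(0.26)
print(f"z = {z:.4f}")

# Round trip: the area to the left of z should be 0.26 again.
area_check = std_normal.cdf(z)
print(f"area to the left of z = {area_check:.4f}")
```

The z-score is negative because an area of 0.26 is less than 0.5, so the value lies below the mean.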