Key Notes on Probability Distributions and the Normal Distribution

Probability Distributions

Normal Distribution

  • Commonly used statistical model for continuous variables.
  • Resembles a bell-shaped curve, symmetric about the mean.
  • Many biological measurements tend to follow a normal distribution.

Learning Outcomes

  • Knowledge
    • Understand that probabilities can be derived from histograms, but these are limited to sample data.
    • Know the concepts of interpolation (estimating within sample range) and extrapolation (estimating outside sample range).
    • Recognize the normal distribution as a mathematical model of a histogram, facilitating interpolation.
    • Be familiar with other probability distributions (e.g., t, F, binomial, Poisson).
  • Understanding
    • Acknowledge the dangers of extrapolation beyond data limits.
    • Understand how the Central Limit Theorem (CLT) helps in approximating probabilities.
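    • Grasp why point probabilities (e.g., P(Y = a certain value)) are meaningless when Y follows a normal distribution: for a continuous variable, the probability of any single exact value is zero, so only ranges of values have non-zero probability.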
    • Differentiate between general normal distributions and the Standard Normal Distribution (SND).
  • Practical Skills
    • Calculate P(Y ≤ a specified value) or P(Y between two values) using properties of the normal distribution and Excel.

Characteristics of Probability Distributions

  • Normalization
    • A histogram visualizes a frequency distribution, whose shape often approximates a normal distribution.
    • Fitting a model to the data can improve the understanding of population parameters, but extrapolating beyond the observed range requires caution.

Properties of the Normal Distribution

  • Described by two parameters:
    • \mu (mean) - determines the center of the distribution.
    • \sigma (standard deviation) - determines the spread (width) of the distribution.
  • Useful empirical rules for normal distributions:
    • Approximately 68.27% of values lie within \mu \pm 1\sigma.
    • Approximately 95.45% lie within \mu \pm 2\sigma.
    • Approximately 99.73% lie within \mu \pm 3\sigma.
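
The three percentages above can be reproduced from the normal curve itself. The following is a minimal sketch using Python's standard library; the helper name prob_within is an illustrative choice, not something from these notes. It relies on the identity P(|Y - \mu| \le k\sigma) = erf(k / \sqrt{2}), which holds for any normal distribution.

```python
import math

def prob_within(k: float) -> float:
    """P(mu - k*sigma <= Y <= mu + k*sigma) for any normal distribution.

    The result does not depend on mu or sigma, only on k.
    """
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within mu +/- {k} sigma: {prob_within(k):.2%}")
```

Because the result depends only on k, the same rule applies to heights, weights, or any other normally distributed measurement.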

Central Limit Theorem (CLT)

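  • As the sample size increases, the distribution of sample means approaches a normal distribution, largely regardless of the shape of the original distribution (provided the observations are independent and the population variance is finite).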
  • Very useful for making statistical inferences and approximations.
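
A small simulation can make the CLT concrete. The sketch below is illustrative only: it assumes a strongly right-skewed Exponential(1) population (an arbitrary choice, not from these notes) and arbitrary sample sizes of 1, 5, and 50. As n grows, the skewness of the sample means should shrink toward 0 and their standard deviation should approach \sigma / \sqrt{n}.

```python
import math
import random
import statistics

def sample_means(n, reps, rng):
    """Return `reps` sample means, each from n draws of a skewed Exponential(1) population."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

def skewness(values):
    """Population skewness: 0 for a perfectly symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

rng = random.Random(42)  # fixed seed so the run is repeatable
results = {}
for n in (1, 5, 50):  # assumed sample sizes, chosen for illustration
    means = sample_means(n, reps=5000, rng=rng)
    results[n] = (statistics.pstdev(means), skewness(means))
    print(f"n={n:>2}  sd of means={results[n][0]:.3f}  "
          f"(theory {1 / math.sqrt(n):.3f})  skewness={results[n][1]:.2f}")
```

Plotting a histogram of the means for each n would show the same trend visually: the curve becomes narrower and more bell-shaped as n increases.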

Standard Normal Distribution (SND)

  • A specific normal distribution with:
    • Mean (\mu) = 0
    • Standard Deviation (\sigma) = 1
  • Allows for the comparison of scores from different normal distributions using Z-scores.
  • Z-score calculation:
    • Z = \frac{Y - \mu}{\sigma}
  • The SND is central to determining the proportion of data that falls in a given area under the curve.
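
The Z-score formula above can be checked numerically. This sketch uses Python's statistics.NormalDist; the values \mu = 170, \sigma = 10, and Y = 185 are hypothetical, picked only to give a round Z-score. It shows that a probability computed on the original scale equals the probability computed on the SND after standardizing.

```python
from statistics import NormalDist

def z_score(y: float, mu: float, sigma: float) -> float:
    """Standardize y: how many standard deviations it lies from the mean."""
    return (y - mu) / sigma

# Hypothetical values for illustration (not taken from the notes).
mu, sigma = 170.0, 10.0
y = 185.0

z = z_score(y, mu, sigma)                 # (185 - 170) / 10 = 1.5
p_general = NormalDist(mu, sigma).cdf(y)  # P(Y <= 185) on the original scale
p_standard = NormalDist().cdf(z)          # P(Z <= 1.5) on the SND (mean 0, sd 1)

print(f"z = {z}")
print(f"P(Y <= {y}) = {p_general:.4f}")
print(f"P(Z <= {z}) = {p_standard:.4f}")
```

The two probabilities agree, which is why a single table (or function) for the SND is enough for every normal distribution.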

Application in Excel

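  • To calculate probabilities, use Excel's formula:
    • "=NORMDIST(Y, \mu, \sigma, TRUE)" (newer versions of Excel also provide the equivalent "=NORM.DIST(Y, \mu, \sigma, TRUE)")
    • Where 'Y' is the value of interest and 'TRUE' requests the cumulative probability, P(Y ≤ value); 'FALSE' would instead return the height of the density curve at Y.
    • For P(Y between two values), subtract the cumulative probability at the lower value from the cumulative probability at the upper value.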
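
For readers without Excel, the same cumulative probability can be sketched in Python. The function names below (normdist_cumulative, prob_between) and the parameters \mu = 100, \sigma = 15 are hypothetical, chosen for illustration; normdist_cumulative is intended to mirror the Excel formula above with its last argument set to TRUE.

```python
from statistics import NormalDist

def normdist_cumulative(y: float, mu: float, sigma: float) -> float:
    """P(Y <= y), intended to mirror =NORMDIST(y, mu, sigma, TRUE)."""
    return NormalDist(mu, sigma).cdf(y)

def prob_between(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a <= Y <= b): cumulative probability at b minus cumulative probability at a."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(b) - dist.cdf(a)

# Hypothetical parameters for illustration (not taken from the notes).
mu, sigma = 100.0, 15.0

p_below = normdist_cumulative(115.0, mu, sigma)  # one sd above the mean
p_range = prob_between(85.0, 115.0, mu, sigma)   # within mu +/- 1 sigma
p_point = prob_between(115.0, 115.0, mu, sigma)  # a single exact value

print(f"P(Y <= 115)       = {p_below:.4f}")
print(f"P(85 <= Y <= 115) = {p_range:.4f}")
print(f"P(Y = 115)        = {p_point}")
```

The last line illustrates why asking for P(Y = a certain value) is not useful for a continuous variable: an interval of zero width has zero area under the curve.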

Practical Implications and Examples

  1. Height Distribution Example
    • Analyzed a height dataset of students to evaluate normality.
    • Found mean height, standard deviation, and checked frequencies against empirical rules.
  2. Comparison with Other Distributions
    • Compared the normal distribution with real-world data such as birth weights, supporting its applicability to a range of biological measurements.
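
The kind of check described in the height example can be sketched as follows. The original student dataset is not reproduced in these notes, so the code below uses SYNTHETIC heights simulated from a normal distribution with an assumed mean of 170 cm and standard deviation of 10 cm; with real data, only the heights list would change.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# SYNTHETIC data for illustration only: 2000 simulated heights in cm.
# The mean (170) and sd (10) are assumptions, not values from the notes.
heights = [rng.gauss(170.0, 10.0) for _ in range(2000)]

mean = statistics.fmean(heights)
sd = statistics.stdev(heights)  # sample standard deviation

def proportion_within(values, center, spread, k):
    """Fraction of values lying within center +/- k * spread."""
    lo, hi = center - k * spread, center + k * spread
    return sum(lo <= v <= hi for v in values) / len(values)

expected = {1: 0.6827, 2: 0.9545, 3: 0.9973}  # empirical-rule proportions
observed = {k: proportion_within(heights, mean, sd, k) for k in expected}

print(f"mean = {mean:.1f} cm, sd = {sd:.1f} cm")
for k in expected:
    print(f"within {k} sd: observed {observed[k]:.3f}, expected {expected[k]:.3f}")
```

Observed proportions close to the expected ones are consistent with normality; large discrepancies would suggest the normal model is a poor fit for the data.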

Conclusion

  • Understanding the normal distribution and its properties is crucial for applications in biology and statistics.
  • Probabilistic models are foundational for interpreting and managing biological data effectively.

Exercises

  • Calculate the probability, or sketch the area of interest under the curve, for each case involving wing lengths or other measurements.