Normal & Binomial Distributions Study Notes

Introduction to Statistical Methods

  • Course: STAT 2331

  • Institution: SMU

Types of Random Variables

  • Random variables can be classified into two main types:

    • Discrete Random Variables: Probabilities are computed by adding and subtracting the probabilities of individual outcomes.

    • Continuous Random Variables: Probabilities are calculated by finding the area under a density curve.

Focus of the Module

  • This module primarily addresses two significant types of probability distributions:

    • Normal Distributions

    • Binomial Distributions

Objectives for Normal & Binomial Distributions

  • Determine if a distribution is approximately normal.

  • Apply the Empirical Rule to normally distributed data.

  • Transform normally distributed values to the standard normal distribution.

  • Calculate probabilities for normal random variables.

  • Perform inverse normal calculations to find percentiles.

  • Assess when a count can be modeled using the binomial distribution.

  • Calculate probabilities for a binomial random variable.

  • Use the normal approximation for the binomial distribution when suitable.

Normal Distributions

Characteristics and Properties
  • Normal distributions are bell-shaped, symmetric, and unimodal.

  • Notation for a normal distribution: N(\mu, \sigma) where

    • \mu = mean

    • \sigma = standard deviation

  • The area under the entire curve of a normal distribution equals 1.

Normal Q-Q Plots

  • A normal quantile-quantile (Q-Q) plot gives a visual check of whether data are approximately normally distributed.

    • Points that closely follow the reference line indicate approximate normality.

    • Systematic deviations from the line suggest the distribution is not normal.
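
  • The points behind a normal Q-Q plot can be sketched as follows. The sample values are hypothetical, and the plotting position (i - 0.5)/n is one common convention assumed here; software packages may use a slightly different one.

```python
from statistics import NormalDist

# Hypothetical sample of 10 exam scores (illustrative values, not course data).
data = [72, 85, 78, 80, 91, 76, 83, 79, 88, 81]

def qq_points(values):
    """Pair each sorted observation with a standard normal quantile.

    Assumes the plotting position (i - 0.5) / n; other software may use
    a slightly different convention.
    """
    n = len(values)
    ordered = sorted(values)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

# Each row is one point of the Q-Q plot: (theoretical quantile, observed value).
for z, x in qq_points(data):
    print(f"{z:6.2f}  {x}")
```

  • If the printed pairs fall roughly on a straight line when plotted, the sample is consistent with a normal distribution.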

The Empirical Rule (68-95-99.7 Rule)

  • For approximately normal distributions:

    • About 68% of observations fall within 1 standard deviation of the mean (\mu \pm \sigma).

    • About 95% of observations fall within 2 standard deviations of the mean (\mu \pm 2\sigma).

    • About 99.7% of observations fall within 3 standard deviations of the mean (\mu \pm 3\sigma).
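
  • The three percentages can be checked numerically. This is a minimal sketch using Python's standard-library NormalDist; the helper name area_within is an illustrative choice, not part of the course material.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```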

Examples for Normal Distributions

Example 1
  • Exam scores are normally distributed with:

    • Mean \mu = 80

    • Standard deviation \sigma = 5

  • Apply the Empirical Rule to find the following intervals:

    • 68% of test scores are between: [80 - 5, 80 + 5] = [75, 85]

    • 95% of test scores are between: [80 - 2(5), 80 + 2(5)] = [70, 90]

    • 99.7% of test scores are between: [80 - 3(5), 80 + 3(5)] = [65, 95]

  • What proportion of scores is higher than 75? (Use Z-scores and normal distribution properties.)

  • What proportion of scores is between 70 and 85?
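
  • One way to answer the two questions above is sketched below with Python's standard-library NormalDist. The Empirical Rule alone gives rough answers of about 84% and 81.5%; the code gives the more precise normal probabilities.

```python
from statistics import NormalDist

# Exam scores from Example 1: mean 80, standard deviation 5.
scores = NormalDist(mu=80, sigma=5)

# P(X > 75): complement of the area to the left of 75.
p_above_75 = 1 - scores.cdf(75)

# P(70 < X < 85): area to the left of 85 minus area to the left of 70.
p_70_to_85 = scores.cdf(85) - scores.cdf(70)

print(round(p_above_75, 4))  # 0.8413
print(round(p_70_to_85, 4))  # 0.8186
```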

Example 2
  • Heights for adult females (normal distribution):

    • \mu = 65 inches, \sigma = 3.5 inches

  • Heights for adult males (normal distribution):

    • \mu = 70 inches, \sigma = 4.0 inches

  • For a randomly selected female and male:

    • Graph both distributions.

    • Find what female height is one standard deviation above the mean: \mu + \sigma = 65 + 3.5 = 68.5 inches.

    • Find what male height is three standard deviations below the mean: \mu - 3\sigma = 70 - 3(4) = 58 inches.

    • Calculate the number of standard deviations from the mean for a female height of 72 inches by solving for z in:
      \mu_X + z\sigma_X = 72.

    • Similarly, calculate it for a male height of 62 inches.
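
  • Solving \mu_X + z\sigma_X = x for z gives z = (x - \mu_X)/\sigma_X. The short sketch below applies this to both heights; the function name z_score is an illustrative choice.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

z_female = z_score(72, mu=65, sigma=3.5)  # female height of 72 inches
z_male = z_score(62, mu=70, sigma=4.0)    # male height of 62 inches

print(z_female, z_male)  # 2.0 -2.0
```

  • The female height is 2 standard deviations above her group's mean, and the male height is 2 standard deviations below his.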

Z-scores

  • The formula for calculating a Z-score is:
    z = \frac{x - \mu_X}{\sigma_X}

  • Where:

    • X is any random variable

    • x is a specific value of X

    • z indicates how many standard deviations x is from the mean \mu_X

  • Z-scores allow us to standardize observations.

Standard Normal Distribution

  • If a variable is normally distributed:
    X \sim N(\mu, \sigma)

  • Z-scores are also normally distributed with mean \mu_Z = 0 and standard deviation \sigma_Z = 1:
    Z \sim N(0, 1)

  • Probabilities for X are equivalent to those for Z: P(X < x) = P(Z < z), where z is the Z-score of x.

    • Both sides represent the area to the left of x, or equivalently to the left of its Z-score.

Examples of Calculating Probabilities

Example 4
  • The distribution of female heights is X \sim N(65, 3.5).

  • Probability that a randomly selected adult female is shorter than 68 inches:

    • State in mathematical notation and calculate the Z-score.

    • Draw a graph aligning both distribution curves.

    • Use the Z-score to rewrite the probability:
      P(X < 68) = P(Z < z), where z = (68 - 65)/3.5 \approx 0.857.
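
  • The calculation in Example 4 can be sketched in Python as follows, confirming that the probability is the same whether it is computed from X directly or from the standardized Z-score.

```python
from statistics import NormalDist

x, mu, sigma = 68, 65, 3.5

# Standardize: how many standard deviations is 68 above the mean?
z = (x - mu) / sigma

# The same area, computed on the original scale and on the Z scale.
p_from_x = NormalDist(mu, sigma).cdf(x)
p_from_z = NormalDist().cdf(z)

print(round(z, 3))         # 0.857
print(round(p_from_x, 3))  # approximately 0.804
print(round(p_from_z, 3))  # same area on the standard normal scale
```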

Calculating Normal Probabilities Using Software and Calculators

  • In Excel, use:

    • For cumulative probabilities P(X \leq x): =NORM.DIST(x,\mu,\sigma,TRUE)

    • For the height of the density curve at x (not a probability): =NORM.DIST(x,\mu,\sigma,FALSE)

  • On a TI calculator, use:
    normalcdf(lower,upper,\mu,\sigma)

  • Example calculation with a dragonfly wingspan:

Example 5.1
  • Distribution: X \sim N(4, 0.25).

  • Find the probability that a wingspan is less than, or greater than, a specific length.
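
  • A minimal Python sketch of this kind of calculation follows. The cutoffs 3.75 and 4.5 are hypothetical, since the notes do not list the specific wingspan lengths. The last line shows that the density value can exceed 1, so it is a curve height rather than a probability.

```python
from statistics import NormalDist

# Dragonfly wingspan from Example 5.1: X ~ N(4, 0.25).
wingspan = NormalDist(mu=4, sigma=0.25)

# Hypothetical cutoffs chosen for illustration.
p_less = wingspan.cdf(3.75)        # like =NORM.DIST(3.75, 4, 0.25, TRUE)
p_greater = 1 - wingspan.cdf(4.5)  # complement of the area to the left of 4.5
density = wingspan.pdf(4)          # like =NORM.DIST(4, 4, 0.25, FALSE)

print(round(p_less, 4))     # 0.1587
print(round(p_greater, 4))  # 0.0228
print(round(density, 4))    # 1.5958
```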

Inverse Normal Calculations

  • Used when a probability (the area to the left of a value) is given and the corresponding value x is sought.

  • In Excel, use:
    =NORM.INV(probability, \mu, \sigma)

  • On a TI calculator, use:
    invNorm(probability, \mu, \sigma)

  • Example related to dragonfly wingspan percentiles.
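
  • An inverse normal calculation can be sketched in Python with NormalDist.inv_cdf. The 90th percentile is a hypothetical choice here, since the notes do not say which percentiles the example uses.

```python
from statistics import NormalDist

# Dragonfly wingspan: X ~ N(4, 0.25).
wingspan = NormalDist(mu=4, sigma=0.25)

# Hypothetical question: which wingspan has 90% of the area to its left?
p90 = wingspan.inv_cdf(0.90)  # like =NORM.INV(0.90, 4, 0.25)

print(round(p90, 4))                # 4.3204
print(round(wingspan.cdf(p90), 4))  # 0.9, the area we started from
```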

Important Points for Normal Distributions

  • Notation: X \sim N(\mu, \sigma)

  • Use Q-Q plots to assess normality.

  • Empirical Rule: about 68%, 95%, and 99.7% of observations fall within 1, 2, and 3 standard deviations of the mean.

  • Z-scores transform X-values to Z-values, standardizing the distribution.

Transitioning from Quantitative to Categorical Data

  • The previous examples used quantitative data.

  • The focus now shifts to categorical data, where counts of successes are modeled with binomial distributions.

Binomial Distributions

  • A binomial setting must satisfy four conditions:

    • B – Binary: Each trial results in a success or failure.

    • I – Independent: The outcome of each trial is not affected by the others.

    • N – Number: A fixed number of trials, denoted as n.

    • S – Success: A constant probability of success, denoted as p.

Binomial Random Variable
  • Notation for the binomial distribution: X \sim B(n,p)

  • The random variable X counts successes in n trials.

Binomial Probability Formula
  • The formula is: P(X = k) = \frac{n!}{k!(n-k)!} p^k (1-p)^{n-k}

    • Where:

    • X = number of successes in n trials

    • k = 0,1,…,n

    • p = probability of success

    • 1 - p = probability of failure

    • n! = n(n-1)(n-2)…(3)(2)(1)
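
  • The formula translates directly into code. This sketch uses math.comb for n!/(k!(n-k)!); the values n = 5 and p = 0.5 are hypothetical.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p), computed directly from the formula."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical setting: n = 5 trials with success probability p = 0.5.
n, p = 5, 0.5

print(binomial_pmf(2, n, p))  # 0.3125
# The probabilities over k = 0, 1, ..., n add up to 1.
print(sum(binomial_pmf(k, n, p) for k in range(n + 1)))  # 1.0
```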

Application Examples for Binomial Distributions

Example 7
  • Use a weighted-coin scenario to assess whether the binomial setting applies and to find the probabilities of its outcomes.

Example 8
  • Apply a binomial distribution to sampling from a small population, taking the success rate into account.

Software for Calculating Binomial Probabilities

  • Binomial probabilities can be calculated with a calculator or software:

  • On a TI-83 calculator, use:
    binompdf(n,p,x) for P(X = x),
    binomcdf(n,p,x) for the cumulative probability P(X \leq x).

  • In Excel, use:
    =BINOM.DIST(x,n,p,cumulative), with cumulative set to FALSE for P(X = x) or TRUE for P(X \leq x).
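
  • For comparison, the same two quantities can be sketched in Python. The function names binom_pdf and binom_cdf are illustrative and simply mirror the calculator's argument order; the values n = 10, p = 0.3, x = 3 are hypothetical.

```python
from math import comb

def binom_pdf(n, p, x):
    """P(X = x); argument order mirrors the calculator's binompdf(n,p,x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(n, p, x):
    """P(X <= x), the sum of P(X = k) for k = 0, 1, ..., x."""
    return sum(binom_pdf(n, p, k) for k in range(x + 1))

# Hypothetical values: n = 10 trials, p = 0.3, x = 3 successes.
print(round(binom_pdf(10, 0.3, 3), 4))  # 0.2668
print(round(binom_cdf(10, 0.3, 3), 4))  # 0.6496
```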

Mean, Standard Deviation, and Normal Approximation for Binomial Distributions

  • For a binomially distributed variable:
    \mu = np and \sigma = \sqrt{np(1-p)}

  • When the number of trials n is large enough that np \geq 10 and n(1-p) \geq 10, the binomial distribution can be approximated by the normal distribution N(np, \sqrt{np(1-p)}).
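
  • The sketch below compares an exact binomial probability with its normal approximation. The values n = 100, p = 0.3, and the cutoff 35 are hypothetical, and the approximation is applied in its plain form, without a continuity correction.

```python
from math import comb, sqrt
from statistics import NormalDist

# Hypothetical binomial setting: n = 100 trials, p = 0.3.
n, p = 100, 0.3

mu = n * p                      # mean of the binomial
sigma = sqrt(n * p * (1 - p))   # standard deviation of the binomial

# Check the two conditions for using the normal approximation.
conditions_met = n * p >= 10 and n * (1 - p) >= 10

# Exact P(X <= 35) from the binomial formula.
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(36))

# Normal approximation of the same probability (no continuity correction).
approx = NormalDist(mu, sigma).cdf(35)

print(mu, round(sigma, 4), conditions_met)
print(round(exact, 4), round(approx, 4))
```

  • With both conditions met, the two printed probabilities should be close, though not identical.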

Final Example and Calculation

Example 10
  • Assess the probability of defective items in a shipment using binomial distribution principles and normal approximation.

Important Points for Binomial Distributions

  • Notational convention: X \sim B(n,p) for binomial distributions.

  • The variable X signifies the count of successes over n trials.

  • Remember the four conditions embodied by the acronym BINS: Binary, Independent, Number, Success.

  • Calculate binomial probabilities by hand with the binomial formula, or with a calculator or software.