Ch. 5.4-5.5 Notes: Continuous Random Variables, Normal Distribution, Binomial & Poisson Distributions

LO Summary

  • LO 5.1 Describe probability concepts and the rules of probability.
  • LO 5.2 Apply the total probability rule and Bayes’ theorem.
  • LO 5.3 Describe a discrete random variable and its probability distribution.
  • LO 5.4 Calculate probabilities for binomial and Poisson distributions.
  • LO 5.5 Describe the normal distribution and calculate its associated probabilities.

5.5 Continuous Random Variables and the Normal Distribution

  • A random variable is a function that assigns numerical values to the outcomes of an experiment.
  • Discrete vs. continuous:
    • Discrete random variable: takes a countable set of distinct values.
    • Continuous random variable: takes an uncountable set of values within an interval.
  • Continuous random variable and probability density function (PDF) f_X:
    • The area under f_X over all values of X equals 1: $\int_{-\infty}^{+\infty} f_X(x)\,dx = 1$.
    • The probability that X lies in an interval is the area under f_X between the endpoints: $P(a \le X \le b) = \int_{a}^{b} f_X(x)\,dx$.
  • For continuous variables, P(X = x) = 0 for any exact value x, because there is zero area at a point.
    • Only intervals carry positive probability, and including or excluding an endpoint changes nothing: $P(a \le X \le b) = P(a < X < b) = P(a < X \le b) = P(a \le X < b)$.
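
The area interpretation above can be checked numerically. The sketch below is only an illustration: the density f(x) = 2x on [0, 1] is a hypothetical choice, not one from these notes, and the helper names `density` and `area` are made up for the example. It approximates areas with a midpoint sum.

```python
def density(x: float) -> float:
    """Hypothetical PDF for illustration: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def area(f, a: float, b: float, steps: int = 10_000) -> float:
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    if a == b:
        return 0.0  # a single point has zero width, hence zero area
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


total = area(density, 0.0, 1.0)        # total area under the PDF
interval = area(density, 0.25, 0.75)   # P(0.25 <= X <= 0.75)
point = area(density, 0.5, 0.5)        # P(X = 0.5)

print(round(total, 6), round(interval, 6), point)
```

For this density the total area comes out as 1, the interval probability as 0.75² − 0.25² = 0.5, and the probability at a single point as 0.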

Uniform Distribution

  • Uniform distribution is the simplest continuous distribution.
  • PDF is constant between two limits a and b:
    • For a ≤ x ≤ b, $f(x) = \frac{1}{b-a}$; outside [a, b], f(x) = 0.
  • Area under f(x) over [a, b] equals 1 because width is (b−a) and height is 1/(b−a).
  • Mean and standard deviation:
    • $\mu = E[X] = \frac{a+b}{2}$.
    • $\sigma_X = \sqrt{\operatorname{Var}(X)} = \frac{b-a}{\sqrt{12}}$.
  • Example (uniform on [120, 180]):
    • Mean: $\mu = \frac{120+180}{2} = 150$.
    • Standard deviation: $\sigma = \frac{180-120}{\sqrt{12}} = \frac{60}{\sqrt{12}} \approx 17.32$.
    • Probability example: $P(140 < X < 150) = \frac{150-140}{180-120} = \frac{10}{60} = \frac{1}{6} \approx 0.1667$.
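
The uniform example can be reproduced in a few lines. This is a minimal sketch; the function names `uniform_mean`, `uniform_sd`, and `uniform_prob` are hypothetical helpers written for these notes, not library calls.

```python
import math


def uniform_mean(a: float, b: float) -> float:
    """Mean of a uniform distribution on [a, b]."""
    return (a + b) / 2


def uniform_sd(a: float, b: float) -> float:
    """Standard deviation of a uniform distribution on [a, b]."""
    return (b - a) / math.sqrt(12)


def uniform_prob(lo: float, hi: float, a: float, b: float) -> float:
    """P(lo <= X <= hi) for X uniform on [a, b]; the interval is clipped to [a, b]."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0.0) / (b - a)


a, b = 120.0, 180.0
mean = uniform_mean(a, b)            # 150.0
sd = uniform_sd(a, b)                # about 17.32
p = uniform_prob(140, 150, a, b)     # 10/60, about 0.1667

print(mean, round(sd, 2), round(p, 4))
```

Clipping the interval to [a, b] reflects that the density is 0 outside the limits, so any part of the interval outside them adds no area.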

The Normal Distribution (N(µ, σ²))

  • The normal distribution is a widely used continuous distribution:
    • Bell-shaped, symmetric around its mean.
    • Asymptotic: the tails approach the horizontal axis as x moves away from the mean in either direction, but never touch it.
    • Closely approximates many natural processes; cornerstone of statistical inference and sampling theory.
  • Cumulative distribution function (CDF): P(X ≤ x) is the area under the normal curve to the left of x, and all normal probabilities are computed from it:
    • For a normal X ~ N(µ, σ²): $P(X \le x) = \Phi\left(\frac{x-\mu}{\sigma}\right)$, where Φ is the standard normal CDF.
  • Standard normal and R equivalents:
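    • Standardize: z = (x − µ)/σ.
    • R example: P(X ≤ x) = pnorm(q = x, mean = µ, sd = σ), or equivalently pnorm(q = z) with the default mean = 0 and sd = 1. Pass the raw value x when supplying µ and σ; pass z only to the standard normal.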
  • Example: Normal scores on a management aptitude exam with µ = 72, σ = 8.
    • a) Probability X > 60:
    • Z = (60 − 72)/8 = −1.5; P(X > 60) = 1 − Φ(−1.5) = Φ(1.5) ≈ 0.9332.
    • In R: P(X > 60) = 1 − pnorm(q = 60, mean = 72, sd = 8) ≈ 0.9332.
    • b) Probability 68 ≤ X ≤ 84:
    • z for 84: (84−72)/8 = 1.5; z for 68: (68−72)/8 = -0.5.
    • Probability: Φ(1.5) − Φ(-0.5) ≈ 0.9332 − 0.3085 = 0.6247.
  • Inverse/percentiles (quantiles): Given probability p, find x such that P(X ≤ x) = p.
    • If X ~ N(µ, σ²), the p-th percentile is $x_p = \mu + \sigma z_p$, where z_p is the p-th quantile of the standard normal.
    • Example with µ = 72, σ = 8:
    • 90th percentile (p = 0.90): z_{0.90} ≈ 1.2816 → $x_{0.90} = 72 + 8(1.2816) \approx 82.25$.
    • 25th percentile (p = 0.25): z_{0.25} ≈ −0.6745 → $x_{0.25} = 72 + 8(-0.6745) \approx 66.60$.
  • Empirical rule (68-95-99.7):
    • 68% within ±1σ, 95% within ±2σ, 99.7% within ±3σ.
    • In z-scale: ±1 → 68%, ±2 → 95%, ±3 → 99.7%.
  • Quick percentile table (z, percentile examples):
    • z = ±1: central 68% (about 34% on each side of the center).
    • z = ±2: central 95% (about 13.5% between 1 and 2 on each side).
    • z = ±3: central 99.7%.
  • Important: The two directions are inverses. Given a probability p, solve for the value with x_p = µ + σ z_p; given a value x, compute the probability with p = Φ((x − µ)/σ).
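
The exam example, the two percentiles, and the empirical rule can be checked with Python's standard-library `statistics.NormalDist` (Python 3.8+). This is a sketch rather than the method used in the course: `cdf` plays the role of R's `pnorm`, and `inv_cdf` plays the role of the quantile function (`qnorm` in R). The variable names are illustrative.

```python
from statistics import NormalDist

exam = NormalDist(mu=72, sigma=8)   # X ~ N(72, 8^2)

# a) P(X > 60) = 1 - P(X <= 60)
p_above_60 = 1 - exam.cdf(60)

# b) P(68 <= X <= 84) = P(X <= 84) - P(X <= 68)
p_68_to_84 = exam.cdf(84) - exam.cdf(68)

# Percentiles: x_p = mu + sigma * z_p
x_90 = exam.inv_cdf(0.90)
x_25 = exam.inv_cdf(0.25)

# Empirical rule: central area within k standard deviations, k = 1, 2, 3
std = NormalDist()                  # standard normal, mean 0 and sd 1
within = {k: std.cdf(k) - std.cdf(-k) for k in (1, 2, 3)}

print(round(p_above_60, 4), round(p_68_to_84, 4))
print(round(x_90, 2), round(x_25, 2))
print({k: round(v, 4) for k, v in within.items()})
```

The first two lines of output should agree with the worked values above (0.9332, 0.6247, 82.25, 66.60). The last line shows that 68-95-99.7 is a rounding of roughly 0.6827, 0.9545, and 0.9973.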

5.4 The Binomial and Poisson Distributions

  • Bernoulli process: a sequence of n independent, identical trials with two outcomes (success with probability p, failure with probability 1−p).
    • A binomial random variable X is the number of successes in n trials: X ∈ {0,1,2,…,n}.
  • Binomial distribution (X ~ Binomial(n, p)):
    • PMF: $P(X=x) = \binom{n}{x} p^{x} (1-p)^{n-x}, \quad x = 0,1,2,\ldots,n$.
    • Mean: $E[X] = \mu = np$.
    • Variance: $\operatorname{Var}(X) = \sigma^2 = np(1-p)$.
  • Poisson process and Poisson distribution (X ~ Poisson(µ)):
    • Poisson PMF: $P(X=x) = \frac{e^{-\mu} \mu^{x}}{x!}, \quad x = 0,1,2,\ldots$
    • Mean: $E[X] = \mu$.
    • Variance: $\operatorname{Var}(X) = \sigma^2 = \mu$.
  • When to use each:
    • Binomial: fixed number of independent trials n with constant p.
    • Poisson: number of events in a fixed interval when events occur with a constant mean rate and independently of each other (Poisson process).
  • Excel/R helpers:
    • Binomial: BINOM.DIST, dbinom, pbinom.
    • Poisson: POISSON.DIST, dpois, ppois.
  • Examples
    • Binomial example 1: 30% of customers react positively to new web features; n = 5.
    • a) P(X = 0): none react positively = $(1-p)^n = 0.70^5 \approx 0.1681$.
    • b) E[X] = np = 5 × 0.30 = 1.5.
    • Binomial example 2: 100 adults, p = 0.68; X = number of Facebook users.
    • a) P(X = 70) ≈ 0.0791 (via BINOM.DIST(70, 100, 0.68, FALSE)). Also dbinom(70, 100, 0.68).
    • b) P(X ≤ 70) ≈ 0.7007 (via BINOM.DIST(70, 100, 0.68, TRUE)). Also pbinom(70, 100, 0.68).
  • Poisson process probabilities and common expressions:
    • Example: 18 visits per 30 days on average for a Starbucks customer.
    • a) Over a 5-day period, mean visits μ_5 = 3 (since 18 per 30 days ⇒ 18 × (5/30) = 3).
    • b) P(X = 5) = e^{−3} 3^5 / 5! ≈ 0.1008.
    • General idea: the probability of a count in a fixed interval uses µ, the mean number of events in that interval (often written λ).
    • To get the mean for a different interval, scale proportionally: $\mu_{\text{new}} = \mu_{\text{original}} \times \frac{\text{new interval length}}{\text{original interval length}}$.
  • Poisson examples with weekly timeframe:
    • Example: Craft breweries open at an average rate of 1.5 per day; over a week (7 days), mean μ = 1.5 × 7 = 10.5.
    • a) P(X ≤ 10) ≈ 0.5207 (Poisson with µ = 10.5).
    • b) P(X = 10) ≈ 0.1236.
  • Quick references:
    • Excel: POISSON.DIST for Poisson probabilities.
    • R: dpois for PMF, ppois for CDF; dbinom and pbinom for binomial probabilities.
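
The binomial and Poisson examples above can be reproduced directly from the PMF formulas. The sketch below uses only Python's `math` module. The four helper functions are hypothetical stand-ins written for these notes, loosely mirroring `dbinom`, `pbinom`, `dpois`, and `ppois`; they are not library calls.

```python
import math


def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for X ~ Binomial(n, p), summing the PMF from 0 to x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))


def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu**x / math.factorial(x)


def poisson_cdf(x: int, mu: float) -> float:
    """P(X <= x) for X ~ Poisson(mu), summing the PMF from 0 to x."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))


# Binomial example 1: n = 5, p = 0.30
p_none = binom_pmf(0, 5, 0.30)          # 0.70^5
expected = 5 * 0.30                     # E[X] = np

# Binomial example 2: n = 100, p = 0.68
p_eq_70 = binom_pmf(70, 100, 0.68)
p_le_70 = binom_cdf(70, 100, 0.68)

# Poisson: 18 visits per 30 days, rescaled to a 5-day interval
mu_5 = 18 * (5 / 30)
p_five_visits = poisson_pmf(5, mu_5)

# Poisson: 1.5 openings per day, rescaled to a 7-day week
mu_week = 1.5 * 7
p_le_10 = poisson_cdf(10, mu_week)
p_eq_10 = poisson_pmf(10, mu_week)

for value in (p_none, p_eq_70, p_le_70, p_five_visits, p_le_10, p_eq_10):
    print(round(value, 4))
```

Summing the PMF is enough for counts of this size. For very large n or µ, a statistics library would be the more robust choice.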

Connections, Implications, and Practical Notes

  • Choice of model depends on the nature of the data: continuous (normal, uniform) vs discrete (binomial, Poisson).
  • The normal distribution underpins many statistical methods (sampling distributions, confidence intervals, hypothesis tests) due to the central limit theorem and its mathematical properties.
  • For continuous variables, we rely on densities and CDFs; for discrete variables, we rely on PMFs and cumulative probabilities.
  • In practice, use standardization (z-scores) and inverse CDFs (quantiles) to transform problems into standard normal computations or to find threshold values corresponding to desired probabilities.
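  • When n is large and p is small (so that np stays moderate), a Poisson distribution with µ = np approximates the binomial well; if p is near 1, count failures instead of successes. Exact binomial calculations are straightforward with software, so the approximation is mainly a convenience.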
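
A rough numerical check of the Poisson approximation to the binomial. The values n = 1000 and p = 0.003 are hypothetical, chosen only so that n is large, p is small, and µ = np = 3; they do not come from the notes.

```python
import math

n, p = 1000, 0.003          # hypothetical: many trials, small success probability
mu = n * p                  # matching Poisson mean, mu = np

# Largest gap between the two PMFs over the counts x = 0, 1, ..., 10
max_gap = 0.0
for x in range(11):
    binom = math.comb(n, x) * p**x * (1 - p) ** (n - x)
    poisson = math.exp(-mu) * mu**x / math.factorial(x)
    max_gap = max(max_gap, abs(binom - poisson))
    print(x, round(binom, 4), round(poisson, 4))

print("largest gap:", max_gap)
```

Under these assumptions the two columns should agree closely. Rerunning with a larger p (say 0.3) and a smaller n should show the gap widening.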

Summary of Key Formulas (LaTeX)

  • Uniform distribution on [a, b]:
    • PDF: $f(x) = \begin{cases} \frac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$
    • Mean: $\mu = \frac{a+b}{2}$
    • Standard deviation: $\sigma = \frac{b-a}{\sqrt{12}}$
  • Normal distribution: $X \sim N(\mu, \sigma^2)$
    • CDF: $P(X \le x) = \Phi\left(\frac{x-\mu}{\sigma}\right)$
    • Percentiles: $x_p = \mu + \sigma z_p$, where $z_p = \Phi^{-1}(p)$
  • Binomial distribution: $X \sim \text{Binomial}(n, p)$, $P(X=x) = \binom{n}{x} p^{x} (1-p)^{n-x}$
    • Mean: $E[X] = np$
    • Variance: $\operatorname{Var}(X) = np(1-p)$
  • Poisson distribution: $X \sim \text{Poisson}(\mu)$, $P(X=x) = \frac{e^{-\mu} \mu^{x}}{x!}$
    • Mean/Variance: $E[X] = \mu, \quad \operatorname{Var}(X) = \mu$

Quick Practice Problems (optional)

  • Problem A (Uniform): Let X ~ U[10, 20]. Compute P(12 ≤ X ≤ 15).
  • Problem B (Normal): If X ~ N(100, 15^2), find the 95th percentile.
  • Problem C (Binomial): n = 8, p = 0.25. Find P(X = 3) and E[X].
  • Problem D (Poisson): A call center receives an average of 6 calls per hour. What is P(X ≤ 4) in an hour?