Week 3 (biostats) notes

  • The Normal Distribution (Gaussian)

    • Appropriate for continuous data; symmetric, bell-shaped
    • Parameters: mean μ (central value), standard deviation σ (spread)
    • Notation: X ~ N(μ, σ^2)
  • Population vs. Sample; Notation conventions

    • Greek letters for population parameters: μ (mean), σ (SD), π (proportion)
    • Latin letters for sample statistics:
    • sample mean: x̄
    • sample SD: s
    • sample proportion: p̂
    • Example mappings: population mean μ, sample mean x̄; population SD σ, sample SD s; population proportion π, sample proportion p̂
  • Summary Statistics (recap)

    • Central tendency: mean, median, mode
    • Dispersion: SD, IQR, range
  • Checking for Normality

    • Use histogram or boxplot to assess symmetry
    • Roughly symmetric data with mean ≈ median ≈ mode are consistent with normality (symmetry alone does not prove it)
    • Skew indicators:
    • Left-skew (negatively skewed): typically mean < median < mode (long tail to the left)
    • Right-skew (positively skewed): typically mode < median < mean (long tail to the right)
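The mean-versus-median comparison above can be sketched in a few lines of Python. This is only a rough screen under stated assumptions: the function name `skew_direction`, the `tolerance` cutoff, and the two small data sets are all made up for illustration and are not part of the lecture material.

```python
from statistics import mean, median

def skew_direction(values, tolerance=0.05):
    """Classify skew from the gap between mean and median.

    `tolerance` is a hypothetical cutoff: the gap is compared with
    tolerance * (max - min). This is a rough screen, not a formal test.
    """
    m, med = mean(values), median(values)
    spread = max(values) - min(values)
    if spread == 0 or abs(m - med) <= tolerance * spread:
        return "roughly symmetric"
    # Mean pulled above the median suggests a long right tail, and vice versa.
    return "right-skewed" if m > med else "left-skewed"

# Made-up illustrative values, not real measurements.
symmetric = [4, 5, 5, 6, 6, 6, 7, 7, 8]
right_tail = [1, 1, 2, 2, 3, 3, 4, 9, 20]

print(skew_direction(symmetric))
print(skew_direction(right_tail))
```

A histogram or boxplot should still be the first check; the numeric comparison only backs up what the plot shows.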
  • Reference Ranges for the Normal Distribution

    • About 68% within ±1 SD: [μ − σ, μ + σ]
    • About 95% within ±2 SD: [μ − 2σ, μ + 2σ] (the exact 95% multiplier is 1.96)
    • About 99.7% within ±3 SD: [μ − 3σ, μ + 3σ]
  • 68–95–99.7% Rule (summary)

    • 68% of observations lie between μ − σ and μ + σ
    • 95% lie between μ − 2σ and μ + 2σ
    • 99.7% lie between μ − 3σ and μ + 3σ
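The three percentages are rounded values of P(μ − kσ ≤ X ≤ μ + kσ) for k = 1, 2, 3. A minimal sketch, assuming only Python's standard `math` module, shows where they come from; the helper name `prob_within_k_sd` is chosen here for illustration.

```python
from math import erf, sqrt

def prob_within_k_sd(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for any normal X.

    The result does not depend on mu or sigma, only on k.
    """
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within ±{k} SD: {prob_within_k_sd(k):.4f}")
```

The values come out near 0.6827, 0.9545 and 0.9973, which the rule rounds to 68%, 95% and 99.7%.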
  • Standard Normal Distribution and Z-score

    • Standard normal: Z ~ N(0,1)
    • Z-score: $Z = \dfrac{X - \mu}{\sigma}$
    • Use when reference range is not suitable or when comparing across different scales
    • If μ and σ are unknown, estimate them with the sample mean x̄ and sample SD s; this is reasonable when the sample size is large
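To illustrate comparing values across different scales, here is a small sketch of the Z-score calculation. The function name `z_score` and the two measurements (with their means and SDs) are hypothetical numbers picked so the arithmetic is easy to follow.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (x - mean) / sd

# Hypothetical measurements on two unrelated scales.
z_a = z_score(130, mean=120, sd=10)   # 1 SD above its mean
z_b = z_score(6.5, mean=5.0, sd=0.5)  # 3 SDs above its mean

print(z_a, z_b)
```

Although 130 is the larger raw number, the second value is the more unusual one relative to its own distribution.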
  • Probability calculations with the normal distribution

    • For any X ~ N(μ, σ^2):
    • Probability between a and b: $P(a \le X \le b) = \Phi\left(\dfrac{b-\mu}{\sigma}\right) - \Phi\left(\dfrac{a-\mu}{\sigma}\right)$
    • Φ(z) is the CDF of the standard normal distribution
    • Using the Z-table: to find P(Z > z), compute 1 − Φ(z)
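In place of a printed Z-table, Φ can be evaluated numerically. The sketch below assumes the standard identity Φ(z) = ½[1 + erf(z/√2)]; the function names `phi`, `prob_between` and `prob_above` are illustrative choices, not notation from the lecture.

```python
from math import erf, sqrt

def phi(z):
    """CDF of the standard normal distribution (area to the left of z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def prob_between(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), via standardisation."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

def prob_above(x, mu, sigma):
    """P(X > x) for X ~ N(mu, sigma^2), i.e. 1 - Phi(z)."""
    return 1.0 - phi((x - mu) / sigma)

print(round(prob_between(-1.96, 1.96, mu=0, sigma=1), 3))
print(round(prob_above(0, mu=0, sigma=1), 3))
```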
  • Example: Pre-operative creatinine data (illustrative of distribution comparisons)

    • Normal-like vs skewed groups can be compared using mean/median and IQR
    • No dialysis (roughly symmetric): mean ≈ median
    • Yes dialysis (right-skewed): mean > median; report using median and IQR for skewed data
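For a skewed group, the median and IQR can be computed with Python's `statistics` module. The numbers below are made up for illustration and are not the creatinine data from the lecture. Note that `statistics.quantiles` uses its default "exclusive" method here, so other software may give slightly different quartiles.

```python
from statistics import median, quantiles

def median_and_iqr(values):
    """Return (median, Q1, Q3) using statistics.quantiles with n=4."""
    q1, _, q3 = quantiles(values, n=4)
    return median(values), q1, q3

# Made-up right-skewed values, NOT the lecture's creatinine data.
values = [0.8, 0.9, 1.0, 1.1, 1.3, 1.6, 2.4, 3.9, 6.2]

med, q1, q3 = median_and_iqr(values)
print(f"median {med}, IQR {q1:.2f} to {q3:.2f}")
```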
  • Worked example: Intracranial pressure (ICP) data

    • Given: ICP ~ N(20, 5^2), i.e. μ = 20 mmHg and σ = 5 mmHg
    • a) Median ICP? → 20 mmHg (Normally distributed: mean = median)
    • b) P(15 ≤ ICP ≤ 25)? → P(−1 ≤ Z ≤ 1) ≈ 0.68
    • c) 95% limits? → approximately [μ − 2σ, μ + 2σ] = [10, 30] mmHg
    • d) P(ICP > 27.5)?
    • Z for 27.5 = (27.5 − 20) / 5 = 1.5
    • P(Z > 1.5) ≈ 0.0668 → about 6.68%
    • Stepwise approach: diagram → compute Z → use standard normal table
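The ICP answers can be cross-checked numerically. This sketch assumes Python 3.8+ and uses `statistics.NormalDist` in place of the Z-table; the variable names are illustrative.

```python
from statistics import NormalDist

# ICP ~ N(mu = 20, sigma = 5), in mmHg, as given in the worked example.
icp = NormalDist(mu=20, sigma=5)

# a) Median equals the mean for a normal distribution.
median_icp = icp.median

# b) P(15 <= ICP <= 25): area between one SD below and one SD above the mean.
p_b = icp.cdf(25) - icp.cdf(15)

# c) Central 95% limits from the inverse CDF (uses 1.96 rather than the rounded 2).
lower, upper = icp.inv_cdf(0.025), icp.inv_cdf(0.975)

# d) P(ICP > 27.5): one minus the area to the left.
p_d = 1 - icp.cdf(27.5)

print(median_icp)
print(round(p_b, 4), round(p_d, 4))
print(round(lower, 1), round(upper, 1))
```

The exact 95% limits come out slightly narrower than [10, 30] because the ±2σ rule rounds the multiplier 1.96 up to 2.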
  • Practical notes on using the standard normal table

    • Always start with a diagram to visualize regions
    • Z-table often provides area to the left; convert to the area of interest accordingly
    • For Z = 1.5, P(Z > 1.5) ≈ 0.0668
    • If needed, compute probabilities by transforming to Z and using Φ or the table
  • Quick reference formulas

    • Normal: $X \sim \mathcal{N}(\mu, \sigma^2)$
    • Z-score: $Z = \dfrac{X - \mu}{\sigma}, \quad Z \sim \mathcal{N}(0,1)$
    • Interval probability: $P(a \le X \le b) = \Phi\left(\dfrac{b-\mu}{\sigma}\right) - \Phi\left(\dfrac{a-\mu}{\sigma}\right)$
    • 68–95–99.7% rule: within ±σ, ±2σ, ±3σ respectively
    • 95% limits: μ ± 2σ; 99.7% limits: μ ± 3σ
  • Tools and resources mentioned

    • GraphPad Prism: useful for creating graphs (histograms, box plots) and performing simple statistical tests
  • Takeaway for quick recall

    • Normal distribution is symmetric with μ and σ as core parameters
    • Use Z-scores to compare values across different scales
    • When data are skewed, report median and IQR; use mean/SD for roughly normal data
    • For ICP-style data, you can answer: median, probability within a range, and 95% limits using μ and σ