Module 5 Notes: Normal and Uniform Distributions
Detailed study guide based on the transcript
Overview of today’s module
- Finish module 5 content, then wrap up module 4.
- Preview quiz 2 (this weekend).
- Topics for today: Introduction example, normal distribution, uniform distribution.
- Tools: PowerPoint slides and Minitab demonstrations (no dataset needed today).
Key ideas carried over from previous lectures
- Z-scores (standardization) allow comparing data from different scales by placing them on a common scale.
- Normal distribution (capital N) and the standard normal distribution (Z) are central to many statistical calculations.
- Relationship between the original and standardized scales: if X ~ N(μ, σ^2), then the standardized score Z = (X - μ)/σ ~ N(0, 1).
- Normal distribution is symmetric; mean = median = mode; can be centered at any μ with any σ > 0.
- The z-score is the same concept discussed in Lecture 4; it underpins the use of the normal distribution for comparisons.
Normal distribution: core definitions
- Notation:
  - X ∼ 𝒩(μ, σ^2): X is normally distributed with mean μ and variance σ^2.
  - The special case Z ∼ 𝒩(0, 1) is the standard normal distribution.
- Properties:
  - Symmetric, bell-shaped, unimodal curve centered at μ.
  - μ is the population mean; σ is the population standard deviation.
  - Any normal variable can be converted to the standard normal with Z = (X - μ)/σ.
  - Transforming data to the z-scale lets us compare different datasets on a common footing.
Z-scores and interpretation
- Z-score formula: z = (x - μ)/σ, where x is the observed value, μ the mean, and σ the standard deviation.
- Sign of z indicates whether the value is above (positive) or below (negative) the mean; magnitude indicates distance from the mean in standard deviation units.
- Large |z| indicates a more extreme value relative to the mean.
- A z-score tells you how many standard deviations away a value is from the mean on the standard normal scale (centered at 0 with SD = 1).
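To make the comparison idea concrete, here is a minimal Python sketch of the z-score calculation. The two exam scores, means, and standard deviations are made-up illustration values (they do not come from the lecture); only the last line reuses the lecture's numbers.

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x lies above (+) or below (-) mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical scores on two different scales (illustration only).
z_exam_a = z_score(82, mu=70, sigma=8)      # (82 - 70) / 8
z_exam_b = z_score(620, mu=500, sigma=100)  # (620 - 500) / 100

# The lecture's value: x = 19.5 with mu = 18, sigma = 5.
z_lecture = z_score(19.5, mu=18, sigma=5)

print(z_exam_a, z_exam_b, z_lecture)
```

Under these assumed numbers, exam A gives z = 1.5 and exam B gives z = 1.2, so the exam A score is the more extreme one relative to its own group even though the raw scores are not directly comparable.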
Standard normal distribution vs. general normal distribution
- General normal: X ~ 𝒩(μ, σ^2) can take any center μ and spread σ.
- Standard normal: Z ~ 𝒩(0, 1) is the normalized form where μ = 0 and σ = 1.
- The standard normal CDF Φ(z) gives P(Z ≤ z).
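As a rough sketch of what Φ(z) computes, the snippet below evaluates the standard normal CDF two ways in Python: through the error function identity Φ(z) = ½(1 + erf(z/√2)) and through the standard library's `statistics.NormalDist`. The helper name `phi` is chosen here for illustration; the lecture itself used Minitab, not Python.

```python
import math
from statistics import NormalDist

def phi(z):
    """Standard normal CDF, P(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p_formula = phi(0.3)                            # closed-form identity
p_library = NormalDist(mu=0, sigma=1).cdf(0.3)  # standard-library check

print(round(p_formula, 4), round(p_library, 4))  # both are about 0.6179
```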
Normal distribution in practice (Minitab workflow described in the lecture)
- Setup for a normal distribution in Minitab:
  - Graph → Probability Distribution Plot → View Probability.
  - Enter the mean μ and standard deviation σ (e.g., μ = 18, σ = 5).
  - Choose what you are given: an x value (to get a probability such as P(X ≤ x)) or a probability (to find the matching x).
- Tail selections:
  - Left tail: P(X ≤ x)
  - Right tail: P(X ≥ x)
  - Middle and both-tails options exist for interval and two-sided questions.
- Example interpretation: if X ∼ 𝒩(18, 5^2) and you want P(X < 19.5), set x = 19.5 and use the left tail.
  - Converting the same question to the z-scale: z = (19.5 - 18)/5 = 0.3, so you evaluate P(Z < 0.3).
  - Verification: P(X < 19.5) = P(Z < 0.3) exactly, so a standard normal table or software gives a quick check.
- Excel vs. Minitab: the lecture presented Minitab as more straightforward for these plots and their labels; Excel also works but may require extra steps.
Four practical examples (Normal distribution, slide 6)
- Setup: X ∼ 𝒩(μ, σ^2) with μ = 18, σ = 5 (the demonstration values).
- Example 1: Find P(X < 19.5).
  - Method: in Minitab, enter x = 19.5 and select the left tail.
  - Result: ≈ 0.618 (quoted in the lecture as roughly 0.61); this equals Φ((19.5 - 18)/5) = Φ(0.3).
- Example 2: Find P(X > 19.5).
  - Method: use the right tail; the result is 1 − P(X ≤ 19.5) ≈ 0.382, the complement of Example 1.
- Example 3: Find P(18.2 ≤ X ≤ 19.5).
  - Method: use lower bound 18.2 and upper bound 19.5, or compute P(X ≤ 19.5) − P(X ≤ 18.2).
  - The lecture showed both routes: subtracting two left-tail values and shading the middle area directly.
- Example 4 (inverse question): Given P(X ≤ x) = 0.30, find x.
  - Minitab workflow: enter the probability 0.30 with the left tail and solve for the x that leaves 30% of the area to its left.
  - Result shown: x ≈ 15.38.
  - Variant: given P(X ≥ x) = 0.16, find x. This is equivalent to P(X ≤ x) = 0.84, which gives x ≈ 22.97.
- Conceptual takeaways from the examples:
  - You can work from x values to probabilities, or from probabilities to x values (the inverse problem).
  - The left-tail vs. right-tail choice follows the question's phrasing (e.g., P(X < x) vs. P(X > x)).
  - Standardizing does not change the probability: P(X ≤ 19.5) = P(Z ≤ 0.3), which you can confirm with a standard normal table or software.
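The four slide-6 examples can also be cross-checked outside Minitab. The sketch below uses Python's standard-library `statistics.NormalDist` as a stand-in for the Minitab dialog; the variable names are chosen here for illustration, and the lecture itself did not use Python.

```python
from statistics import NormalDist

# Lecture's demonstration distribution: mean 18, standard deviation 5.
X = NormalDist(mu=18, sigma=5)

# Example 1: left tail, P(X < 19.5)
p_left = X.cdf(19.5)

# Example 2: right tail, P(X > 19.5) = 1 - P(X <= 19.5)
p_right = 1 - X.cdf(19.5)

# Example 3: middle area, P(18.2 <= X <= 19.5)
p_middle = X.cdf(19.5) - X.cdf(18.2)

# Example 4: inverse problem, find x with P(X <= x) = 0.30
x_30 = X.inv_cdf(0.30)

# Variant: P(X >= x) = 0.16 is the same as P(X <= x) = 0.84
x_right_16 = X.inv_cdf(1 - 0.16)

for label, value in [("P(X < 19.5)", p_left), ("P(X > 19.5)", p_right),
                     ("P(18.2 <= X <= 19.5)", p_middle),
                     ("x for left area 0.30", x_30),
                     ("x for right area 0.16", x_right_16)]:
    print(f"{label}: {value:.4f}")
```

For a continuous distribution, `<` and `≤` give the same probability, which is why a single `cdf` call covers both phrasings.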
The empirical rule (68-95-99.7) and outliers
- Empirical rule (rough but very useful for approximately normal data):
  - About 68% of data lie within ±1σ of μ.
  - About 95% lie within ±2σ of μ.
  - About 99.7% lie within ±3σ of μ.
- Outlier rule via z-scores: data points with |z| > 3 are typically considered outliers.
- Practical verification in Minitab: with the same mean and SD, shade the middle area for μ ± σ, μ ± 2σ, and μ ± 3σ to see the approximate percentages (~68%, ~95%, ~99.7%).
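The same verification can be sketched in Python, assuming the lecture's demonstration values μ = 18 and σ = 5 (any other normal distribution gives the same three areas). The `is_outlier` helper is a hypothetical name used here to illustrate the |z| > 3 rule.

```python
from statistics import NormalDist

X = NormalDist(mu=18, sigma=5)

# Area within mu ± k*sigma for k = 1, 2, 3.
within = {}
for k in (1, 2, 3):
    lower = X.mean - k * X.stdev
    upper = X.mean + k * X.stdev
    within[k] = X.cdf(upper) - X.cdf(lower)
    print(f"within ±{k} SD ({lower} to {upper}): {within[k]:.4f}")

def is_outlier(x, dist, threshold=3.0):
    """Flag x when its z-score magnitude exceeds the threshold (|z| > 3 by default)."""
    z = (x - dist.mean) / dist.stdev
    return abs(z) > threshold

print(is_outlier(35, X))  # z = 3.4, flagged
print(is_outlier(25, X))  # z = 1.4, not flagged
```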
Uniform distribution (X ∼ U(a, b)) – key properties
- Shape: a rectangle (not bell-shaped); all values in [a, b] are equally likely.
- Probability density: f_X(x) = 1/(b − a) for x ∈ [a, b], and 0 otherwise.
- Lower and upper bounds are fundamental: a is the lower bound, b is the upper bound.
- Two main tasks:
  - Probability calculation (area under the rectangle): for a ≤ c ≤ d ≤ b, P(c ≤ X ≤ d) = (d − c) / (b − a).
  - Mean and standard deviation: E[X] = (a + b)/2 and σ = (b − a)/√12. The standard deviation formula is less intuitive than the mean, but it is a direct plug-in.
- Example from the lecture: X ∼ U(2, 6). Probability between 3 and 5 is (5 − 3) / (6 − 2) = 2/4 = 0.50.
- Minitab workflow for the uniform distribution:
  - Graph → Probability Distribution Plot → View Probability; select the uniform distribution from the list.
  - Enter lower bound a and upper bound b (e.g., a = 2, b = 6).
  - Shade the middle area between x = 3 and x = 5 to find P(3 ≤ X ≤ 5).
- Quick practice with a new interval: for X ∼ U(50, 120), P(90 ≤ X ≤ 98) = (98 − 90) / (120 − 50) = 8/70 ≈ 0.1143 (about 11.4%). Minitab draws the rectangle and shades the corresponding area.
- In class, the instructor emphasized two practical tips for uniform distributions:
  - The mean is simply (a + b)/2.
  - Memorize the mean formula and the area approach to probabilities; the standard deviation, (b − a)/√12, can be plugged in when a calculation calls for it.
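The uniform formulas are short enough to check by hand, but a small Python sketch makes the rectangle-area idea explicit. The function names are hypothetical, and clipping the interval to [a, b] is an added convenience here (the notes assume the interval already lies inside [a, b]).

```python
import math

def uniform_prob(c, d, a, b):
    """P(c <= X <= d) for X ~ U(a, b); the interval is clipped to [a, b]."""
    if b <= a:
        raise ValueError("need a < b")
    lo, hi = max(a, c), min(b, d)
    return max(0.0, hi - lo) / (b - a)

def uniform_mean(a, b):
    return (a + b) / 2

def uniform_sd(a, b):
    return (b - a) / math.sqrt(12)

# Lecture example: X ~ U(2, 6), P(3 <= X <= 5) = 2/4
p_lecture = uniform_prob(3, 5, a=2, b=6)

# Practice example: X ~ U(50, 120), P(90 <= X <= 98) = 8/70
p_practice = uniform_prob(90, 98, a=50, b=120)

print(p_lecture, round(p_practice, 4))
print(uniform_mean(2, 6), round(uniform_sd(2, 6), 4))
```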
Binomial distribution (context and key formulas referenced from Lecture 4)
- The binomial setting (brief recap from module 4): number of successes in n independent Bernoulli trials with probability p of success each trial.
- Notation: X ∼ Binomial(n, p).
- Mean and variance:
  - E[X] = np
  - Var(X) = np(1 − p)
- Probability mass function (pmf): P(X = k) = C(n, k) p^k (1 − p)^{n − k}, where C(n, k) = n! / (k!(n − k)!) is the binomial coefficient ("n choose k").
- Example discussed: with n = 5 and p = 0.8 (e.g., penalty kicks with an 80% success rate), you can compute the distribution of the number of successes; the mean is np = 4 and the variance is np(1 − p) = 0.8, so the setup illustrates both the expected value and the spread.
- The professor walked through building the binomial table by calculating P(X = k) for k = 0, 1, …, n, and showed how E[X] = ∑_k k · P(X = k) gives the mean as a probability-weighted sum (the idea of expected value).
- A brief note on factorials and combinations: C(5, 3) = 5! / (3! 2!) = 10, the number of ways to get exactly 3 successes in 5 trials.
- The instructor suggested that on exams you are more likely to use software (Minitab) than manual arithmetic to obtain binomial probabilities, but factorials and combinations help you understand the underlying counting principle.
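The binomial table for the penalty-kick example can be rebuilt in a few lines of Python. This is a sketch of the hand calculation described above, not the Minitab procedure; `binom_pmf` is a name chosen here for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 5, 0.8  # five penalty kicks, 80% success rate (lecture example)

# Build the table P(X = k) for k = 0..n.
table = {k: binom_pmf(k, n, p) for k in range(n + 1)}
for k, prob in table.items():
    print(f"P(X = {k}) = {prob:.5f}")

# Expected value as a probability-weighted sum; it should match n * p = 4.
mean = sum(k * prob for k, prob in table.items())
# Variance from the same table; it should match n * p * (1 - p) = 0.8.
variance = sum((k - mean) ** 2 * prob for k, prob in table.items())

print(f"mean ≈ {mean:.4f}, variance ≈ {variance:.4f}")
print(comb(5, 3))  # 10 ways to get exactly 3 successes in 5 trials
```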
Quiz details (closing section in the transcript)
- Quiz 2 scope: covers modules 3 and 4 (not module 5).
- Window: Friday 9 AM to Sunday 9 PM; 30-minute quiz; open notes allowed; accommodations handled as needed.
- Quick reminder: module 5 material is covered separately in class, and quiz 3 will cover it later.
- Office hours: posted time (e.g., Webex) for questions; students should raise questions during office hours if they’re stuck.
- Practical exam strategy mentioned: practice with ACN/AP questions; use similar problems to build familiarity; AI tools can help generate practice questions.
Practical study tips highlighted in the lecture
- Practice with Minitab early and often; it shows the exact steps and the graph, which helps you understand the area under the normal curve.
- When solving problems, first identify whether you are given x (and asked for a probability) or given a probability (and asked for x); that determines which Minitab option to use (x value vs. probability).
- For normal computations, you can verify results by converting to a z-score and using Φ(z) from standard normal tables or software.
- For empirical rule and outliers, remember the rough percentages and z-thresholds; outliers roughly correspond to |z| > 3.
Quick recall of key formulas (LaTeX notations)
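- Z-score transformation: $z = \dfrac{x - \mu}{\sigma}$
- Normal distribution notation: $X \sim \mathcal{N}(\mu, \sigma^2)$; Standard normal: $Z \sim \mathcal{N}(0, 1)$
- Standard normal relationship: $Z = (X - \mu)/\sigma$ with $Z$ following $\mathcal{N}(0,1)$.
- Uniform distribution: $X \sim U(a, b)$; $f_X(x) = \dfrac{1}{b - a}$ for $a \le x \le b$
- Mean and SD of Uniform: $E[X] = \dfrac{a + b}{2}$, $\sigma = \dfrac{b - a}{\sqrt{12}}$
- Binomial distribution: $X \sim \mathrm{Binomial}(n, p)$; $E[X] = np$, $\mathrm{Var}(X) = np(1 - p)$
- Binomial pmf: $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$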
- Empirical rule (approximate): within 1σ: ~68%; within 2σ: ~95%; within 3σ: ~99.7%
- Outlier rule: |Z| > 3 indicates an outlier.
- Example inverse problem (cumulative): if $P(X \le x) = p$, find x; with standard normal, solve $\Phi(z) = p$ where $z = (x-\mu)/\sigma$.
Quick reference values mentioned in the lecture (contextual)
- P(X < 19.5) for X ∼ 𝒩(18, 5^2) ≈ 0.618 (the Minitab result was quoted in the lecture as roughly 0.61).
- For z = 0.3 (i.e., X = 19.5 when μ = 18, σ = 5): P(Z < 0.3) ≈ 0.618.
- In an example inversion: P(X ≤ x) = 0.30 yields x ≈ 15.38 (for μ = 18, σ = 5).
- Inversion with right-tail probability: P(X ≥ x) = 0.16 yields x ≈ 22.97 (for μ = 18, σ = 5).
Summary: why these topics matter
- Normal distribution and z-scores enable meaningful comparisons across diverse data (athletic performance, health metrics, exam scores, etc.).
- The standard normal provides a universal reference for probability calculations, enabling quick lookups and cross-checks with software.
- The empirical rule gives quick, practical bounds for typical datasets; the 3σ rule helps flag outliers.
- The uniform and binomial distributions extend these ideas to bounded continuous outcomes (uniform) and discrete counts of successes (binomial), with practical formulas for probabilities, means, and variances.
Final reminder from the instructor
- Module 3 and Module 4 content will feed into Quiz 2; Module 5 material will feed into Quiz 3.
- Practice with Minitab and similar questions to prepare for open-note, time-limited quizzes.
- Office hours will be available for questions; use AI-assisted practice questions to build familiarity with problem formats and solution steps.