Statistics Practice Problems: Normal Distribution and Empirical Rule Study Guide

The Empirical Rule and Normal Distribution Fundamentals

  • The Empirical Rule (also known as the 68-95-99.7 rule) provides a statistical guideline for data that follows a normal distribution.
  • Standard Distribution Segments:
      * The area within one standard deviation ($\mu \pm 1\sigma$) of the mean encompasses approximately $68\%$ of the data. This is divided into two segments of $34\%$ on either side of the mean ($\mu$).
      * The area within two standard deviations ($\mu \pm 2\sigma$) of the mean encompasses approximately $95\%$ of the data. The additional area between $1\sigma$ and $2\sigma$ (on both the positive and negative ends) accounts for $13.5\%$ each ($95\% - 68\% = 27\%$, and $\frac{27\%}{2} = 13.5\%$).
      * The area within three standard deviations ($\mu \pm 3\sigma$) of the mean encompasses approximately $99.7\%$ of the data. The additional area between $2\sigma$ and $3\sigma$ accounts for approximately $2.35\%$ on each side ($99.7\% - 95\% = 4.7\%$, and $\frac{4.7\%}{2} = 2.35\%$).
      * The area beyond three standard deviations (the tails) accounts for the remaining $0.3\%$, which is divided into $0.15\%$ in each tail (below $\mu - 3\sigma$ and above $\mu + 3\sigma$).
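
The percentages above are rounded rules of thumb. As a rough check, the sketch below compares them with the exact areas under a normal curve, using `NormalDist` from Python's standard `statistics` module (Python 3.8 or later). The helper name `area_within` is an illustrative choice, not something the study guide defines.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
STANDARD_NORMAL = NormalDist()


def area_within(k: float) -> float:
    """Exact share of a normal distribution within k standard deviations of the mean."""
    return STANDARD_NORMAL.cdf(k) - STANDARD_NORMAL.cdf(-k)


# Compare the exact areas with the Empirical Rule's rounded figures.
for k, rule_of_thumb in ((1, 68.0), (2, 95.0), (3, 99.7)):
    exact = 100 * area_within(k)
    print(f"within {k} sd: exact {exact:.2f}% vs. Empirical Rule {rule_of_thumb}%")
```

The exact values differ slightly from the rounded ones, which is why answers obtained with the Empirical Rule are stated as approximations.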

Problem 1: Analyzing Employee Commute Times

  • Given Parameters:
      * Distribution Type: Normal Distribution.
      * Mean ($\mu$): $40\,\text{minutes}$.
      * Standard Deviation ($\sigma$): $5\,\text{minutes}$.
  • Objective: Calculate the percentage of employees with a commute time between $35$ and $50\,\text{minutes}$.
  • Step-by-Step Breakdown:
      * The lower bound of $35\,\text{minutes}$ is equal to $\mu - 1\sigma$ ($40 - 5 = 35$).
      * The upper bound of $50\,\text{minutes}$ is equal to $\mu + 2\sigma$ ($40 + (2 \times 5) = 50$).
      * Summing the distribution percentages from the Empirical Rule:
          * From $\mu - 1\sigma$ to $\mu$: $34\%$
          * From $\mu$ to $\mu + 1\sigma$: $34\%$
          * From $\mu + 1\sigma$ to $\mu + 2\sigma$: $13.5\%$
      * Final Calculation: $34\% + 34\% + 13.5\% = 81.5\%$.
      * Conclusion: Approximately $81.5\%$ of employees have a commute time between $35$ and $50\,\text{minutes}$.
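
The band-by-band summation in Problem 1 can be sketched as a small lookup. The table `BAND_PERCENT` and the function `empirical_percent` are hypothetical names introduced for illustration. The sketch assumes both bounds sit a whole number of standard deviations from the mean, which is the only case the Empirical Rule covers directly.

```python
# Empirical Rule band percentages, keyed by the z-value at which each band starts.
# For example, key -1 is the band from mu - 1*sigma up to mu.
BAND_PERCENT = {-3: 2.35, -2: 13.5, -1: 34.0, 0: 34.0, 1: 13.5, 2: 2.35}


def empirical_percent(mean: float, sd: float, low: float, high: float) -> float:
    """Sum the Empirical Rule bands between low and high (in data units)."""
    z_low = (low - mean) / sd
    z_high = (high - mean) / sd
    if not (z_low.is_integer() and z_high.is_integer()):
        raise ValueError("bounds must be whole standard deviations from the mean")
    if not (-3 <= z_low <= z_high <= 3):
        raise ValueError("bounds must lie within three standard deviations, low <= high")
    # Add up every band that starts at or after z_low and ends at or before z_high.
    return sum(BAND_PERCENT[z] for z in range(int(z_low), int(z_high)))


# Problem 1: mean 40 minutes, standard deviation 5 minutes, range 35 to 50 minutes.
print(empirical_percent(40, 5, 35, 50))
```

For Problem 1 the function adds the bands starting at $z = -1$, $0$, and $1$, which reproduces the $34\% + 34\% + 13.5\%$ sum worked out above.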

Problem 2: Television Consumption Habits in Teenagers

  • Given Parameters:
      * Group Size ($n$): $600\,\text{teenagers}$.
      * Mean ($\mu$): $14\,\text{hours/week}$.
      * Standard Deviation ($\sigma$): $2.5\,\text{hours/week}$.
  • Objective: Determine how many teenagers in the group of $600$ spend between $9$ and $19\,\text{hours}$ watching TV per week.
  • Distribution Analysis:
      * Lower bound calculation: $9\,\text{hours} = 14 - (2 \times 2.5) = \mu - 2\sigma$.
      * Upper bound calculation: $19\,\text{hours} = 14 + (2 \times 2.5) = \mu + 2\sigma$.
      * Assuming viewing times are normally distributed, the Empirical Rule places approximately $95\%$ of the population within $2$ standard deviations of the mean.
  • Numerical Estimation:
      * Total expected number = $95\% \times 600$.
      * Calculation: $0.95 \times 600 = 570$.
  • Conclusion: We would expect approximately $570\,\text{teenagers}$ to spend between $9$ and $19\,\text{hours}$ watching TV each week.
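
Turning a percentage into a headcount is a single multiplication. A minimal sketch for Problem 2 follows, where `expected_count` is a hypothetical helper name and the result is rounded to a whole person.

```python
def expected_count(percent: float, group_size: int) -> int:
    """Expected number of people in a group that fall inside a given percentage band."""
    return round(percent / 100 * group_size)


# Problem 2: mean 14 hours/week, standard deviation 2.5 hours/week, 600 teenagers.
mean, sd, group_size = 14.0, 2.5, 600

# Confirm that the stated range of 9 to 19 hours is mean +/- 2 standard deviations.
low, high = mean - 2 * sd, mean + 2 * sd
print(low, high)

# The Empirical Rule assigns about 95% to that range.
print(expected_count(95, group_size))
```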

Problem 3: Basketball Performance and Standard Deviations

  • Given Parameters:
      * Distribution Type: Normal Distribution.
      * Mean Score ($\mu$): $95\,\text{points}$.
      * Standard Deviation ($\sigma$): $2\,\text{points}$.
      * Observed Performance ($x$): $89\,\text{points}$.
  • Performance Assessment (Standard Deviations):
      * Calculate the Z-score (the number of standard deviations the value is from the mean).
      * Formula: $Z = \frac{x - \mu}{\sigma}$
      * Calculation: $Z = \frac{89 - 95}{2} = \frac{-6}{2} = -3$.
  • Interpretation:
      * A performance of $89$ points is exactly $3$ standard deviations below the mean score ($\mu - 3\sigma$).
      * Conclusion: The team performed significantly below average. In a normal distribution, only about $0.15\%$ of scores fall more than $3$ standard deviations below the mean, making this an extremely poor performance compared to the average match.
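
The Z-score calculation in Problem 3 can be sketched in a few lines. The function name `z_score` is illustrative. The exact lower-tail area from `statistics.NormalDist` is included only for comparison with the Empirical Rule's rounded $0.15\%$.

```python
from statistics import NormalDist


def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Problem 3: mean score 95 points, standard deviation 2 points, observed score 89 points.
z = z_score(89, 95, 2)
print(z)

# Exact share of a normal distribution lying below this Z-score, as a percentage.
lower_tail_percent = 100 * NormalDist().cdf(z)
print(f"{lower_tail_percent:.3f}%")
```

The exact tail is a little smaller than $0.15\%$ because the Empirical Rule rounds the area within three standard deviations to $99.7\%$.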

Problem 4: Fitness Performance and Recognition Thresholds

  • Scenario Part A: Group Expectations:
      * Group Size ($n$): $400\,\text{participants}$.
      * Mean ($\mu$): $50\,\text{push-ups}$.
      * Standard Deviation ($\sigma$): $8\,\text{push-ups}$.
      * Target Range: $42$ to $58\,\text{push-ups}$.
      * Range Analysis: $42 = 50 - 8 = \mu - 1\sigma$ and $58 = 50 + 8 = \mu + 1\sigma$.
      * Empirical Rule Percentage: approximately $68\%$ of the population falls within one standard deviation of the mean.
      * Calculation: $0.68 \times 400 = 272$.
      * Result: Approximately $272\,\text{participants}$ are expected to complete between $42$ and $58\,\text{push-ups}$.
  • Scenario Part B: Medal Eligibility (Top 2.5%):
      * Objective: Determine if someone doing $67\,\text{push-ups}$ deserves a medal.
      * Criteria: Exceed expectations and perform in the top $2.5\%$.
      * Statistical Threshold: In a normal distribution, the area more than $2$ standard deviations above the mean is $2.35\% + 0.15\% = 2.5\%$.
      * Cutoff for the top $2.5\%$: $\mu + 2\sigma = 50 + (2 \times 8) = 50 + 16 = 66\,\text{push-ups}$.
      * Evaluation: The participant performed $67\,\text{push-ups}$, which is greater than the cutoff of $66$.
      * Conclusion: Yes, the participant deserves a medal because their performance of $67$ push-ups places them within the top $2.5\%$ of the group.
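
Both parts of Problem 4 reduce to a few lines of arithmetic, sketched below. The variable names are illustrative. The exact 97.5th percentile from `NormalDist.inv_cdf` is an added cross-check, not part of the original solution; it lands slightly below $66$ because the exact multiplier is about $1.96$ rather than $2$.

```python
from statistics import NormalDist

# Problem 4: mean 50 push-ups, standard deviation 8 push-ups, 400 participants.
mean, sd, group_size = 50, 8, 400

# Part A: about 68% of participants fall within one standard deviation (42 to 58).
expected_within_one_sd = round(0.68 * group_size)
print(expected_within_one_sd)

# Part B: the Empirical Rule puts the top 2.5% above mean + 2 standard deviations.
rule_cutoff = mean + 2 * sd
# Cross-check: exact 97.5th percentile of a normal distribution with this mean and sd.
exact_cutoff = NormalDist(mean, sd).inv_cdf(0.975)
print(rule_cutoff, round(exact_cutoff, 2))

performance = 67
earns_medal = performance > rule_cutoff
print(earns_medal)
```

Under either cutoff, $67$ push-ups clears the threshold, so the conclusion does not depend on the rounding in the Empirical Rule.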

Problem 5: Nutritional Analysis of Adult Calorie Intake

  • Given Parameters:
      * Mean Daily Intake ($\mu$): $2{,}200\,\text{calories}$.
      * Standard Deviation ($\sigma$): $300\,\text{calories}$.
  • Task 1: Percentage between 1,900 and 3,100 calories:
      * Lower limit: $1{,}900 = 2{,}200 - 300 = \mu - 1\sigma$.
      * Upper limit: $3{,}100 = 2{,}200 + (3 \times 300) = \mu + 3\sigma$.
      * Empirical Rule Summation: $\text{Percent} = (\text{area from } \mu - 1\sigma \text{ to } \mu) + (\text{area from } \mu \text{ to } \mu + 3\sigma)$.
      * Calculation: $34\% + (34\% + 13.5\% + 2.35\%) = 34\% + 49.85\% = 83.85\%$.
      * Result: Roughly $83.85\%$ of adults consume between $1{,}900$ and $3{,}100\,\text{calories}$ per day.
  • Task 2: Expected headcount out of 500 adults:
      * Calculation: $0.8385 \times 500 = 419.25$.
      * Result: We would expect approximately $419\,\text{adults}$ out of $500$ to be within this range.
  • Task 3: Assessing an intake of 1,300 calories:
      * Z-score Calculation: $Z = \frac{1{,}300 - 2{,}200}{300} = \frac{-900}{300} = -3$.
      * This intake is exactly $3$ standard deviations below the mean ($\mu - 3\sigma$).
      * Percentage ranking: Only about $0.15\%$ of the population consumes less than this amount.
      * Conclusion: Someone eating $1{,}300\,\text{calories}$ per day would be considered on the extreme "low side." This is not "okay" or average; it is a significant statistical outlier relative to the typical intake in this population.
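
The three tasks in Problem 5 can be checked together. The sketch below assumes calorie intake is normally distributed, as the use of the Empirical Rule implies. The variable names are illustrative, and the exact area from `statistics.NormalDist` is included only as a comparison with the rounded Empirical Rule figure.

```python
from statistics import NormalDist

# Problem 5: mean 2,200 calories, standard deviation 300 calories.
intake = NormalDist(mu=2200, sigma=300)

# Task 1: Empirical Rule bands from mu - 1*sigma (1,900) to mu + 3*sigma (3,100).
rule_percent = 34 + 34 + 13.5 + 2.35
# Exact area between the same bounds, for comparison.
exact_percent = 100 * (intake.cdf(3100) - intake.cdf(1900))
print(f"rule: {rule_percent:.2f}%  exact: {exact_percent:.2f}%")

# Task 2: expected headcount out of 500 adults, rounded to a whole person.
headcount = round(rule_percent / 100 * 500)
print(headcount)

# Task 3: Z-score for an intake of 1,300 calories.
z = (1300 - 2200) / 300
print(z)
```

The exact area is close to the Empirical Rule figure, so the estimated headcount changes by at most about one person either way.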