Statistics Practice Problems

I. Average Sleep for Pre-Med Majors

  • Objective: Determine the average hours of sleep for pre-med majors at the University of Texas.

  • Survey Sample: Eight individuals.

  • Data:

    • Hours of sleep: 6, 4, 7, 7, 6, 5, 7, 6

1. Calculations

  • a. Mean Hours of Sleep:

    • Formula: $\text{Mean}(X) = \frac{\sum X}{n}$

    • Calculation: $(6 + 4 + 7 + 7 + 6 + 5 + 7 + 6) / 8 = 6$ hours

  • b. Standard Deviation: Given as 1.07 hours.

  • c. Standard Error of the Mean (SEM):

    • Formula: $\text{SEM} = \frac{\text{Standard Deviation}}{\sqrt{n}}$

    • Calculation: $\text{SEM} = \frac{1.07}{\sqrt{8}} \approx 0.38$

    • Confidence Interval (95%): Multiply SEM by 2 to get the margin on either side of the mean:

    • $95\%\ \text{CI} = \text{mean} \pm 2 \times \text{SEM} = 6 \pm (2 \times 0.38) = 6 \pm 0.76$ hours

  • d. Bar Graph Representation:

    • Mean: 6 hours

    • Error Bars:

      • Upper Limit: $\text{mean} + 2\,\text{SEM} = 6 + 0.76 = 6.76$

      • Lower Limit: $\text{mean} - 2\,\text{SEM} = 6 - 0.76 = 5.24$
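The calculations in parts a through d can be reproduced with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative, and it assumes the given 1.07 is the sample standard deviation (n − 1 in the denominator), which is what `statistics.stdev` computes.

```python
import math
import statistics

# Hours of sleep reported by the eight surveyed pre-med majors.
hours = [6, 4, 7, 7, 6, 5, 7, 6]

n = len(hours)
mean = statistics.mean(hours)
sd = statistics.stdev(hours)   # sample SD (assumption: n - 1 denominator)
sem = sd / math.sqrt(n)        # standard error of the mean
margin = 2 * sem               # approximate 95% margin (2 x SEM)

# Error-bar limits for the bar graph in part d.
lower, upper = mean - margin, mean + margin

print(f"mean = {mean:.2f}, SD = {sd:.2f}, SEM = {sem:.2f}")
print(f"95% CI error bars: {lower:.2f} to {upper:.2f}")
```

The factor 2 is a rounded stand-in for the 1.96 multiplier of the normal distribution.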

2. Harvard Study Comparison

  • e. Prediction on Standard Deviation:

    • Prediction: Harvard's standard deviation will be higher than the University of Texas's because the data are more spread out (hours ranged from 2 to 12).

  • f. Bonus: Calculate Harvard's standard deviation using Excel or calculator.

    • Given result: Standard deviation = 3.5

    • Comparison: Prediction matches; standard deviation is higher due to more variability in the data.

II. Water Consumption in Athletes

  • Objective: Determine if football players consume more water than soccer players.

1. Hypothesis and Data Analysis

  • a. Null Hypothesis (H0): There is no difference in the average ounces of water consumed between football and soccer players.

  • b. Create a bar graph comparing the sample means for the two groups of players, including 95% CI error bars (± 2 SEM).

  • c. Conclusion:

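    • The sample means differ (70 vs. 64 ounces), but if the ± values below are 95% CI half-widths (± 2 SEM), the intervals (57 to 83 and 54 to 74 ounces) overlap, so the data alone give only weak evidence for rejecting H0.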

  • Quantified Water Consumption:

    • Football: 70 ± 13 ounces

    • Soccer: 64 ± 10 ounces
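One way to check the conclusion against the numbers is to turn each mean ± value into an interval and test for overlap. The sketch below assumes the ± values are the 95% CI half-widths (2 × SEM) from part b; the helper names `interval` and `intervals_overlap` are illustrative.

```python
def interval(mean, half_width):
    """Return (lower, upper) bounds for mean ± half_width."""
    return (mean - half_width, mean + half_width)


def intervals_overlap(a, b):
    """True if closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Assumption: the ± values are 95% CI half-widths (2 x SEM), in ounces.
football = interval(70, 13)
soccer = interval(64, 10)
overlap = intervals_overlap(football, soccer)

print(f"Football: {football[0]} to {football[1]} oz")
print(f"Soccer:   {soccer[0]} to {soccer[1]} oz")
print(f"Error bars overlap: {overlap}")
```

As a rule of thumb, overlapping 95% CI error bars are read as insufficient evidence to reject H0, while non-overlapping bars suggest a real difference between the groups.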

III. Understanding Standard Error Across Different Sample Sizes

1. Data Set Comparison

  • Given sample sizes:

    • Dataset 1: n = 15

    • Dataset 2: n = 30

    • Dataset 3: n = 45

a. Calculate Standard Error for Each Dataset
  • Formula: $\text{Standard Error} = \frac{\text{Standard Deviation}}{\sqrt{n}}$

  • Standard deviation for all datasets = 2.3

    • a. Dataset 1 (n=15): $\text{SE} = \frac{2.3}{\sqrt{15}} \approx 0.59$

    • b. Dataset 2 (n=30): $\text{SE} = \frac{2.3}{\sqrt{30}} \approx 0.42$

    • c. Dataset 3 (n=45): $\text{SE} = \frac{2.3}{\sqrt{45}} \approx 0.34$

2. Effect of Sample Size on Standard Error

  • As sample size increases, standard error decreases.

    • Justification: Standard error is the standard deviation divided by the square root of the sample size, so for a fixed standard deviation a larger n gives a larger denominator and a smaller standard error.
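The three standard errors, and the trend across sample sizes, can be checked with a few lines of Python. This is a minimal sketch; the variable names are illustrative.

```python
import math

sd = 2.3                      # standard deviation shared by all datasets
sample_sizes = [15, 30, 45]   # Dataset 1, 2, 3

# SE = SD / sqrt(n) for each sample size.
standard_errors = {n: sd / math.sqrt(n) for n in sample_sizes}

for n, se in standard_errors.items():
    print(f"n = {n}: SE = {se:.2f}")
```

Because SE scales with 1/√n, tripling n from 15 to 45 shrinks SE by a factor of √3 ≈ 1.73, not by a factor of 3.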

IV. Statistical Metrics for Various Data Sets

  • Calculate Median, Variability and Box Plots

1. Identify Medians and Variability

  • a. Median for Dataset A:

  • b. Median for Dataset B:

  • c. Determining which dataset has more variability:

2. Box and Whisker Plot Analysis

  • Analyze the student scores: read the lowest score, highest score, and median from the plot, and the percentage of students at or below each threshold:

    • a. Lowest score:

    • b. Highest score:

    • c. Median:

    • d. % of students scoring ≤ 70:

    • e. % of students scoring ≤ 80:

V. Melanoma Behavioral Statistics in Lizards

  • Data Analysis of Lizard Push-Ups:

    • Sample: 11 lizards counted

    • Data Points: 300, 105, 400, 521, 411, 500, 550, 52, 315, 75, 370

    • Metrics: Minimum: __, Q1: __, Q2: __, Q3: __, Maximum: __

    • Definition of IQR:
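The five-number summary for the push-up data can be computed as sketched below. Quartile conventions differ between textbooks and software; this sketch assumes the median-of-halves method (the middle value is left out of both halves when n is odd), and the function name `five_number_summary` is illustrative. The IQR is taken as Q3 − Q1.

```python
import statistics


def five_number_summary(values):
    """Min, Q1, Q2 (median), Q3, max using the median-of-halves method."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two data points")
    half = len(data) // 2
    lower = data[:half]               # values below the middle
    upper = data[len(data) - half:]   # values above the middle
    return {
        "min": data[0],
        "q1": statistics.median(lower),
        "q2": statistics.median(data),
        "q3": statistics.median(upper),
        "max": data[-1],
    }


# Push-up counts for the 11 lizards.
push_ups = [300, 105, 400, 521, 411, 500, 550, 52, 315, 75, 370]

summary = five_number_summary(push_ups)
iqr = summary["q3"] - summary["q1"]   # interquartile range

print(summary)
print(f"IQR = {iqr}")
```

Other quartile methods, such as the interpolating ones used by many spreadsheet functions, can give slightly different Q1 and Q3 values for the same data.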

VI. Lead Concentration in Rainwater Samples

1. Community Comparison

  • A box and whisker plot shows lead concentrations in rainwater from two communities.

  • Determine which community has the lower median concentration.

  • Analyze variability.

    • a. Community A vs Community B for median and variability.

  • Local health guideline: is a lead level of ≥ 10 µg/L included in either community's IQR?

VII. Social Media Habits of Students

  • Researcher’s Hypothesis: 9th graders use more social media than 11th graders.

1. Data Comparison

  • Sample size: 10 each from both grades.

    • a. 11th grade data points: 10, 25, 85, 37, 42, 44, 55, 56, 59, 201

    • b. 9th grade data points: 40, 75, 10, 54, 66, 90, 42, 205, 77, 84

  • Construct box and whisker plots for both datasets.

  • Comparison Outcomes: Median usage, IQRs for both grades.

  • Predict which grade uses more social media: 9th graders (median 70.5 mins) vs. 11th graders (median 49.5 mins).
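The medians and IQRs needed for the two box and whisker plots can be computed as sketched below. The sketch assumes the median-of-halves quartile method and that the data points are minutes of use; the function name `quartiles` is illustrative.

```python
import statistics


def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    data = sorted(values)
    half = len(data) // 2
    q1 = statistics.median(data[:half])
    q3 = statistics.median(data[len(data) - half:])
    return q1, statistics.median(data), q3


grade_11 = [10, 25, 85, 37, 42, 44, 55, 56, 59, 201]
grade_9 = [40, 75, 10, 54, 66, 90, 42, 205, 77, 84]

results = {}
for label, minutes in (("9th", grade_9), ("11th", grade_11)):
    q1, median, q3 = quartiles(minutes)
    results[label] = {
        "median": median,
        "iqr": q3 - q1,
        "mean": statistics.mean(minutes),
    }
    print(f"{label} grade: median = {median}, IQR = {q3 - q1}, "
          f"mean = {results[label]['mean']:.1f}")
```

The medians match the 70.5 and 49.5 figures quoted above. The means are pulled upward by the single large value in each group (205 and 201), which is one reason box plots report the median.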

VIII. Cancer Research on HER2-positive Breast Cancer Using Drugs

1. Overview of Treatment Effectiveness

  • Normal vs Cancer Cell Differences: Cancer cells possess more HER2 receptors than normal cells, facilitating rapid growth.

2. Processing Results

  • a. Apoptosis following treatment with Herceptin and Perjeta; means and comparison results.

    • Herceptin Data: Mean apoptosis: 20.35, STDEV, SEM

    • Perjeta Data: Mean apoptosis: 25, STDEV, SEM

    • Interpretation of apoptosis data.

3. Combination Treatments Analysis

  • Repeat the procedure for the Herceptin + Perjeta combination: combine the results and calculate the mean and 95% CI.

    • Comparison to validate treatment effectiveness against controls.

  • Final Recommendation: Study the combination treatment further because it shows higher effectiveness with fewer side effects than the alternative treatments.