Pilot Study: Run a small pilot study (e.g., 10-30 cases) to estimate p^.
Maximize p^∗(1−p^): Use p^=0.5, the value at which p^∗(1−p^) reaches its maximum (0.25). This gives the largest, most conservative required sample size.
Determining Sample Size
Determine the desired confidence level (which affects z∗).
Determine the desired accuracy (margin of error).
Estimate p^ using one of the methods above.
Calculate the required sample size n using the formula.
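The steps above can be sketched in Python. This is a minimal sketch, assuming the formula referred to is the standard one for a proportion, $n = (z^*/E)^2 \, \hat{p}(1-\hat{p})$, where $E$ is the margin of error. The function name `required_sample_size` and its parameters are illustrative and do not come from the notes.

```python
import math
from statistics import NormalDist

def required_sample_size(margin_of_error, p_hat=0.5, confidence=0.95):
    """Smallest n for which z* * sqrt(p_hat*(1-p_hat)/n) <= margin_of_error.

    Assumes the normal-approximation formula n = (z*/E)^2 * p_hat * (1 - p_hat).
    """
    # Two-sided critical value z* for the chosen confidence level
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z_star / margin_of_error) ** 2 * p_hat * (1 - p_hat)
    # Round up so the achieved margin of error does not exceed the target
    return math.ceil(n)

# Conservative choice p_hat = 0.5 versus a hypothetical pilot estimate of 0.8
print(required_sample_size(0.05))
print(required_sample_size(0.05, p_hat=0.8))
```

With a 5% margin of error at 95% confidence, the conservative choice p^=0.5 gives n=385. An estimate further from 0.5 requires a smaller sample.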
Tongue Rolling Data
Data collected: 96 people can roll their tongues, 23 cannot.
Total sample size: 119.
Sample proportion: p^ = 96/119 ≈ 0.807.
Data Visualization and Summary
Data type: Categorical (Yes/No).
Appropriate plot: Bar chart.
Y-axis: Frequency.
Point Estimate
Point estimate: The sample proportion, p^ = 96/119 ≈ 0.807.
Confidence Interval
Confidence level: 95% (commonly used).
z∗ value for 95% confidence: 1.96.
Formula for Confidence Interval: $\hat{p} \pm z^* \cdot SE$
Where $SE = \sqrt{\hat{p}(1-\hat{p})/n}$
Calculating Confidence Interval in Excel
Sample proportion: 96/119 ≈ 0.806723
Standard error: $\sqrt{\hat{p}(1-\hat{p})/n} = \sqrt{0.806723 \times 0.193277 / 119} \approx 0.036$
Margin of error: 1.96∗0.036 ≈ 0.071
Lower limit: 0.806723−0.071 ≈ 0.74
Upper limit: 0.806723+0.071 ≈ 0.88
Confidence interval: (0.74, 0.88)
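The same calculation can be reproduced outside Excel. The following Python sketch mirrors the spreadsheet steps using the tongue-rolling counts from the notes. The function name `proportion_ci` is illustrative, and z∗=1.96 is hard-coded as the default for 95% confidence.

```python
import math

def proportion_ci(successes, n, z_star=1.96):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n                      # sample proportion
    se = math.sqrt(p_hat * (1 - p_hat) / n)    # standard error using p_hat
    margin = z_star * se                       # margin of error
    return p_hat - margin, p_hat + margin

# Tongue-rolling data: 96 of 119 people can roll their tongues
lower, upper = proportion_ci(96, 119)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Rounded to two decimals, this gives the same limits as the spreadsheet calculation, (0.74, 0.88).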
Interpretation
We are 95% confident that the true proportion of STAT101 students who can roll their tongue is between 74% and 88%.
Connection to Dominant Traits
Tongue rolling is said to be a dominant trait, theoretically present in 75% of people.
Since 0.75 falls within the calculated confidence interval (0.74, 0.88), our data does not provide evidence against this theory.
Home Field Advantage
Investigating whether there is a home-field advantage in sports (e.g., baseball).
Advantage: More likely to win at home.
Determining Advantage
Look at the proportion of games won at home.
Null hypothesis (no advantage): proportion of home wins = 50% (0.5).
Baseball data: 2,430 games, home team won in 54.9% of games.
Sample size n=2430
Sample proportion of wins p^=0.549
Hypothesis Testing
We use a hypothesis test to determine if there is enough evidence for a home-field advantage.
A hypothesis test is used to find evidence against a null hypothesis in support of an alternative one.
General Setup for Hypothesis Test for a Proportion
Null hypothesis (H0): p = p0, where p0 is the hypothesized proportion.
Alternative hypothesis:
One-tailed test: p > p0 or p < p0
Two-tailed test: p ≠ p0
Standard Error in Hypothesis Testing
Assumption: The null hypothesis is true.
Use p0 (hypothesized proportion) instead of p^ in the standard error calculation.
Formula: $SE = \sqrt{p_0(1-p_0)/n}$
Test Statistic (Z-score)
Standardize the sample statistic (p-hat) to calculate a z-score.
Formula: $z = (\hat{p} - p_0)/SE$
Checking Assumptions
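Verify $n p_0 \geq 10$ and $n(1-p_0) \geq 10$ to ensure the sampling distribution of p^ is approximately normal.
This check was initially skipped for the tongue-rolling confidence interval and was done afterwards. There the observed counts are used in place of n∗p0 and n∗(1−p0): 96 who can roll and 23 who cannot, both at least 10, so the condition holds.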
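The condition is easy to check programmatically. The sketch below is illustrative only: the helper name `normal_approx_ok` and the `threshold` parameter are not from the notes. For a hypothesis test, pass the hypothesized p0. For a confidence interval, pass the sample proportion, which is equivalent to checking the observed counts.

```python
def normal_approx_ok(n, p, threshold=10):
    """Return True if both expected counts n*p and n*(1-p) reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

# Home-field test: n = 2430 games, hypothesized proportion p0 = 0.5
print(normal_approx_ok(2430, 0.5))

# Tongue-rolling interval: n = 119, sample proportion 96/119
print(normal_approx_ok(119, 96 / 119))
```

Both calls return True, so the normal approximation is reasonable in each case.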
Home Field Advantage Hypothesis Test (Continued)
Hypotheses
Null Hypothesis (H0): The proportion of home wins is 0.5 (p=0.5).
Alternative Hypothesis (Ha): There is a home-field advantage, so the proportion of home wins is greater than 0.5 (p > 0.5). This is a one-tailed test.
Significance Level: α=0.05 (5%).
Test Statistic
Calculate the z-score (standardized test statistic).
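As a worked illustration of this step, the Python sketch below plugs the baseball figures from the notes (n=2430, p^=0.549, p0=0.5) into the formulas for SE and z given earlier. The function name `one_proportion_z` is illustrative. The one-tailed p-value is an added step, computed from the standard normal distribution under the stated alternative p > 0.5.

```python
import math
from statistics import NormalDist

def one_proportion_z(p_hat, p0, n):
    """z-score for a sample proportion, with SE computed under the null (p0)."""
    se = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / se

# Baseball data from the notes
z = one_proportion_z(p_hat=0.549, p0=0.5, n=2430)

# Upper-tail p-value for the one-tailed alternative p > p0
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}")
print(f"p-value = {p_value:.2e}")
```

The z-score comes out at about 4.83, and the corresponding p-value is far below α=0.05.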