Previous Research: Use \hat{p} from a similar study.
Pilot Study: Run a small pilot study (e.g., 10-30 cases) to estimate \hat{p}.
Maximize \hat{p}(1 - \hat{p}):
The maximum value of \hat{p}(1 - \hat{p}) occurs when \hat{p} = 0.5, so using 0.5 gives the largest (most conservative) required sample size.
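As a quick numerical check of the claim above, the following minimal sketch evaluates the product on a grid of candidate proportions. The grid spacing of 0.01 and the variable names are illustrative assumptions, not part of the notes.

```python
# Evaluate p * (1 - p) on a grid of candidate proportions (spacing is an assumption).
candidates = [i / 100 for i in range(101)]  # 0.00, 0.01, ..., 1.00
products = {p: p * (1 - p) for p in candidates}

# The candidate with the largest product is the worst case for sample-size planning.
worst_case_p = max(products, key=products.get)
print(worst_case_p, products[worst_case_p])  # 0.5 0.25
```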
Determining Sample Size
Determine the desired confidence level (which affects z^*).
Determine the desired accuracy (margin of error).
Estimate \hat{p} using one of the methods above.
Calculate the required sample size n using the formula.
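The steps above can be sketched in Python. The formula itself is not written out in this section, so this sketch assumes the standard rearrangement of the margin of error, n = (z^* / ME)^2 \hat{p}(1 - \hat{p}), rounded up to a whole case. The function name required_sample_size and the example inputs are hypothetical.

```python
import math

def required_sample_size(z_star, margin_of_error, p_hat=0.5):
    """Smallest n for which z* * sqrt(p_hat * (1 - p_hat) / n) <= margin_of_error.

    Assumes the standard formula n = (z* / ME)^2 * p_hat * (1 - p_hat).
    The default p_hat = 0.5 is the conservative choice.
    """
    n = (z_star / margin_of_error) ** 2 * p_hat * (1 - p_hat)
    return math.ceil(n)  # always round up so the target accuracy is met

# 95% confidence (z* = 1.96), margin of error 0.05, no prior estimate of p_hat.
print(required_sample_size(1.96, 0.05))       # 385
# Same target, but with an estimate of 0.8 from previous research or a pilot study.
print(required_sample_size(1.96, 0.05, 0.8))  # 246
```

Using an estimate far from 0.5 lowers the required sample size, which is why a pilot study or previous research can be worth the effort.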
Tongue Rolling Data
Data collected: 96 people can roll their tongues, 23 cannot.
Total sample size: 119.
Sample proportion: \hat{p} = \frac{96}{119}.
Data Visualization and Summary
Data type: Categorical (Yes/No).
Appropriate plot: Bar chart.
Y-axis: Frequency.
Point Estimate
Point estimate: The sample proportion, \hat{p} = \frac{96}{119}.
Confidence Interval
Confidence level: 95% (commonly used).
z^* value for 95% confidence: 1.96.
Formula for Confidence Interval: \hat{p} \pm z^* \cdot SE
Where SE = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}
Calculating Confidence Interval in Excel
Sample proportion: 96/119 ≈ 0.806723
Standard error: \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} ≈ 0.036
Margin of error: 1.96 * 0.036 ≈ 0.071
Lower limit: 0.806723 - 0.071 ≈ 0.74
Upper limit: 0.806723 + 0.071 ≈ 0.88
Confidence interval: (0.74, 0.88)
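The same arithmetic can be reproduced outside a spreadsheet. The following is a minimal Python sketch of the calculation above; the variable names are illustrative, and intermediate values are left unrounded, which still gives the same interval to two decimal places.

```python
import math

# Tongue-rolling counts from the notes.
can_roll, cannot_roll = 96, 23
n = can_roll + cannot_roll            # 119
p_hat = can_roll / n                  # sample proportion, about 0.8067

z_star = 1.96                         # critical value for 95% confidence
se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error, about 0.036
margin = z_star * se                      # margin of error, about 0.071

lower, upper = p_hat - margin, p_hat + margin
print(f"({lower:.2f}, {upper:.2f})")  # (0.74, 0.88)
```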
Interpretation
We are 95% confident that the true proportion of STAT101 students who can roll their tongue is between 74% and 88%.
Connection to Dominant Traits
Tongue rolling is said to be a dominant trait, theoretically present in 75% of people.
Since 0.75 falls within the calculated confidence interval (0.74, 0.88), our data does not provide evidence against this theory.
Home Field Advantage
Investigating whether there is a home-field advantage in sports (e.g., baseball).
Advantage: More likely to win at home.
Determining Advantage
Look at the proportion of games won at home.
Null hypothesis (no advantage): proportion of home wins = 50% (0.5).
Baseball data: 2,430 games, home team won in 54.9% of games.
Sample size n = 2430
Sample proportion of wins \hat{p} = 0.549
Hypothesis Testing
We use a hypothesis test to determine if there is enough evidence for a home-field advantage.
A hypothesis test is used to find evidence against a null hypothesis in support of an alternative one.
General Setup for Hypothesis Test for a Proportion
Null hypothesis (H_0): p = p_0, where p_0 is the hypothesized proportion.
Alternative hypothesis (H_a):
One-tailed test: p > p_0 or p < p_0
Two-tailed test: p \neq p_0
Standard Error in Hypothesis Testing
Assumption: The null hypothesis is true.
Use p_0 (hypothesized proportion) instead of \hat{p} in the standard error calculation.
Formula: SE = \sqrt{\frac{p_0(1 - p_0)}{n}}
Test Statistic (Z-score)
Standardize the sample statistic (\hat{p}) to calculate a z-score.
Formula: z = \frac{\hat{p} - p_0}{SE}
Checking Assumptions
Verify n p_0 \geq 10 and n(1 - p_0) \geq 10 to ensure the sampling distribution of \hat{p} is approximately normal.
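Note: this check was initially skipped for the tongue-rolling confidence interval and corrected afterwards. For a confidence interval the check uses \hat{p} in place of p_0: n\hat{p} = 96 and n(1 - \hat{p}) = 23, both at least 10.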
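The condition can be wrapped in a small helper so it is not forgotten again. This is a minimal sketch; the function name normal_approx_ok and the threshold parameter are hypothetical, with the threshold of 10 taken from the rule above.

```python
def normal_approx_ok(n, p, threshold=10):
    """True when both expected counts, n*p and n*(1-p), reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

# Home-field test: use the hypothesized proportion p_0 = 0.5.
print(normal_approx_ok(2430, 0.5))      # True
# Tongue-rolling confidence interval: use the sample proportion 96/119.
print(normal_approx_ok(119, 96 / 119))  # True
# A sample that is too small for the normal approximation.
print(normal_approx_ok(15, 0.5))        # False
```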
Home Field Advantage Hypothesis Test (Continued)
Hypotheses
Null Hypothesis (H_0): The proportion of home wins is 0.5 (p = 0.5).
Alternative Hypothesis (H_a): There is a home-field advantage, so the proportion of home wins is greater than 0.5 (p > 0.5). This is a one-tailed test.
Significance Level: \alpha = 0.05 (5%).
Test Statistic
Calculate the z-score (standardized test statistic).
Formula: z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}
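Plugging the baseball numbers into this formula can be sketched as follows. The z-score step follows the formula above directly. The p-value step is an assumed continuation that this section does not show: it takes the upper-tail area of the standard normal distribution (matching the one-tailed alternative p > 0.5) and compares it to \alpha.

```python
import math

n = 2430        # games
p_hat = 0.549   # observed proportion of home wins
p0 = 0.5        # hypothesized proportion under H_0
alpha = 0.05    # significance level

# Standard error uses p0 because the null hypothesis is assumed true.
se = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se
print(round(z, 2))  # 4.83

# Assumed next step (not shown in the notes): upper-tail p-value, P(Z >= z),
# computed from the complementary error function.
p_value = 0.5 * math.erfc(z / math.sqrt(2))
print(p_value < alpha)  # True
```

A z-score this far above zero means the observed home-win proportion would be very unlikely if the true proportion were 0.5.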