Study Notes on Statistical Inference - Population Proportion
Test of Significance for a Proportion
Introduction
These notes cover statistical inference for a population proportion.
The main topic is how to carry out a test of significance for a proportion.
Four Steps in Carrying Out a Significance Test
State the null and alternative hypotheses.
Check conditions and then calculate the test statistic.
Find the P-value using the appropriate distribution.
Make a decision and state your conclusion in the context of the specific setting of the test.
The same four-step structure applies to every significance test in statistical inference, not only to tests for a proportion.
Examples of Significance Tests
Parliamentarian Vote Example:
Null Hypothesis (H0): $p = 0.5$ (The proportion of constituents favoring the proposal is 50%).
Alternative Hypothesis (Hₐ): $p > 0.5$ (More than 50% of constituents favor the proposal).
Pharmaceutical Company Claim:
Null Hypothesis (H0): $p = 0.2$ (The proportion of patients experiencing side effects is 20%).
Alternative Hypothesis (Hₐ): $p < 0.2$ (Less than 20% of patients experience side effects).
Children Raised by Grandparents:
Null Hypothesis (H0): $p = 0.05$ (5% of children are raised by grandparents).
Alternative Hypothesis (Hₐ): $p \neq 0.05$ (The proportion has changed from 5%).
Step 1: State the Hypotheses
Null hypothesis (H0): $p = p_0$, where $p_0$ is the null value.
Alternative hypothesis (Hₐ): There are three forms depending on the direction:
One-sided, right-sided: $H_a: p > p_0$
One-sided, left-sided: $H_a: p < p_0$
Two-sided: $H_a: p \neq p_0$
Step 2: Condition Check
Step 2a: Check Conditions
Random Sample Requirement:
The sample should ideally be a random sample from the population.
While a random sample is preferred, it is acceptable if the sample is representative of the population relevant to the question.
Sample Size Requirement:
The sample size must be sufficient to ensure the approximate normality of the sampling distribution.
Specifically, check that both $np_0$ and $n(1 - p_0)$ are at least 10 to confirm this condition.
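The sample size requirement above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `sample_size_ok` and its `threshold` parameter are hypothetical, and the inputs in the usage lines are made-up numbers.

```python
def sample_size_ok(n: int, p0: float, threshold: float = 10.0) -> bool:
    """Return True when both n*p0 and n*(1 - p0) are at least `threshold`."""
    expected_successes = n * p0          # expected count of successes under H0
    expected_failures = n * (1 - p0)     # expected count of failures under H0
    return expected_successes >= threshold and expected_failures >= threshold


# Hypothetical inputs, chosen only to show one passing and one failing case.
print(sample_size_ok(100, 0.5))   # expected counts are 50 and 50
print(sample_size_ok(50, 0.05))   # expected counts are 2.5 and 47.5
```

Note that the check uses the null value $p_0$, not the sample proportion, because the sampling distribution is evaluated under the assumption that H0 is true.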
Step 3: Test Statistic and P-value
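The P-value measures the strength of the evidence against the null hypothesis: it is the probability, computed assuming H0 is true, of getting a test statistic at least as extreme as the one observed, in the direction given by Hₐ.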
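These notes do not write out the test statistic, so the sketch below assumes the standard one-proportion z statistic, $z = (\hat{p} - p_0) / \sqrt{p_0(1 - p_0)/n}$, where $\hat{p}$ is the sample proportion, and reads the P-value from the standard normal distribution. The function name `one_proportion_z_test`, the `alternative` labels, and the numbers in the usage line are hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes: int, n: int, p0: float,
                          alternative: str = "two-sided"):
    """Return (z, p_value) for a one-proportion z-test of H0: p = p0."""
    p_hat = successes / n                     # sample proportion
    std_error = sqrt(p0 * (1 - p0) / n)       # standard error under H0
    z = (p_hat - p0) / std_error
    std_normal = NormalDist()                 # standard normal, mean 0 and sd 1
    if alternative == "greater":              # Ha: p > p0, right tail
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":               # Ha: p < p0, left tail
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":          # Ha: p != p0, both tails
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Hypothetical data: 60 of 100 respondents in favor, testing H0: p = 0.5 against Ha: p > 0.5.
z, p_value = one_proportion_z_test(60, 100, 0.5, alternative="greater")
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

The three branches mirror the three forms of the alternative hypothesis: the right tail for $p > p_0$, the left tail for $p < p_0$, and both tails for $p \neq p_0$.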
Step 4: Decision and Conclusion
The decision regarding the null hypothesis is based on the P-value and the pre-defined level of significance, denoted by $\alpha$ (alpha).
Decision Rules:
If the P-value $\leq \alpha$, reject the null hypothesis and conclude that there is statistically significant evidence for the alternative hypothesis.
If the P-value $> \alpha$, fail to reject the null hypothesis and conclude that there is insufficient evidence to support the alternative hypothesis.
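The two decision rules can be written as a single comparison. The helper below is a hypothetical sketch (the name `decide` and the returned strings are illustrative), and the P-values passed to it are example inputs.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 exactly when the P-value is at most alpha."""
    if p_value <= alpha:
        return "reject H0"
    return "fail to reject H0"


print(decide(0.03))    # P-value below alpha
print(decide(0.610))   # P-value above alpha
```

The significance level should be fixed before looking at the data, which is why it is passed in as a parameter and not derived from the P-value.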
Example: Mendel’s Peas
Context: Purebred smooth and wrinkled peas were crossed. The first generation (F1) was all smooth. In the second generation (F2), the resulting counts were:
Smooth Peas: 5474
Wrinkled Peas: 1850
The test asks whether these data are consistent with the dominant (smooth) trait occurring in 75% of F2 peas.
Test Structure:
Significance level: $\alpha = 0.05$.
Hypotheses:
Null Hypothesis (H0): Proportion of dominant trait = 0.75.
Alternative Hypothesis (Hₐ): Proportion of dominant trait $\neq 0.75$.
Decision:
With a calculated P-value of 0.610, which is greater than $\alpha = 0.05$, we fail to reject H0.
Conclusion: There is no significant evidence, at the 5% level of significance, that the proportion of the dominant (smooth) trait occurring in F2 is different from 75%. The data are therefore consistent with a 75% dominant trait occurrence in F2. Note that failing to reject H0 does not prove H0 is true; it only means the data give no strong evidence against it.
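The Mendel's peas calculation can be reproduced from the counts given above. This is a sketch that assumes the standard one-proportion z statistic and a two-sided normal P-value; the variable names are illustrative. Computed without intermediate rounding, the P-value comes out near 0.608. The 0.610 quoted in these notes appears to correspond to rounding z to -0.51 before looking it up in a normal table.

```python
from math import sqrt
from statistics import NormalDist

smooth, wrinkled = 5474, 1850        # F2 counts from the notes
n = smooth + wrinkled                # total sample size
p0 = 0.75                            # null value for the dominant (smooth) trait
alpha = 0.05                         # significance level from the notes

# Step 2: sample size condition, then the test statistic.
conditions_met = n * p0 >= 10 and n * (1 - p0) >= 10
p_hat = smooth / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Step 3: two-sided P-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 4: decision at level alpha.
decision = "reject H0" if p_value <= alpha else "fail to reject H0"

print(f"n = {n}, p_hat = {p_hat:.4f}, conditions met: {conditions_met}")
print(f"z = {z:.3f}, P-value = {p_value:.3f}, decision: {decision}")
```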
Effect of Sample Size on Statistical Significance
Observations based on different sample sizes:
Sample 1: Fails to reject H0.
Sample 2: Rejects H0.
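Conclusion: For the same observed difference between the sample proportion and $p_0$, a larger sample gives a smaller standard error, a test statistic farther from zero, and a smaller P-value. A difference that is not statistically significant in a small sample can therefore be significant in a larger one.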
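The sample size effect can be illustrated numerically. The data for Sample 1 and Sample 2 are not given in these notes, so the sketch below uses two hypothetical samples with the same sample proportion (0.55) but different sizes, tested two-sided against $p_0 = 0.5$ with the standard one-proportion z statistic. The function name `two_sided_p_value` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(successes: int, n: int, p0: float) -> float:
    """Two-sided P-value for a one-proportion z-test of H0: p = p0."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical samples: same sample proportion (0.55), different sizes.
p_small = two_sided_p_value(55, 100, 0.5)     # n = 100
p_large = two_sided_p_value(550, 1000, 0.5)   # n = 1000

print(f"n = 100:  P-value = {p_small:.4f}")
print(f"n = 1000: P-value = {p_large:.4f}")
```

At $\alpha = 0.05$, the smaller hypothetical sample fails to reject H0 while the larger one rejects it, even though both show the same five-point difference from $p_0$. Statistical significance therefore says nothing by itself about how large or important the difference is.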
Conditions Summary
To validate tests conducted on proportions, the following conditions must be checked:
The sample should be a random sample, or at least representative of the population relevant to the question.
Use only the observed sample data (no plus four method).
The z-test is valid if both $np_0$ and $n(1 - p_0)$ are at least 10.