Week 7: Single Population Hypothesis Testing Notes
Learning Outcomes and Course Objectives
LO1: Develop an understanding of the concepts of point estimation.
LO2: Understand the methodology used to estimate the population mean ($\mu$).
LO3: Understand the methodology used to estimate the population proportion ($p$).
LO4: Understand the fundamental basics of hypothesis testing.
Overview of Weekly Content
Application of the principles of sampling distributions to estimate population parameters.
Understanding the basic steps involved in Hypothesis Testing.
Specific application of hypothesis testing for a population proportion ($p$).
Specific application of hypothesis testing for a population mean ($\mu$).
Case Study: Regional vs. Urban Migration and Wages
This week uses a specific researcher case study to illustrate hypothesis testing principles:
Background: A researcher finds that, before the pandemic, approximately 23.4% of working people lived in regional Australia. It is believed this proportion has increased following the COVID-19 pandemic.
Claims: There are claims that pandemic disruptions and restrictions were greatest in capital cities, while job growth was strongest in other (regional) areas.
National Data: The national annual wage for 2023 is estimated to be \$100,000.
Population of Interest: All working people in Australia.
Sample Characteristics: To test these claims, the researcher collects a random sample of $n = 1000$ individuals working in 2023.
Variables Recorded:
Region: Coded as 1 if the individual lives in regional Australia and 0 if they live in an urban area.
Wage: Annual wage recorded in dollars.
Summary Statistics and Data Description
The following table summarizes the data collected from the sample of $n = 1000$:
| Statistic | Wage_All | Wage_Urban | Wage_Regional | No_People_Regional |
|---|---|---|---|---|
| Mean | | | | |
| Median | | | | |
| Standard Deviation | | | | |
| Sample Variance | | | | |
| Range | | | | |
| Minimum | | | | |
| Maximum | | | | |
| n | 1000 | 730 | 270 | |
From Sampling Distribution to Estimation
Sample statistics serve as reliable and consistent estimators for population parameters:
Sample Mean ($\bar{X}$):
It is an unbiased estimator: $E(\bar{X}) = \mu$.
Variance: $Var(\bar{X}) = \sigma^2/n$.
Consistency: The variance converges toward zero as the sample size ($n$) grows.
Sample Proportion ($\hat{p}$):
It is an unbiased estimator: $E(\hat{p}) = p$.
Variance: $Var(\hat{p}) = p(1 - p)/n$.
Consistency: The variance converges toward zero as the sample size ($n$) grows.
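The unbiasedness and consistency properties of the sample proportion can be illustrated with a small simulation. This is only a sketch: the population proportion of 0.234 is taken from the case study, while the number of replications, the seed, and the helper name `simulate_p_hats` are arbitrary choices made for illustration.

```python
import random
import statistics

random.seed(7)  # fixed seed so repeated runs give the same numbers

P_TRUE = 0.234  # assumed population proportion (pre-pandemic value in the case study)
REPS = 1000     # simulated samples per sample size (arbitrary choice)

def simulate_p_hats(n, reps=REPS, p=P_TRUE):
    """Draw `reps` samples of size n and return each sample's proportion."""
    return [sum(random.random() < p for _ in range(n)) / n for _ in range(reps)]

results = {}
for n in (100, 1000):
    p_hats = simulate_p_hats(n)
    results[n] = (statistics.mean(p_hats), statistics.pvariance(p_hats))
    theory = P_TRUE * (1 - P_TRUE) / n  # p(1 - p)/n
    print(f"n={n}: mean of p-hat={results[n][0]:.4f}, "
          f"simulated variance={results[n][1]:.6f}, theoretical variance={theory:.6f}")
```

The average of the simulated proportions should sit close to 0.234 for both sample sizes, and the variance for n = 1000 should be roughly one tenth of the variance for n = 100.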
Steps of Hypothesis Testing
State the Hypotheses: Define the Null ($H_0$) and Alternative ($H_A$) hypotheses about a population parameter (not the sample statistic), based on the research claim.
Specify the Decision Rule: Define the criteria for rejecting the null hypothesis, including the significance level ($\alpha$) and the critical value or p-value.
Calculate the Test Statistic: Compare the sample estimate with the hypothesized value. Identify the appropriate distribution (e.g., $t_{n-1}$).
Apply the Decision Rule: Based on the evidence and the threshold, decide to either reject or not reject the null hypothesis.
Make the Decision and Draw Conclusions: Interpret the statistical result within the context of the real-world problem.
Hypothesis Testing: Population Proportions
Step 1: Set up the Hypothesis
Null Hypothesis ($H_0$): The statement assumed true. For proportion $p$, it is typically set as equality to a value.
$H_0: p = 0.234$ (The proportion is the same as the pre-pandemic level).
Alternative Hypothesis ($H_A$): The statement considered if $H_0$ is rejected.
Upper-tail Test: $H_A: p > 0.234$ (Used to test if the proportion has increased).
Lower-tail Test: $H_A: p < 0.234$ (Used to test if the proportion has decreased).
Two-tail Test: $H_A: p \neq 0.234$ (Used to test if the proportion is different).
Step 2: Decision Rule (Critical Value Approach)
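Significance Level ($\alpha$): A predetermined threshold (e.g., $\alpha = 0.05$).
Example calculation: For $\alpha = 0.05$ and $n = 1000$, the upper-tail critical value in Excel is `T.INV(0.95, 999)`, which is about 1.646. With 999 degrees of freedom this is practically the standard normal value of 1.645, the figure used in these notes.
Decision Rule: Reject $H_0$ if the test statistic > 1.645.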
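The critical value can be cross-checked outside Excel. The sketch below assumes $\alpha = 0.05$ and $n = 1000$ (so 999 degrees of freedom) and uses only the Python standard library. The helpers `t_pdf`, `t_cdf`, and `t_quantile` are illustrative names written for this example, not library functions; they integrate the t density numerically and invert it by bisection.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution (log-gamma avoids overflow for large df)."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_quantile(prob, df, lo=-20.0, hi=20.0):
    """Invert t_cdf by bisection; plays the role of Excel's T.INV(prob, df)."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

alpha, n = 0.05, 1000
critical_value = t_quantile(1 - alpha, n - 1)
print(f"upper-tail critical value: {critical_value:.4f}")
```

The printed value should be close to 1.646, which differs from the rounded 1.645 only in the third decimal place and does not change the decision in this example.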
Step 3: Calculate the Test Statistic
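The test statistic for a population proportion is:
$$t = \frac{\hat{p} - p_0}{\sqrt{\hat{p}(1 - \hat{p})/n}}$$
Where:
$\hat{p}$ is the sample proportion ($270/1000 = 0.27$).
$p_0$ is the hypothesized proportion ($0.234$).
The standard error is $\sqrt{\hat{p}(1 - \hat{p})/n} = \sqrt{0.27 \times 0.73/1000} \approx 0.01404$.
Calculation: $t = (0.27 - 0.234)/0.01404 \approx 2.564$.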
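The arithmetic behind the test statistic can be reproduced in a few lines. This sketch assumes that 270 of the 1,000 sampled workers live in regional Australia (the regional sample size implied by the 269 degrees of freedom used later in these notes) and that the standard error is built from the sample proportion; both are inferences from the surrounding numbers rather than stated inputs.

```python
import math

n = 1000        # sample size
regional = 270  # assumed count of sampled workers living in regional Australia
p0 = 0.234      # hypothesized (pre-pandemic) proportion

p_hat = regional / n                            # sample proportion
std_error = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error based on p-hat
t_stat = (p_hat - p0) / std_error               # test statistic

print(f"p_hat = {p_hat:.3f}, standard error = {std_error:.5f}, t = {t_stat:.3f}")
```

Under these assumptions the statistic rounds to 2.564, the value used in the decision step.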
Step 4 & 5: Decision and Conclusion
Since 2.564 > 1.645, the test statistic falls in the rejection region.
Decision: Reject $H_0$.
Conclusion: At a 5% significance level, the sample provides sufficient evidence that the proportion of the population living in regional areas has increased from the pre-COVID proportion of 0.234.
Hypothesis Testing Using the P-value
P-value Definition: The p-value is the probability, computed assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one obtained from the sample.
Decision Rule: Reject $H_0$ if $p\text{-value} < \alpha$.
Calculation (Upper-tail): $p\text{-value} = P(t_{n-1} > t)$.
Example Calculation: Using the previous proportion test ($t = 2.564$ and $n = 1000$):
Excel: `= 1 - T.DIST(2.564, 999, TRUE)` = 0.0052.
Since 0.0052 < 0.05, the decision is to Reject $H_0$.
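The upper-tail p-value can also be approximated without Excel. The sketch below uses the test statistic 2.564 and 999 degrees of freedom from the example, and integrates the t density numerically with the Python standard library. `t_pdf` and `t_cdf` are illustrative helper names, not library functions.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution (log-gamma avoids overflow for large df)."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

t_stat, df, alpha = 2.564, 999, 0.05
p_value = 1 - t_cdf(t_stat, df)  # upper tail: P(T > t)
reject = p_value < alpha

print(f"p-value = {p_value:.4f}, reject H0: {reject}")
```

The result should agree with the Excel figure of 0.0052 to about four decimal places, leading to the same decision to reject the null hypothesis.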
Hypothesis Testing: Population Mean (Two-tailed Test)
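Scenario: Is the average wage in urban areas different from \$100,000?
Hypotheses: $H_0: \mu = 100,000$ versus $H_A: \mu \neq 100,000$, where $\mu$ is the mean annual wage of urban workers.
Parameters: $\alpha = 0.05$, $n = 730$.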
Test Statistic and Distribution
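Formula: $t = \frac{\bar{x} - \mu_0}{se(\bar{x})}$ where $\mu_0$ is the hypothesized mean and $se(\bar{x}) = s/\sqrt{n}$.
Sample data for urban workers: $\bar{x}$ and $s$ (the mean and standard deviation of Wage_Urban in the summary table), $n = 730$.
Calculation: $t = 0.452$.
Distribution: $t$ follows $t_{729}$.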
Decision Rule and Decision
Critical Value Approach: For $\alpha = 0.05$, the critical values are $\pm 1.963$. Reject $H_0$ if $|t| > 1.963$.
P-value Approach: $p\text{-value} = 2 \times P(t_{729} > 0.452) = 0.6514$ (calculated via Excel `T.DIST.2T(ABS(0.452), 729)`).
Result: $0.452 < 1.963$, and $0.6514 > 0.05$.
Decision: Do Not Reject $H_0$.
Conclusion: At a 5% significance level, there is insufficient evidence to conclude the average wage in urban areas is different from \$100,000.
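For a two-tailed test the p-value doubles the tail area beyond $|t|$. The sketch below reproduces that step for the urban wage example, taking $t = 0.452$ and 729 degrees of freedom from the notes. `t_pdf`, `t_cdf`, and `two_tailed_p` are illustrative helpers written for this example; the last one mirrors what Excel's `T.DIST.2T` returns.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution (log-gamma avoids overflow for large df)."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def two_tailed_p(t_stat, df):
    """Two-tailed p-value: twice the area beyond |t| in one tail."""
    return 2 * (1 - t_cdf(abs(t_stat), df))

t_stat, df, alpha = 0.452, 729, 0.05
p_value = two_tailed_p(t_stat, df)
reject = p_value < alpha

print(f"two-tailed p-value = {p_value:.4f}, reject H0: {reject}")
```

The p-value should come out near 0.6514, well above 0.05, so the null hypothesis is not rejected.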
Hypothesis Testing: Population Mean (Lower-tail Test)
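Scenario: Is the average wage in regional areas less than \$100,000?
Hypotheses:
$H_0: \mu_1 = 100,000$
$H_A: \mu_1 < 100,000$
Parameters: $\alpha = 0.05$, $n = 270$, $df = 269$.
Test Statistic Calculation: $t = \frac{\bar{x} - 100,000}{s/\sqrt{270}} = -6.55$, using the mean and standard deviation of Wage_Regional.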
Decision Rule and Decision
Critical Value: For $\alpha = 0.05$ and $df = 269$, Critical Value = -1.6505 (using Excel `T.INV(0.05, 269)`).
Rule: Reject $H_0$ if $t < -1.6505$.
Decision: Since $-6.55 < -1.6505$, Reject $H_0$.
P-value: $P(t_{269} < -6.55) \approx 0.0000$, which is less than $\alpha = 0.05$.
Conclusion: At a 5% significance level, there is sufficient evidence that the mean wage of working people in regional areas is less than \$100,000.
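The three worked examples differ only in which tail the rejection region sits in. The sketch below collects the critical-value decision rule for upper-tail, lower-tail, and two-tail tests in one place, using the test statistics and critical values quoted in these notes. The function name `reject_null` and the tail labels are illustrative choices.

```python
def reject_null(t_stat, critical, tail):
    """Apply the critical-value decision rule for the given tail type."""
    if tail == "upper":
        return t_stat > critical            # rejection region in the right tail
    if tail == "lower":
        return t_stat < critical            # rejection region in the left tail
    if tail == "two":
        return abs(t_stat) > abs(critical)  # rejection region in both tails
    raise ValueError(f"unknown tail type: {tail}")

# (label, test statistic, critical value, tail) as quoted in the notes
examples = [
    ("regional proportion", 2.564, 1.645, "upper"),
    ("urban mean wage", 0.452, 1.963, "two"),
    ("regional mean wage", -6.55, -1.6505, "lower"),
]

decisions = {}
for label, t_stat, critical, tail in examples:
    decisions[label] = reject_null(t_stat, critical, tail)
    verdict = "reject H0" if decisions[label] else "do not reject H0"
    print(f"{label}: t = {t_stat}, critical = {critical} ({tail}-tail) -> {verdict}")
```

The decisions match the ones reached above: reject for the regional proportion, do not reject for the urban mean wage, and reject for the regional mean wage.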