Week 2: Significance Tests
Housekeeping
- No lecture on Thursday 4/10.
- Attend lab/discussion sections this week and submit lab assignment via Canvas by 4/14.
- Homework assignment #1 is due Monday 4/14 at 11:59pm PST; it covers material from today's content through significance tests for a proportion.
- Submissions are accepted as a Word doc, PDF, or image file (no '.pages' files).
- Show your work to be eligible for partial credit; deductions may apply for correct answers without work shown.
Review from Chapter 6: Significance Tests
Distinction Between Estimation and Significance Testing
- Estimation of Population Parameters: Uses sample data to estimate a population parameter (mean, proportion).
- Asks what values of a population parameter are plausible based on sample statistic.
- Significance Testing: Summarizes evidence about a hypothesis using sample data.
- Questions if sample values agree with a prediction about the population.
What is a Hypothesis?
- A hypothesis is a statement about a population, often predicting that a parameter takes a particular numerical value or range.
- Examples:
- American adults work an average of 40 hours per week.
- People with higher socioeconomic status have a lower risk of chronic illness than people with lower socioeconomic status.
- Spending on others increases happiness more than spending on oneself.
Conducting a Significance Test
- Evaluates hypotheses by comparing sample point estimates to values predicted by the hypothesis.
- Asks if the observed sample data would be unusual if the hypothesis were true.
Five Parts of a Significance Test
- Assumptions
- Hypotheses
- Test statistic
- P-value
- Conclusion
1. Assumptions
- Type of Data: Quantitative or categorical.
- Sampling Method: Generally assumes data from randomization.
- Population Distribution: Normal or binary.
- Sample Size: Must be adequate to support the test.
2. Hypotheses
- Null Hypothesis (H0): Assumes population parameter takes specific value (usually indicates no effect).
- Alternative Hypothesis (Ha): Indicates population parameter differs from a set value or falls into a range.
- Tests whether sample data contradict H0 using a proof by contradiction approach.
Directional vs. Non-Directional Hypotheses
- Directional Hypothesis: Indicates a specific direction (e.g., Americans work more than 40 hours on average).
- Non-Directional Hypothesis: Makes no directional claim (e.g., mean work hours not equal to 40).
3. Test Statistic
- Compares the sample estimate to the parameter value predicted by H0, expressing the distance between them as a number of standard errors.
4. P-Value
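- The probability, assuming H0 is true, of observing a test statistic at least as extreme as the one actually observed; it quantifies how unusual the data are relative to what H0 predicts.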
- Small P-values indicate strong evidence against H0.
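As a rough illustration of how a test statistic turns into a P-value, here is a minimal sketch that assumes the test statistic follows a standard normal distribution under H0 (a z statistic). The function name `p_value_from_z` and the `alternative` labels are hypothetical and not part of the course material.

```python
from statistics import NormalDist


def p_value_from_z(z, alternative="two-sided"):
    """Tail probability of a z statistic under H0 (standard normal assumed)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "two-sided":
        # Probability of a value at least as far from 0 as z, in either tail.
        return 2 * (1 - std_normal.cdf(abs(z)))
    if alternative == "greater":
        # Right-tail probability, for Ha: parameter > null value.
        return 1 - std_normal.cdf(z)
    if alternative == "less":
        # Left-tail probability, for Ha: parameter < null value.
        return std_normal.cdf(z)
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")


# A statistic about 2 standard errors from the H0 value gives a small P-value.
print(round(p_value_from_z(1.96), 3))
```

The farther the statistic falls from 0 in the direction(s) named by Ha, the smaller the P-value.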
5. Conclusion
- Report P-value and decide if it’s below a pre-defined cutoff (like α = 0.05).
- Reject H0 if P-value ≤ α.
Errors and Correct Decisions
| Decision about H0 | H0 is true | H0 is false |
|---|---|---|
| Reject H0 | Type I error | Correct decision! |
| Do not reject H0 | Correct decision! | Type II error |
Type I Error Probability and α Level
- The chosen α (significance level) controls the Type I error rate.
- By convention, α is set at 0.05 or, in stricter scenarios, 0.01.
- Lowering α decreases the probability of a Type I error but, for a fixed sample size, increases the probability of a Type II error.
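The link between α and the Type I error rate can be checked with a small simulation. The sketch below assumes a simplified setting that is not from the lecture: samples are drawn from a normal population where H0 is exactly true, and the population standard deviation is treated as known so a z test can be used. All names and default values are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def simulate_type_i_rate(alpha=0.05, mu0=40.0, sigma=10.0, n=25,
                         n_sims=4000, seed=0):
    """Fraction of simulated samples that reject a true H0: mu = mu0."""
    rng = random.Random(seed)
    # Two-sided critical value for the chosen significance level.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / n ** 0.5  # standard error with sigma assumed known
    rejections = 0
    for _ in range(n_sims):
        # H0 is true here: the population mean really is mu0.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / se
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true H0 is a Type I error
    return rejections / n_sims


print(simulate_type_i_rate(alpha=0.05))
print(simulate_type_i_rate(alpha=0.01))
```

The rejection rate should land close to the chosen α, and it drops when α is lowered.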
Significance Test for a Mean
Assumptions
- Randomization
- Quantitative variable
- Normal population distribution (or a sufficiently large sample, roughly n ≥ 30, so that the Central Limit Theorem applies).
Hypotheses
- H0: µ = µ0 (e.g., µ0 = 40 hours/week)
- Ha:
- 2-sided: µ ≠ µ0
- 1-sided: µ < µ0 or µ > µ0
Test Statistic Calculation
- Formula: t = \frac{\bar{y} - \mu_0}{se}, where se = \frac{s}{\sqrt{n}}.
- Use the t-distribution (with df = n − 1) instead of z: estimating the population standard deviation with the sample standard deviation s adds extra error, especially in small samples.
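A minimal sketch of the t statistic calculation, using made-up weekly work hours to test H0: µ = 40. The data and the function name `one_sample_t` are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: mu = mu0."""
    n = len(sample)
    y_bar = mean(sample)   # sample mean
    s = stdev(sample)      # sample standard deviation (n - 1 denominator)
    se = s / sqrt(n)       # estimated standard error of the sample mean
    t = (y_bar - mu0) / se
    return t, n - 1


# Hypothetical sample of weekly work hours (not real data).
hours = [40, 44, 38, 46, 42, 42, 36, 48, 42]
t_stat, df = one_sample_t(hours, mu0=40)
print(f"t = {t_stat:.3f}, df = {df}")
```

Here the sample mean is 42 and s = √14 ≈ 3.74, so se ≈ 1.25 and the sample mean falls about 1.6 standard errors above the value predicted by H0.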
P-Value Interpretation and Conclusion
- For a non-directional (two-sided) hypothesis, the P-value is the two-tail probability, assuming H0 is true, of a test statistic at least as extreme as the one observed.
- The decision follows the same rule as before: reject H0 if P-value ≤ α; otherwise, do not reject H0.
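To show where a two-sided P-value for a t statistic comes from, the sketch below integrates the t density numerically with Simpson's rule, using only the standard library. In practice a statistics package or a t table would be used. The function names and the example values (t = 2.1, df = 24) are hypothetical.

```python
import math


def t_density(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef) * (1 + x * x / df) ** (-(df + 1) / 2)


def t_two_sided_p(t, df):
    """Two-tail probability of a value at least as extreme as t under H0."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    # Even number of Simpson intervals, keeping the step size at most 0.005.
    steps = 2 * max(1000, int(upper * 100))
    h = upper / steps
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area_zero_to_t = total * h / 3  # P(0 < T < |t|)
    # By symmetry, both tails together hold 1 - 2 * P(0 < T < |t|).
    return max(0.0, 1 - 2 * area_zero_to_t)


alpha = 0.05
p = t_two_sided_p(2.1, df=24)  # hypothetical test statistic and df
decision = "reject H0" if p <= alpha else "do not reject H0"
print(f"P-value = {p:.4f}: {decision}")
```

For a one-sided alternative, the P-value would instead be the single tail in the direction stated by Ha.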
Comparison of Two Groups (Chapter 7)
Overview of Comparison
- Involves analyzing differences between two means/proportions across two groups.
- Examples:
- Health outcomes based on college education status.
- Financial habits of children with/without savings accounts.
General Concept
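- Conduct inference about the difference between the two groups' means or proportions to judge whether an observed difference reflects a real population difference or only sampling variability.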
Comparing Two Means
- Investigate whether the population mean of the response differs between two groups.
- First identify the response (outcome) variable and the explanatory (independent) variable; the explanatory variable defines the two groups being compared.
Confidence Intervals for Two Means
- Objective: Find an interval of plausible values for the difference between two population means at a specified confidence level.
- Whether zero falls inside this interval indicates whether "no difference" between the population means is plausible.
Interpretation of Results
- Confidence intervals and significance tests yield complementary insights about any difference in population means.
- A CI that excludes zero suggests a statistically significant difference between the population means, while a CI that includes zero indicates that equal means are plausible.
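The check for zero inside the interval can be sketched as follows. This example assumes independent random samples, the standard error sqrt(s1²/n1 + s2²/n2), and a normal (z) critical value as a large-sample stand-in for the t critical value. The lecture does not state which formula the course uses, so treat these choices, the function name, and the summary statistics as hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def diff_means_ci(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """Approximate large-sample CI for (population mean 2 - population mean 1)."""
    # Standard error of the difference between two independent sample means.
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    # Normal critical value (about 1.96 for 95%), assumed adequate for large n.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = mean2 - mean1
    return diff - z * se, diff + z * se


# Hypothetical summary statistics for two groups of 100 people each.
lower, upper = diff_means_ci(mean1=50, sd1=10, n1=100,
                             mean2=53, sd2=10, n2=100)
includes_zero = lower <= 0 <= upper
print(f"95% CI: ({lower:.2f}, {upper:.2f}); includes zero: {includes_zero}")
```

With these numbers the interval lies entirely above zero, which matches rejecting H0: µ1 = µ2 in a two-sided test at α = 0.05.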