Week 2: Significance Tests

Housekeeping

  • No lecture on Thursday 4/10.
  • Attend lab/discussion sections this week and submit lab assignment via Canvas by 4/14.
  • Homework assignment #1 is due Monday 4/14 at 11:59pm PST; it covers material from today's lecture through significance tests for proportions.
  • Submissions accepted as Word doc, pdf, or image file (no '.page' files).
  • Show your work for eligibility for partial credit; deductions may apply for correct answers without work shown.

Review from Chapter 6: Significance Tests

Distinction Between Estimation and Significance Testing

  • Estimation of Population Parameters: Uses sample data to estimate a population parameter (mean, proportion).
    • Asks what values of a population parameter are plausible based on sample statistic.
  • Significance Testing: Summarizes evidence about a hypothesis using sample data.
    • Questions if sample values agree with a prediction about the population.

What is a Hypothesis?

  • A hypothesis is a statement about a population, often predicting that a parameter takes a particular numerical value or range.
  • Examples:
    • American adults work an average of 40 hours per week.
    • People of higher socioeconomic status have a lower risk of chronic illness than people of lower status.
    • Spending on others increases happiness more than spending on oneself.

Conducting a Significance Test

  • Evaluates hypotheses by comparing sample point estimates to values predicted by the hypothesis.
  • Asks if the observed sample data would be unusual if the hypothesis were true.

Five Parts of a Significance Test

  1. Assumptions
  2. Hypotheses
  3. Test statistic
  4. P-value
  5. Conclusion

1. Assumptions

  • Type of Data: Quantitative or categorical.
  • Sampling Method: Generally assumes the data were obtained through randomization (e.g., random sampling).
  • Population Distribution: Normal or binary.
  • Sample Size: Must be adequate to support the test.

2. Hypotheses

  • Null Hypothesis (H0): Assumes population parameter takes specific value (usually indicates no effect).
  • Alternative Hypothesis (Ha): Indicates population parameter differs from a set value or falls into a range.
    • Tests whether sample data contradict H0 using a proof by contradiction approach.

Directional vs. Non-Directional Hypotheses

  • Directional Hypothesis: Indicates a specific direction (e.g., Americans work more than 40 hours on average).
  • Non-Directional Hypothesis: Makes no directional claim (e.g., mean work hours not equal to 40).

3. Test Statistic

  • Compares the sample estimate to the parameter value predicted by H0, measuring the distance between them in standard errors.

4. P-Value

  • Quantifies how unusual the observed test statistic is relative to what H0 predicts.
  • Small P-values indicate strong evidence against H0.

5. Conclusion

  • Report the P-value and compare it to a pre-defined cutoff (such as α = 0.05).
  • Reject H0 if P-value ≤ α.

Errors and Correct Decisions

Decision about H0  | H0 is true        | H0 is false
Reject H0          | Type I error      | Correct decision!
Do not reject H0   | Correct decision! | Type II error

Type I Error Probability and α Level

  • The chosen α (significance level) controls the Type I error rate.
  • By convention, α is set at 0.05 or, in stricter scenarios, 0.01.
  • Lowering α decreases the probability of a Type I error but increases the probability of a Type II error.
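The link between α and the Type I error rate can be checked with a small simulation. The sketch below is illustrative only: it draws many samples from a population in which H0 is actually true and counts how often a two-sided test rejects. To stay within the Python standard library it uses a z-test with the population standard deviation treated as known, which is an assumption made for this example; the population values (mean 40, standard deviation 10) and the function name are hypothetical.

```python
import random
from statistics import NormalDist, fmean

def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Estimate how often a two-sided z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    mu0, sigma = 40.0, 10.0      # hypothetical population; H0: mu = 40 is true here
    se = sigma / n ** 0.5        # standard error, sigma treated as known (assumption)
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / se
        p_value = 2 * (1 - std_normal.cdf(abs(z)))   # two-tail probability
        if p_value <= alpha:                         # every rejection is a Type I error
            rejections += 1
    return rejections / trials

rate_05 = type_i_error_rate(alpha=0.05)
rate_01 = type_i_error_rate(alpha=0.01)
print(f"alpha = 0.05: rejected a true H0 in {rate_05:.1%} of samples")
print(f"alpha = 0.01: rejected a true H0 in {rate_01:.1%} of samples")
```

The observed rejection rates should land close to the chosen α values, and the stricter α = 0.01 rejects a true H0 less often than α = 0.05.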

Significance Test for a Mean

Assumptions

  • Randomization
  • Quantitative variable
  • Normal population distribution (or an adequate sample size, roughly n ≥ 30, so that the Central Limit Theorem applies).

Hypotheses for Mean Tests

  • H0: µ = µ0 (e.g., µ0 = 40 hours/week)
  • Ha:
    • 2-sided: µ ≠ µ0
    • 1-sided: µ < µ0 or µ > µ0

Test Statistic Calculation

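  • Formula: t = \frac{\bar{y} - \mu_0}{se}, where se = \frac{s}{\sqrt{n}} is the standard error of the sample mean.
  • Use the t-distribution instead of z to account for the extra estimation error introduced by using the sample standard deviation s in place of the unknown population standard deviation; this matters most for small samples.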
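As a rough illustration of the calculation, the sketch below computes the t statistic for H0: µ = 40 using only the Python standard library. The weekly-hours data and the function name `one_sample_t` are made up for this example.

```python
from math import sqrt
from statistics import fmean, stdev

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: mu = mu0."""
    n = len(sample)
    y_bar = fmean(sample)        # sample mean
    s = stdev(sample)            # sample standard deviation (n - 1 denominator)
    se = s / sqrt(n)             # standard error of the sample mean
    return (y_bar - mu0) / se, n - 1

# Hypothetical weekly work hours for 10 respondents (illustrative data only).
hours = [38, 42, 45, 40, 36, 50, 44, 41, 39, 47]
t_stat, df = one_sample_t(hours, mu0=40)
print(f"t = {t_stat:.2f} with df = {df}")
```

For this made-up sample the mean is 42.2 hours, which sits about 1.6 standard errors above the null value of 40.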

P-Value Interpretation and Conclusion

  • For a non-directional (two-sided) alternative, the P-value is the two-tail probability of a test statistic at least as extreme as the one observed, assuming H0 is true.
  • Decision-making follows the same principle as outlined previously regarding P-values.
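A two-tail P-value can be sketched without a statistics package by numerically integrating the t density. The code below is a minimal illustration, not a recommended production method: the helper names, the use of Simpson's rule, and the example values (t = 1.61, df = 9, α = 0.05) are all assumptions chosen for this example. In practice a t table or statistical software would give the same answer.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_stat, df, steps=10_000):
    """P(|T| >= |t_stat|) under H0, using Simpson's rule (steps must be even)."""
    a = abs(t_stat)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * (h / 3) * total    # P(-a < T < a), doubled by symmetry
    return 1 - central               # what remains is the two tails

# Hypothetical test statistic and degrees of freedom (illustrative values).
p_value = two_sided_p_value(1.61, df=9)
alpha = 0.05
decision = "reject H0" if p_value <= alpha else "do not reject H0"
print(f"P-value = {p_value:.3f}: {decision} at alpha = {alpha}")
```

With these example values the P-value falls between 0.10 and 0.20, so H0 would not be rejected at α = 0.05. Note that `math.gamma` overflows for very large df, so this sketch is only suitable for small to moderate samples.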

Comparison of Two Groups (Chapter 7)

Overview of Comparison

  • Involves analyzing the difference between two means or two proportions across two groups.
  • Examples:
    • Health outcomes based on college education status.
    • Financial habits of children with/without savings accounts.

General Concept

  • Conduct inference about the difference between the two groups to judge whether an observed difference in outcomes reflects a real population difference or only sampling variability.

Comparing Two Means

  • Investigate whether the population mean differs between the two groups:
    • First identify the response (outcome) variable and the explanatory (independent) variable that defines the groups.

Confidence Intervals for Two Means

  • Objective: Find an interval of plausible values for the difference between two population means at a specified confidence level.
  • Whether zero falls inside this interval indicates whether equal population means (a difference of zero) are plausible.

Interpretation of Results

  • Confidence intervals and significance tests yield complementary insights about any difference in population means.
    • A CI that excludes zero suggests a significant difference, while a CI that includes zero indicates that equal means are plausible in the population.
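The check for zero can be sketched in a few lines of Python. This example is hedged in several ways: the two groups are made-up data, the function name `diff_means_ci` is hypothetical, and the interval uses a z critical value from the standard normal distribution as a large-sample approximation. With samples this small a t critical value would give a somewhat wider interval. The standard error is the usual sqrt(s1²/n1 + s2²/n2) for two independent samples.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev

def diff_means_ci(group1, group2, confidence=0.95):
    """Approximate CI for (mean of group2 - mean of group1), independent samples."""
    diff = fmean(group2) - fmean(group1)
    se = sqrt(stdev(group1) ** 2 / len(group1) + stdev(group2) ** 2 / len(group2))
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    return diff - z * se, diff + z * se

# Hypothetical outcome scores for two groups (illustrative data only).
group_a = [3, 5, 4, 6, 5, 4, 5, 6]
group_b = [6, 7, 5, 8, 7, 6, 8, 7]
lower, upper = diff_means_ci(group_a, group_b)
contains_zero = lower <= 0 <= upper
print(f"95% CI for the difference in means: ({lower:.2f}, {upper:.2f})")
print("Equal means plausible" if contains_zero else "Interval excludes zero")
```

For this made-up data the interval runs from roughly 0.99 to 3.01, so it excludes zero, which agrees with what a two-sided significance test at α = 0.05 would conclude under the same approximation.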