Introduction to Hypothesis Testing: Single Sample T-Test and Error Types

Transition from Previous Topics

We have previously discussed descriptive statistics and inferential statistics, specifically confidence intervals. We now move to our first hypothesis test, a t-test for a single sample, which is the simplest of many to come. By the end of the week, we will connect hypothesis testing (and this specific test) to confidence intervals and show that they are the same method.

Definition of Hypothesis Testing

Hypothesis testing is a statistical method designed for testing and substantiating claims.

Example: Machine Defect Rate
  • Claim: A machine's historical defect rate is 3 in 1000.

  • New Data: A new machine produces a sample with an average defect rate of 2.7 out of 1000.

  • Question: Is 2.7 evidence that the defect rate has decreased, or is it merely random sampling error and not significantly different from 3?

Two Complementary Hypotheses
Null Hypothesis ($H_0$)
  • Represents the initial value of the population mean based on previous experience or conventional wisdom (the status quo).

  • Assumed to be true unless strong evidence points to the contrary.

  • In the defect rate example: The current defect rate is equal to the historical defect rate of 3 in 1000.

  • Court Case Analogy: The person is innocent. This is the operating principle of the legal system, assuming innocence until proven guilty.

Alternative Hypothesis ($H_1$)
  • The opposite of the null hypothesis and what we are trying to substantiate.

  • In the defect rate example: The current defect rate is not equal to the historical defect rate of 3 in 1000.

  • Court Case Analogy: The person is guilty.
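
In symbols, the two hypotheses for the defect rate example can be written as follows. The symbol $\mu$ is introduced here as an assumption of this sketch; it stands for the new machine's true mean defect rate per 1000 units.

```latex
\begin{aligned}
H_0 &: \mu = 3 \\
H_1 &: \mu \neq 3
\end{aligned}
```

The two statements are complementary: exactly one of them is true, and the data are used to decide whether the evidence against $H_0$ is strong enough to reject it.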

Language of Hypothesis Testing

When evaluating evidence, we either reject the null hypothesis (if there's enough evidence against it) or fail to reject the null hypothesis (if there isn't enough evidence to reject it).
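
The reject / fail-to-reject decision can be sketched for the defect rate example. This is a minimal illustration, not a prescribed procedure: the ten sample values are hypothetical (chosen only so that their mean is 2.7), the function name `one_sample_t` is illustrative, and the cutoff 2.262 is the two-sided 5% critical value of the t distribution with 9 degrees of freedom. A two-sided cutoff is used because the alternative hypothesis above is stated as "not equal".

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return the t statistic for H_0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return (mean - mu0) / (sd / math.sqrt(n))

# Hypothetical defect rates per 1000 units from ten production runs (mean 2.7).
sample = [2.1, 3.4, 2.9, 2.5, 3.0, 2.2, 2.8, 3.1, 2.6, 2.4]
mu0 = 3.0        # historical defect rate under H_0
t_crit = 2.262   # two-sided 5% critical value, t distribution, df = 9

t_stat = one_sample_t(sample, mu0)
decision = "reject H_0" if abs(t_stat) > t_crit else "fail to reject H_0"
print(f"t = {t_stat:.3f}, decision: {decision}")
```

Note that the code never "accepts" $H_0$: the only two outcomes are rejecting it or failing to reject it, which mirrors the language used above.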

Detailed Court Case Analogy

Scenario: Deciding if a person is innocent or guilty based on presented evidence, which is often inconclusive.

Two States of the World (Truth)
  1. The person is truly guilty.

  2. The person is truly innocent.

Two Decision Outcomes (Our Action)
  1. Convict the person.

  2. Acquit the person.

Hypotheses in the Analogy
  • $H_0$: The person is innocent (our default assumption).

  • $H_1$: The person is guilty (what we seek to substantiate).

Hypothetical Guilt Scale

A rating from 0 (completely innocent) to 100 (completely guilty).

Sampling Distribution of Mean Guilt Ratings
  • Imagine many juries (each with 12 members) evaluating the exact same case.

  • Each jury provides an average guilt likelihood.

  • The distribution of these jury averages is a sampling distribution of the mean guilt ratings.

  • Case 1: Person is Actually Innocent ($H_0$ is true):

    • We would expect most juries to give low average guilt ratings.

    • The sampling distribution would be concentrated at lower scores, though some juries might give higher ratings by chance.

  • Case 2: Person is Actually Guilty ($H_0$ is false):

    • We would expect most juries to give high average guilt ratings.

    • The sampling distribution would be concentrated at higher scores, though some juries might give lower ratings by chance.
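
The two cases above can be simulated to see what such sampling distributions might look like. This is only a sketch under invented assumptions: individual juror ratings are drawn from a normal distribution with standard deviation 20, centered at 30 for an innocent person and 70 for a guilty person, and clipped to the 0 to 100 scale. None of these numbers come from the notes, and `jury_means` is a hypothetical helper.

```python
import random
import statistics

def jury_means(true_mean, n_juries=2000, jury_size=12, sd=20.0, seed=0):
    """Simulate the average guilt rating of many independent juries."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_juries):
        # Each juror's rating is drawn around true_mean and clipped to 0-100.
        ratings = [min(100.0, max(0.0, rng.gauss(true_mean, sd)))
                   for _ in range(jury_size)]
        means.append(statistics.mean(ratings))
    return means

innocent = jury_means(true_mean=30.0, seed=1)  # H_0 true (assumed juror mean 30)
guilty = jury_means(true_mean=70.0, seed=2)    # H_0 false (assumed juror mean 70)

for label, means in (("innocent", innocent), ("guilty", guilty)):
    print(f"{label}: center {statistics.mean(means):.1f}, "
          f"range {min(means):.1f} to {max(means):.1f}")
```

Each list holds one average per jury, so a histogram of `innocent` or `guilty` is an empirical sampling distribution of the mean guilt rating.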

The Decision Problem

The sampling distributions for innocent and guilty people will overlap, meaning some innocent people might look guilty, and vice versa. A decision criterion (threshold) is needed to determine conviction or acquittal.

Four Possible Outcomes and Errors in Hypothesis Testing
Table Summary

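| State of the World | Decision: Retain $H_0$ (Acquit)          | Decision: Reject $H_0$ (Convict)                         |
|--------------------|------------------------------------------|----------------------------------------------------------|
| Person is Innocent | Correct Decision: Innocent & Acquit      | Type I Error ($\alpha$): Innocent & Convict              |
| Person is Guilty   | Type II Error ($\beta$): Guilty & Acquit | Correct Decision: Guilty & Convict (Power, $1 - \beta$)  |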
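
The four cells of the table can also be written as a small lookup. This is an illustrative sketch; `classify_outcome` is a hypothetical helper, not part of any library.

```python
def classify_outcome(h0_true: bool, reject_h0: bool) -> str:
    """Name the cell of the decision table for a given truth and decision."""
    if h0_true:
        return "Type I error" if reject_h0 else "Correct decision (true negative)"
    return "Correct decision (true positive)" if reject_h0 else "Type II error"

# Court-case reading: h0_true means the person is innocent,
# reject_h0 means the jury convicts.
for h0_true in (True, False):
    for reject_h0 in (False, True):
        outcome = classify_outcome(h0_true, reject_h0)
        print(f"H_0 true={h0_true!s:5}  reject={reject_h0!s:5}  ->  {outcome}")
```

In practice the first argument is never known, which is why the error rates are described as probabilities rather than observed for any single decision.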

Type I Error ($\alpha$)
  • Occurs when we reject the null hypothesis ($H_0$) when it is actually true.

  • In court case: Convicting an innocent person.

  • Consequence: An innocent person is wrongly imprisoned.

Type II Error ($\beta$)
  • Occurs when we fail to reject the null hypothesis ($H_0$) when it is actually false.

  • In court case: Acquitting a guilty person.

  • Consequence: A guilty person walks free.

Correct Decisions
  • True Negative: Retaining $H_0$ when $H_0$ is true (Acquitting an innocent person).

  • True Positive (Power of the Test): Rejecting $H_0$ when $H_0$ is false (Convicting a guilty person). Denoted as $1 - \beta$. The power of the test is the probability of correctly rejecting a false null hypothesis.
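
Written as conditional probabilities, the error rates and the power defined above are:

```latex
\begin{aligned}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ is true}) \\
\beta &= P(\text{fail to reject } H_0 \mid H_0 \text{ is false}) \\
1 - \beta &= P(\text{reject } H_0 \mid H_0 \text{ is false})
\end{aligned}
```

Note that $\alpha$ and $\beta$ are conditioned on different states of the world, so they do not have to sum to 1.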

Graphical Representation of Errors and Trade-offs
  • Two overlapping sampling distributions are plotted on a single graph. One distribution represents the null hypothesis ($H_0$) and the other represents the alternative hypothesis ($H_1$). The decision criterion (critical value) is a point on the x-axis that separates the rejection region from the non-rejection region.

  • The area under the $H_0$ distribution that falls into the rejection region is the probability of a Type I Error ($\alpha$).

  • The area under the $H_1$ distribution that falls into the non-rejection region is the probability of a Type II Error ($\beta$).

  • The area under the $H_1$ distribution that falls into the rejection region is the Power ($1 - \beta$) of the test.
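
The three areas can be computed directly once the two sampling distributions are specified. The sketch below assumes, purely for illustration, that the jury mean is normally distributed with mean 30 under $H_0$ and mean 50 under $H_1$, both with standard deviation 6, and that $H_0$ is rejected when the mean exceeds the critical value. These numbers and the helper `error_rates` are assumptions of the example, not values from the notes.

```python
from statistics import NormalDist

# Assumed sampling distributions of the jury mean (illustrative numbers only).
h0_dist = NormalDist(mu=30.0, sigma=6.0)  # person is innocent (H_0 true)
h1_dist = NormalDist(mu=50.0, sigma=6.0)  # person is guilty (H_0 false)

def error_rates(critical_value):
    """Return (alpha, beta, power) for 'reject H_0 if mean > critical_value'."""
    alpha = 1.0 - h0_dist.cdf(critical_value)  # H_0 area in the rejection region
    beta = h1_dist.cdf(critical_value)         # H_1 area in the non-rejection region
    return alpha, beta, 1.0 - beta

for c in (36.0, 40.0, 44.0):
    alpha, beta, power = error_rates(c)
    print(f"critical value {c:.0f}: "
          f"alpha={alpha:.3f}, beta={beta:.3f}, power={power:.3f}")
```

Moving the critical value to the right shrinks $\alpha$ but enlarges $\beta$, and moving it to the left does the reverse. With the distributions held fixed, one error rate can only be reduced at the expense of the other.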