Introduction to Hypothesis Testing: Single Sample T-Test and Error Types

Transition from Previous Topics

So far we have covered descriptive statistics and one form of inferential statistics, confidence intervals. We now turn to the first hypothesis test: the single-sample t-test, the simplest of many to come. By the end of the week, we will connect hypothesis testing (and this test in particular) to confidence intervals and show that they are the same method.

Definition of Hypothesis Testing

Hypothesis testing is a statistical method for deciding whether sample data provide enough evidence to substantiate a claim about a population.

Example: Machine Defect Rate

Claim: A machine's historical defect rate is $3$ in $1000$.
New Data: A new machine produces a sample with an average defect rate of $2.7$ out of $1000$.
Question: Is $2.7$ evidence that the defect rate has decreased, or is it merely random sampling error and not significantly different from $3$?
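
One way to make this question concrete is to compute a one-sample t statistic, $t = (\bar{x} - \mu_0) / (s / \sqrt{n})$. The sketch below is illustrative only. The ten batch counts are hypothetical numbers, invented so that their mean is $2.7$ (the notes give no raw data). The critical value $2.262$ is the usual two-tailed cutoff at the $0.05$ level for $9$ degrees of freedom. Treating per-batch defect counts as roughly normal is also an assumption.

```python
import math
import statistics

# Hypothetical defects per 1000 units in ten batches (invented for illustration;
# chosen so that the sample mean equals 2.7).
defects_per_1000 = [3, 2, 4, 2, 3, 1, 3, 4, 2, 3]

mu_0 = 3.0  # historical defect rate per 1000 (the value stated by H0)

n = len(defects_per_1000)
sample_mean = statistics.mean(defects_per_1000)
sample_sd = statistics.stdev(defects_per_1000)  # sample sd, n - 1 in the denominator
standard_error = sample_sd / math.sqrt(n)

# One-sample t statistic: distance of the sample mean from mu_0 in standard errors.
t_stat = (sample_mean - mu_0) / standard_error

# Two-tailed critical value for a 0.05 significance level and df = n - 1 = 9
# (taken from a t table).
t_crit = 2.262

reject_h0 = abs(t_stat) > t_crit
print(f"mean = {sample_mean:.2f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

With these made-up counts the statistic is about $-1$, well inside $\pm 2.262$, so a sample mean of $2.7$ would be treated as consistent with sampling error around $3$. Different data with the same mean could lead to the opposite conclusion, because the result depends on the sample's spread and size.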

Two Complementary Hypotheses

Null Hypothesis ( $H_0$ )

Represents the initial value of the population mean based on previous experience or conventional wisdom (the status quo).
Assumed to be true unless strong evidence points to the contrary.
In the defect rate example: The current defect rate is equal to the historical defect rate of $3$ .
Court Case Analogy: The person is innocent. This is the operating principle of the legal system, assuming innocence until proven guilty.

Alternative Hypothesis ( $H_1$ )

The complement of the null hypothesis; it is the claim we are trying to substantiate.
In the defect rate example: The current defect rate is not equal to the historical defect rate of $3$ .
Court Case Analogy: The person is guilty.

Language of Hypothesis Testing

When evaluating evidence, we either reject the null hypothesis (if there's enough evidence against it) or fail to reject the null hypothesis (if there isn't enough evidence to reject it).

Detailed Court Case Analogy

Scenario: Deciding if a person is innocent or guilty based on presented evidence, which is often inconclusive.

Two States of the World (Truth)

The person is truly guilty.
The person is truly innocent.

Two Decision Outcomes (Our Action)

Convict the person.
Acquit the person.

Hypotheses in the Analogy

$H_0$ : The person is innocent (our default assumption).
$H_1$ : The person is guilty (what we seek to substantiate).

Hypothetical Guilt Scale

A rating from $0$ (completely innocent) to $100$ (completely guilty).

Sampling Distribution of Mean Guilt Ratings

Imagine many juries (each with $12$ members) evaluating the exact same case.
Each jury provides an average guilt likelihood.
The distribution of these jury averages is a sampling distribution of the mean guilt ratings.
Case 1: Person is Actually Innocent ( $H_0$ is true):
- We would expect most juries to give low average guilt ratings.
- The sampling distribution would be concentrated at lower scores, though some juries might, by chance, give higher ratings.
Case 2: Person is Actually Guilty ( $H_0$ is false):
- We would expect most juries to give high average guilt ratings.
- The sampling distribution would be concentrated at higher scores, though some juries might, by chance, give lower ratings.
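
The two cases can be simulated to see what such sampling distributions might look like. This is a minimal sketch under assumed numbers: individual juror ratings are drawn from a normal distribution with standard deviation $20$, centered at $30$ for an innocent defendant and $70$ for a guilty one, and clipped to the $0$ to $100$ scale. None of these values come from the notes, and the function names are hypothetical.

```python
import random
import statistics

JURY_SIZE = 12
N_JURIES = 5000

def jury_mean(true_mean, sd, rng):
    """Average guilt rating of one jury, with each rating clipped to 0-100."""
    ratings = [min(100.0, max(0.0, rng.gauss(true_mean, sd)))
               for _ in range(JURY_SIZE)]
    return statistics.mean(ratings)

def sampling_distribution(true_mean, sd=20.0, seed=0):
    """Jury averages from N_JURIES independent juries hearing the same case."""
    rng = random.Random(seed)
    return [jury_mean(true_mean, sd, rng) for _ in range(N_JURIES)]

# Assumed centers for juror ratings (illustrative values, not from the notes).
innocent_means = sampling_distribution(true_mean=30.0, seed=1)
guilty_means = sampling_distribution(true_mean=70.0, seed=2)

for label, means in [("innocent", innocent_means), ("guilty", guilty_means)]:
    print(f"{label}: center = {statistics.mean(means):.1f}, "
          f"spread = {statistics.stdev(means):.1f}, "
          f"range = {min(means):.1f} to {max(means):.1f}")
```

The printed ranges show how far individual juries can stray from the center of their distribution purely by chance.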

The Decision Problem

The sampling distributions for innocent and guilty people will overlap, meaning some innocent people might look guilty, and vice versa. A decision criterion (threshold) is needed to determine conviction or acquittal.

Four Possible Outcomes and Errors in Hypothesis Testing

Table Summary

States of the World	Decision: Retain $H_0$ (Acquit)	Decision: Reject $H_0$ (Convict)
Person is Innocent	Correct Decision: Innocent & Acquit	Type I Error ( $\alpha$ ): Innocent & Convict
Person is Guilty	Type II Error ( $\beta$ ): Guilty & Acquit	Correct Decision: Guilty & Convict (Power $1-\beta$ )
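
The four cells of the table can be tallied in a small simulation. The sketch below assumes a conviction threshold of $50$ on the guilt scale, jury averages that are normally distributed around $40$ (innocent) or $60$ (guilty), and a juror-level standard deviation of $20$. All of these are illustrative choices rather than values from the notes.

```python
import math
import random
from collections import Counter

rng = random.Random(42)

THRESHOLD = 50.0                  # convict when the jury average exceeds this (assumed)
SE = 20.0 / math.sqrt(12)         # spread of a 12-member jury average (assumed juror sd = 20)
CENTERS = {"innocent": 40.0, "guilty": 60.0}  # assumed centers of the two distributions

outcomes = Counter()
for _ in range(10_000):
    truth = rng.choice(["innocent", "guilty"])         # state of the world
    jury_average = rng.gauss(CENTERS[truth], SE)       # evidence seen by the court
    decision = "convict" if jury_average > THRESHOLD else "acquit"
    outcomes[(truth, decision)] += 1

labels = {
    ("innocent", "acquit"): "correct decision",
    ("innocent", "convict"): "Type I error",
    ("guilty", "acquit"): "Type II error",
    ("guilty", "convict"): "correct decision (power)",
}
for key, label in labels.items():
    truth, decision = key
    print(f"{truth:8s} & {decision:7s}: {outcomes[key]:5d}  {label}")
```

Each simulated case lands in exactly one cell, and both kinds of error occur even though the threshold sits midway between the two centers.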

Type I Error ( $\alpha$ )

Occurs when we reject the null hypothesis ( $H_0$ ) when it is actually true.
In court case: Convicting an innocent person.
Consequence: An innocent person is wrongly imprisoned.

Type II Error ( $\beta$ )

Occurs when we fail to reject the null hypothesis ( $H_0$ ) when it is actually false.
In court case: Acquitting a guilty person.
Consequence: A guilty person walks free.

Correct Decisions

True Negative: Retaining $H_0$ when $H_0$ is true (Acquitting an innocent person).
True Positive (Power of the Test): Rejecting $H_0$ when $H_0$ is false (Convicting a guilty person). Denoted as $1 - \beta$. The power of the test is the probability of correctly rejecting a false null hypothesis.
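
In conditional-probability notation, these quantities can be summarized as follows (the standard definitions, stated here for reference):

$\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true})$
$\beta = P(\text{fail to reject } H_0 \mid H_0 \text{ is false})$
$\text{Power} = 1 - \beta = P(\text{reject } H_0 \mid H_0 \text{ is false})$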

Graphical Representation of Errors and Trade-offs

Two overlapping sampling distributions are plotted on a single graph. One distribution represents the null hypothesis ( $H_0$ ) and the other represents the alternative hypothesis ( $H_1$ ). The decision criterion (critical value) is a point on the x-axis that separates the rejection region from the non-rejection region.
The area under the $H_0$ distribution that falls into the rejection region represents the probability of a Type I Error ( $\alpha$ ).
The area under the $H_1$ distribution that falls into the non-rejection region represents the probability of a Type II Error ( $\beta$ ).
The area under the $H_1$ distribution that falls into the rejection region represents the Power ( $1-\beta$ ) of the test.
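
These three areas can be computed directly if both sampling distributions are assumed to be normal. The following sketch uses Python's standard-library `statistics.NormalDist`. The means ($40$ and $60$), the common standard deviation ($5$), and the candidate cutoffs are illustrative assumptions, and `error_rates` is a hypothetical helper name.

```python
from statistics import NormalDist

# Assumed sampling distributions of the jury average (illustrative values only).
h0_dist = NormalDist(mu=40.0, sigma=5.0)  # defendant innocent (H0 true)
h1_dist = NormalDist(mu=60.0, sigma=5.0)  # defendant guilty (H0 false)

def error_rates(critical_value):
    """Return (alpha, beta, power) when H0 is rejected above critical_value."""
    alpha = 1.0 - h0_dist.cdf(critical_value)  # H0 area in the rejection region
    beta = h1_dist.cdf(critical_value)         # H1 area in the non-rejection region
    return alpha, beta, 1.0 - beta

for critical_value in (45.0, 50.0, 55.0):
    alpha, beta, power = error_rates(critical_value)
    print(f"cutoff {critical_value:.0f}: alpha = {alpha:.4f}, "
          f"beta = {beta:.4f}, power = {power:.4f}")
```

Moving the cutoff to the right shrinks $\alpha$ but enlarges $\beta$ (and so reduces power), while moving it to the left does the reverse. This is the trade-off the graph is meant to show.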