Hypothesis Testing Notes

Introduction to Hypothesis Testing

Hypothesis testing is a commonly used inferential procedure that allows researchers to make claims about a population based on sample data.
- Definition: A statistical method that uses sample data to evaluate the validity of a hypothesis about a population parameter. It involves comparing observed data to what would be expected if a null hypothesis were true.

State a hypothesis about a population: Formulate a specific statement about a population parameter that you want to test.
Predict expected sample characteristics based on the hypothesis: Determine what the sample data should look like if the hypothesis is true. This often involves predicting the mean, standard deviation, or other relevant statistics.
Obtain a random sample from the population: Collect data from a representative sample of the population. Random sampling is crucial for ensuring that the sample is unbiased and that the results can be generalized to the population.
Compare the obtained sample data with the prediction made from the hypothesis:
- If consistent, the hypothesis is reasonable: If the sample data are similar to what was predicted, it supports the hypothesis.
- If discrepant, the hypothesis is rejected: If the sample data are significantly different from what was predicted, it suggests that the hypothesis is not correct.

Known population before treatment:
- μ = 80 (population mean before treatment)
- σ = 20 (population standard deviation before treatment)
Unknown population after treatment:
- μ = ? (population mean after treatment, which is unknown)
- σ = 20 (population standard deviation, assumed to remain constant)
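As a small illustration of what the hypothesis predicts for sample data, the sketch below computes the standard error of the mean for the population above (μ = 80, σ = 20). The sample size n = 25 is an assumption chosen for illustration; the notes do not give one.

```python
import math

mu = 80      # population mean before treatment (from the notes)
sigma = 20   # population standard deviation (from the notes)
n = 25       # hypothetical sample size, assumed for illustration

# Standard error of the mean: the typical distance between a sample mean and mu
standard_error = sigma / math.sqrt(n)

print(f"Expected sample mean if the treatment has no effect: {mu}")
print(f"Standard error of the mean for n = {n}: {standard_error}")
```

With these assumed numbers, sample means within a few points of 80 would be unsurprising if the treatment had no effect.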

Known original population: μ = 80 (mean of the population before treatment)
Unknown treated population: μ = ? (mean of the population after treatment)
Treated sample: A sample of individuals who have received the treatment or intervention. Its data are used to draw conclusions about the unknown treated population.

Step 1: State the hypotheses: Clearly define the null and alternative hypotheses.
Step 2: Set the criteria for a decision: Determine the significance level (alpha) and identify the critical region.
Step 3: Collect data; compute sample statistics: Gather data from the sample and calculate relevant statistics, such as the sample mean and standard deviation.
Step 4: Make a decision: Check whether the sample statistic falls in the critical region. If it does, reject the null hypothesis; if it does not, fail to reject the null hypothesis.
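The four steps can be sketched as a two-tailed z-test for a single sample mean. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `z_test`, the sample mean of 89, and the sample size of 25 are hypothetical values introduced for illustration, and the test is assumed to be two-tailed with a known population standard deviation.

```python
from statistics import NormalDist


def z_test(sample_mean, n, mu0, sigma, alpha=0.05):
    """Two-tailed z-test for one sample mean when sigma is known."""
    # Step 1 (implicit): H_0 says the treated population mean equals mu0.
    # Step 2: the critical z value that cuts off alpha/2 in each tail.
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 3: convert the sample mean to a z-score using the standard error.
    standard_error = sigma / n ** 0.5
    z = (sample_mean - mu0) / standard_error
    # Step 4: reject H_0 if z falls in the critical region.
    reject = abs(z) > z_critical
    return z, z_critical, reject


# Hypothetical data: a treated sample of n = 25 with mean 89.
z, z_critical, reject = z_test(sample_mean=89, n=25, mu0=80, sigma=20)
decision = "reject H_0" if reject else "fail to reject H_0"
print(f"z = {z:.2f}, critical z = {z_critical:.2f}, decision: {decision}")
```

With these assumed values the z-score is 2.25, which is beyond the critical value of about 1.96, so the sketch rejects the null hypothesis. A sample mean of 83 with the same n would give z = 0.75 and would not.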

Null hypothesis (H_0): In the general population, there is NO treatment effect.
- No change, no difference, no relationship: The treatment has no impact on the population.
- "Nothing happened."
Scientific/Alternative hypothesis (H_1): In the general population, there IS a treatment effect.
- "Something happened."
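For the example in these notes, where the untreated population has μ = 80, the two hypotheses might be written in symbols as follows. This assumes a nondirectional (two-tailed) test, which the notes do not state explicitly, and the subscript is a label introduced here for illustration.

```latex
H_0 : \mu_{\text{after treatment}} = 80
\qquad
H_1 : \mu_{\text{after treatment}} \neq 80
```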

Distribution of sample means can be:
- Sample means close to the untreated population mean (likely if H_0 is true): If the null hypothesis is true, the sample means should be similar to the population mean before treatment.
- Sample means not close to the untreated population mean (“very unlikely” if H_0 is true): If the null hypothesis is true, it is very unlikely to observe sample means that are far from the population mean before treatment.
Alpha level (i.e., significance level): A probability value (e.g., 0.05) used to define “very unlikely” outcomes. It represents the probability of rejecting the null hypothesis when it is actually true (Type I error).
Critical region(s): Consist of the extreme sample outcomes that are “very unlikely.”
- Determined by the probability set by the alpha level: The alpha level fixes the size of the critical region. For example, if alpha is 0.05, the critical region consists of the most extreme 5% of the distribution of sample means; in a two-tailed test, that is 2.5% in each tail.
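To make the link between the alpha level and the critical region concrete, the sketch below computes the z-score boundaries for a few common alpha levels. It assumes a two-tailed test and a normal distribution of sample means; the helper name `critical_z` is introduced here for illustration.

```python
from statistics import NormalDist


def critical_z(alpha):
    """Boundary z-score for a two-tailed test: alpha/2 lies in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.05, 0.01, 0.001):
    boundary = critical_z(alpha)
    print(f"alpha = {alpha}: reject H_0 if |z| > {boundary:.2f}")
```

Smaller alpha levels push the boundaries further into the tails (about 1.96, 2.58, and 3.29 here), so stronger evidence is needed before the null hypothesis is rejected.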