Hypothesis Testing

Hypothesis Testing: Core Concepts

  • Purpose: A formal process for deciding, against predefined criteria, whether observed differences or relationships between variables are likely due to chance.

  • Overall process (6 steps):

    1. Develop null and research hypotheses

    2. Choose a level of significance

    3. Determine which statistical test is appropriate

    4. Run analysis to obtain a test statistic and p value

    5. Make a decision about rejecting or failing to reject the null hypothesis

    6. Make a conclusion

  • Hypotheses are statements that predict the relationship between variables and are testable predictions of the outcome.

  • Hypotheses translate the research question into a prediction of the outcome (an educated guess about the outcome).

Hypotheses: Null and Research

  • Null Hypothesis (H0): There is no difference between groups or no relationship between variables.

  • Research Hypothesis (H1 or Ha): There will be a difference between groups or there will be a relationship between variables.

  • Directional (one-tailed) vs Non-directional (two-tailed) hypotheses:

    • Directional: predicts the direction of the effect (e.g., group A > group B).

    • Non-directional: predicts a difference or relationship without specifying direction.

  • Reminder: whatever their form, hypotheses should be testable and derived from the research question.

Level of Significance (α)

  • Criteria used to determine statistical significance; set before data collection.

  • Common choice: α = 0.05; use this value unless told otherwise.

  • Type I error (false positive): concluding there is a difference when there really isn't; α is the probability of making this error.

    • Formal definition: \alpha = P(\text{reject } H_0 \mid H_0 \text{ true})

  • Type II error (false negative): failing to detect a difference when there really is one.

  • Relationship to significance: α defines the risk of a false positive; a lower α reduces this risk but may reduce power.

  • Clinical vs statistical significance: statistical significance does not always imply clinical (practical) significance.
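
The link between α and the Type I error rate can be checked by simulation. The sketch below is illustrative only: it assumes a two-tailed one-sample z-test with a known standard deviation, and the function name, sample size, trial count, and seed are all hypothetical choices rather than anything from the course material. Because the null hypothesis is true by construction, every rejection is a false positive, so the rejection rate should land near α.

```python
import random
from statistics import NormalDist, mean

def type_i_error_rate(alpha=0.05, n=30, trials=4000, seed=0):
    """Estimate how often a two-tailed z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: population mean 0, known sd 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / n ** 0.5)  # test statistic
        if abs(z) > z_crit:
            rejections += 1  # a false positive
    return rejections / trials

rate_05 = type_i_error_rate(alpha=0.05)
rate_01 = type_i_error_rate(alpha=0.01)
print(f"alpha = 0.05: false positive rate ~ {rate_05:.3f}")
print(f"alpha = 0.01: false positive rate ~ {rate_01:.3f}")
```

Lowering α from 0.05 to 0.01 makes false positives rarer, which is the trade-off described above.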

Type I and Type II Errors; Significance Concepts

  • Type I error: rejecting a true null hypothesis; known as a false positive.

  • Type II error: failing to reject a false null hypothesis; known as a false negative.

  • Power: probability of correctly rejecting a false null hypothesis; \text{Power} = 1 - \beta where \beta = P(\text{fail to reject } H_0 \mid H_0 \text{ false}).

  • Significance relates to the probability of Type I error; power relates to the ability to detect true effects.
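
To make power and β concrete, here is a minimal sketch that computes the power of a two-tailed one-sample z-test analytically. The test choice, the function name z_test_power, and the numbers (a true shift of 5 points, standard deviation 10, n = 25) are assumptions made for illustration, not values from the notes.

```python
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-tailed one-sample z-test when the true mean shift is `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect / (sigma / n ** 0.5)  # true effect in standard-error units
    # Probability that the test statistic falls in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

power_05 = z_test_power(effect=5, sigma=10, n=25, alpha=0.05)
power_01 = z_test_power(effect=5, sigma=10, n=25, alpha=0.01)
beta_05 = 1 - power_05  # Type II error probability at alpha = 0.05

print(f"power at alpha = 0.05: {power_05:.2f}")  # about 0.71
print(f"power at alpha = 0.01: {power_01:.2f}")  # about 0.47
print(f"beta at alpha = 0.05:  {beta_05:.2f}")
```

With everything else held fixed, the stricter α gives lower power, so a real effect is more likely to be missed.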

Statistical Tests, Test Statistics, and Assumptions

  • Key factors in choosing a statistical test:

    • Number of variables under study

    • Levels of measurement (nominal, ordinal, interval, ratio)

    • Assumptions of the test (normality, independence, equal variances, etc.)

  • Test statistic: a calculated value used to decide whether to reject H0 (e.g., t, z, F, chi-square depending on test type).

  • p-value: the probability of observing the data, or something more extreme, if H0 is true.

    • Formal definition: p\text{-value} = P(\text{difference at least as extreme as observed} \mid H_0)

  • Example cue: The difference between two groups was statistically significant (p = 0.02).

  • Decision rule depends on the comparison between p-value and α.
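
As a rough illustration of how a test statistic becomes a p-value, the sketch below converts a z statistic into one-tailed and two-tailed p-values using the standard normal distribution. The function name p_value_from_z and the observed value z = 2.33 are hypothetical; other tests (t, F, chi-square) use their own reference distributions.

```python
from statistics import NormalDist

def p_value_from_z(z, tail="two"):
    """p-value for a z statistic, assuming H0 is true (standard normal reference)."""
    std_normal = NormalDist()
    if tail == "two":      # non-directional hypothesis
        return 2 * (1 - std_normal.cdf(abs(z)))
    if tail == "greater":  # directional: predicts a positive effect
        return 1 - std_normal.cdf(z)
    if tail == "less":     # directional: predicts a negative effect
        return std_normal.cdf(z)
    raise ValueError("tail must be 'two', 'greater', or 'less'")

z_observed = 2.33  # hypothetical test statistic
p_two = p_value_from_z(z_observed)                  # about 0.02
p_one = p_value_from_z(z_observed, tail="greater")  # about 0.01

print(f"two-tailed p = {p_two:.3f}, one-tailed p = {p_one:.3f}")
```

For a symmetric distribution, the one-tailed p-value in the predicted direction is half the two-tailed value.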

Decision Rules and Conclusions

  • Decisions:

    • If the calculated p\text{-value} \lt \alpha, REJECT the null hypothesis.

    • If the calculated p\text{-value} \ge \alpha, FAIL TO REJECT the null hypothesis.

  • Example 1 (illustrative):

    • Null: there is no difference in statistics test scores between male and female students.

    • Research: there is a difference.

    • α = 0.05; p = 0.01 ⇒ 0.01 < 0.05 ⇒ REJECT the null hypothesis.

    • Conclusion: There is a statistically significant difference between male and female test scores.

  • Example 2 (illustrative):

    • Null: there is no difference in statistics test scores between male and female students.

    • Research: there is a difference.

    • α = 0.05; p = 0.08 ⇒ 0.08 > 0.05 ⇒ FAIL TO REJECT the null hypothesis.

    • Conclusion: There is NOT a statistically significant difference between male and female test scores. Note that failing to reject the null hypothesis is not the same as accepting it.

  • Important nuance: Hypotheses are not proven; data can support a hypothesis. Conclusions are about significance, not absolute truth.
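
The decision rule can be written as a few lines of code. This is only a sketch; the function name decide and the returned strings are hypothetical. It is run on the two p-values from the examples above, plus the boundary case p = α.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when p is strictly below alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"

print(decide(0.01))  # Example 1: 0.01 < 0.05
print(decide(0.08))  # Example 2: 0.08 >= 0.05
print(decide(0.05))  # boundary case: p equal to alpha is not below alpha
```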

Conclusions in Hypothesis Testing

  • Conclusion when H0 is rejected: There IS a statistically significant difference between groups or relationship between variables.

  • Conclusion when H0 is not rejected: There is NOT a statistically significant difference between groups or relationship between variables (based on the sample); this does not prove that no difference exists.

  • Restate the conclusion in terms of the population and variables actually studied to avoid overgeneralization.

  • Distinction: Statistical significance does not imply practical or clinical significance.
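
One way to see why statistical significance does not guarantee practical significance: with a large enough sample, even a tiny difference produces a small p-value. The sketch below assumes a two-group z-test with a known common standard deviation; the function name and all numbers (a 0.5-point difference on a test with standard deviation 10) are made up for illustration.

```python
from statistics import NormalDist

def two_group_p_value(mean_diff, sigma, n_per_group):
    """Two-tailed p-value for a difference in means (z-test, known common sd)."""
    standard_error = sigma * (2 / n_per_group) ** 0.5
    z = mean_diff / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# The same 0.5-point difference, tested with two different sample sizes.
p_small = two_group_p_value(mean_diff=0.5, sigma=10, n_per_group=50)
p_large = two_group_p_value(mean_diff=0.5, sigma=10, n_per_group=10_000)

print(f"n = 50 per group:     p = {p_small:.3f}")   # not significant at 0.05
print(f"n = 10,000 per group: p = {p_large:.4f}")   # significant at 0.05
```

The difference is identical in both runs; only the sample size changed. Whether half a point matters in practice is a separate judgment that the p-value cannot answer.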

Worked Scenarios (From Slides)

  • Scenario A:

    • Null: No difference in test scores between male and female students.

    • Research: There is a difference.

    • α = 0.05; result p = 0.01 → REJECT null; conclude a statistically significant difference.

  • Scenario B:

    • Null: No difference between male and female test scores.

    • Research: There is a difference.

    • α = 0.05; result p = 0.08 → FAIL TO REJECT null; conclude no statistically significant difference.

  • Scenario C (Conclusion recap):

    • Hypotheses are not proven; data can support a hypothesis.

    • Distinguish between statistical significance and clinical significance.

Key Formulas and Concepts (LaTeX)

  • Type I error probability: \alpha = P(\text{reject } H_0 \mid H_0 \text{ true})

  • Type II error probability: \beta = P(\text{fail to reject } H_0 \mid H_0 \text{ false})

  • Power of a test: \text{Power} = 1 - \beta

  • p-value concept: p\text{-value} = P(\text{difference at least as extreme as observed} \mid H_0)

  • Decision rule (typical): If p < \alpha \Rightarrow \text{Reject } H_0; if p \ge \alpha \Rightarrow \text{Fail to reject } H_0

  • Common significance example: \alpha = 0.05

  • Example p-values from slides: p = 0.01,\; p = 0.08,\; p = 0.02 (illustrative reference values)

Connections to Broader Course Context

  • Relationship to study design: Ensure hypotheses are aligned with research questions and data collection plans.

  • Foundations: Hypothesis testing builds on probability theory and inferential statistics to make evidence-based conclusions.

  • Practical implications: Beyond statistical decisions, consider whether findings are clinically or practically meaningful and how they inform evidence-based practice.