Alpha Levels, Statistical Significance, and Hypothesis Testing
Fundamental Definition and Notation of Alpha Levels
- Alpha Level Defined: The alpha level (α) is the predefined threshold value used in a hypothesis test to determine whether the evidence is strong enough to reject the null hypothesis (H0).
- Notation Alert: The first Greek letter, α, is used exclusively in Statistics to represent this threshold value. It is commonly referred to in academic and professional settings as the "alpha level."
- P-Value and Center Relationship: In the context of hypothesis testing, a small P-value indicates that the observed sample statistic (p^) is located far from the center represents the null hypothesis (H0).
- Standard Values: The most common numerical values assigned to the alpha level include:
* 0.10
* 0.05 (often the default)
* 0.01
* 0.001
The Logic of Rejecting the Null Hypothesis
- Firm Decision Making: Statistics is used to make concrete choices among alternatives. Examples of high-stakes and low-stakes decision making include:
* Legal Systems: A jury must determine if evidence meets the specific threshold of "beyond a reasonable doubt."
* Business Operations: A company must evaluate metrics to select a specific Web design.
* Personal Decisions: A student deciding which specific section of a Statistics course to enroll in.
- Human Perception of Probability: As humans, we are naturally suspicious of rare events occurring under a specific set of assumptions.
- The Rejection Mechanism: A small P-value informs the researcher that the data collected are statistically rare, assuming the null hypothesis is true. If the data are deemed "rare enough," the researcher concludes that the event could not have happened due to mere chance. Consequently, something must be wrong with the initial assumption (H0), leading to the rejection of the null hypothesis.
- Statistical Significance: When the calculated P-value falls below the chosen alpha level threshold (P < \alpha), the results are formally defined as being statistically significant.
Strategic Selection of Alpha Levels
- Researcher Obligation: Selecting an alpha level is an arbitrary choice made by the researcher, but it is also considered an "obligation" to consider the specific situation carefully when choosing that threshold.
- Context-Dependent Criteria: The severity of the consequences of a decision determines the necessary alpha level:
* High-Stakes Safety (e.g., Airplane Rivets or Air Bags): In scenarios assessing the safety of air bags, a extremely low alpha level is required. Even a value of α=0.01 may not be low enough given the risk to human life.
* Low-Stakes Preferences (e.g., Potato Chips or Pizza): If a researcher is simply investigating whether people prefer pizza with or without pepperoni, a higher alpha level, such as α=0.10, may be perfectly acceptable.
- Timing of the Selection: It is a critical ethical requirement to select the alpha level before looking at the data. Failing to do so can lead to accusations of "cheating" or "tuning" the alpha level to artificially suit the observed data and force a significant result.
- The Default Value: While difficult to justify in some specific contexts, many researchers arbitrarily choose α=0.05 (5%) as the default standard.
Historical Origins: Sir Ronald Fisher
- The Founder: Sir Ronald Fisher (1890-1962) was one of the primary founders of modern Statistics.
- The Definitive Text: In 1931, Fisher published a foundational book titled The Design of Experiments.
- The Origin of 0.05: In this text, Fisher discussed the amount of evidence required to reject a null hypothesis. He noted that while the threshold is situation-dependent, he remarked "somewhat casually" that for many scientific applications, a probability of 1 out of 20 (0.05) might be a reasonable value.
- Legacy: Since that casual remark, many individual researchers and entire scientific disciplines have treated the value of 0.05 as "sacrosanct."
Practical Examples and Visual Interpretations
- Interpreting Specific P-Values:
* Scenario A: A P-value of 7% (0.07) is considered a "large" P-value relative to a standard α=0.05. In this case, the result is not significant.
* Scenario B: A P-value of 1% (0.01) is considered a "small" P-value. This indicates that the P-value is significantly far from the null hypothesis center, making the result statistically significant.
- Comparative Context Exercise:
* Researchers must weigh the "cost" of being wrong. Testing the quality of potato chips requires much less stringent evidence than testing the structural integrity of airplane rivets.