Statistical Significance Testing
Definition and Purpose
Statistical Significance Testing: A methodology to quantify whether a result is likely due to chance (random sampling error) or due to some factor of interest.
It clarifies whether observed differences between treatment and control groups arise from random variance or are indicative of a true effect.
Key Concepts
Confidence Intervals
Communicates PRECISION by providing a range of plausible values for the population parameter being estimated.
Answers the question: “How sure are we about our finding?”
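As a minimal sketch (using made-up example data, not anything from these notes), a 95% confidence interval for a sample mean can be computed from the t distribution with SciPy:

```python
import numpy as np
from scipy import stats

# Hypothetical sample of 30 measurements (illustrative data only)
rng = np.random.default_rng(42)
sample = rng.normal(loc=100, scale=15, size=30)

mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean
# 95% CI for the population mean, based on the t distribution with n-1 degrees of freedom
ci_low, ci_high = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)

print(f"Sample mean: {mean:.2f}")
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```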
Effect Sizes
Communicates STRENGTH by indicating the magnitude of the experimental effect or of a relationship between variables (e.g., a correlation or odds ratio).
Answers the question: “How big (or small) was the effect or relationship?”
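One common standardized effect size is Cohen's d, the mean difference divided by the pooled standard deviation. A small sketch with hypothetical treatment and control scores:

```python
import numpy as np

def cohens_d(group_a, group_b):
    """Cohen's d: standardized mean difference using the pooled standard deviation."""
    a, b = np.asarray(group_a, dtype=float), np.asarray(group_b, dtype=float)
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

# Hypothetical treatment and control scores (illustrative only)
treatment = [23, 25, 28, 30, 27, 26, 29]
control   = [20, 22, 21, 24, 23, 19, 22]
print(f"Cohen's d: {cohens_d(treatment, control):.2f}")
```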
Statistical Significance
Communicates PROBABILITY by telling us how likely the current result would be if the study’s null hypothesis were true.
Answers the question: “Do we think something happened?”
Sampling Errors
Random Sampling Error
Refers to the natural deviations that occur when randomly sampling from the population.
Bias
Involves flawed sampling procedures where the researcher does not use a representative sample.
Sampling Distribution
Definition: The distribution of a sample statistic (e.g., mean) that would be obtained if all possible samples of the same size (N) were drawn from a given population.
Central Limit Theorem: States that
The mean of the sampling distribution will equal the mean of the population.
If the sample size is sufficiently large, the sampling distribution tends to be normal regardless of the shape of the original population distribution.
As the sample size increases, the standard deviation of the sampling distribution (standard error) decreases.
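A quick simulation sketch can illustrate all three claims, using an assumed skewed population (an exponential distribution with mean 1):

```python
import numpy as np

rng = np.random.default_rng(0)
population_mean = 1.0  # skewed "population": exponential distribution with mean 1

for n in (5, 30, 200):
    # Draw 10,000 samples of size n and record each sample mean
    sample_means = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)
    print(f"n={n:>3}  mean of sampling distribution={sample_means.mean():.3f}  "
          f"standard error={sample_means.std(ddof=1):.3f}")

# The printed means stay near the population mean (1.0) and the standard error
# shrinks as n grows; a histogram of sample_means would also look increasingly normal.
```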
Statistical Significance Testing Process
Helps evaluate whether sample differences reflect true effects or are solely due to random sampling error.
Assesses the probability of the current result if the null hypothesis (H0) were true.
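A small permutation-style simulation (with hypothetical scores) makes this concrete: it estimates how often random relabelling of the groups alone, i.e., under H0, produces a mean difference at least as large as the one observed:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical scores (illustrative only)
treatment = np.array([23, 25, 28, 30, 27, 26, 29], dtype=float)
control   = np.array([20, 22, 21, 24, 23, 19, 22], dtype=float)

observed_diff = treatment.mean() - control.mean()
pooled = np.concatenate([treatment, control])

count = 0
n_perm = 10_000
for _ in range(n_perm):
    rng.shuffle(pooled)  # re-label groups at random (H0: the labels do not matter)
    diff = pooled[:len(treatment)].mean() - pooled[len(treatment):].mean()
    if abs(diff) >= abs(observed_diff):
        count += 1

print(f"Observed difference: {observed_diff:.2f}")
print(f"Estimated probability under H0 (two-sided): {count / n_perm:.4f}")
```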
Null and Alternative Hypotheses
Null Hypothesis (H0): Suggests no difference or effect (e.g., H0: no difference).
Alternative Hypothesis (H1): Indicates the presence of a difference or effect (e.g., H1: Difference exists).
Errors in Hypothesis Testing
Type I Error: Rejecting the null hypothesis when it is actually true; its probability is alpha (α), the level of significance.
Type II Error: Failing to reject the null hypothesis when it is actually false; its probability is beta (β).
Power of a Test: The ability of a test to correctly reject a null hypothesis when it is false, represented mathematically as 1 - β.
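Both error rates can be estimated by simulation. The sketch below (under assumed conditions: normal data, two groups of 30, α = 0.05) repeats a two-sample t-test many times, first with no true effect to check the Type I error rate, then with a true effect of 0.5 standard deviations to estimate power (1 − β):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
alpha, n, n_sim = 0.05, 30, 5_000

def rejection_rate(true_effect):
    """Fraction of simulated two-sample t-tests that reject H0 at the given alpha."""
    rejections = 0
    for _ in range(n_sim):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(true_effect, 1.0, n)
        if stats.ttest_ind(a, b).pvalue <= alpha:
            rejections += 1
    return rejections / n_sim

print(f"Type I error rate (H0 true, effect = 0):   {rejection_rate(0.0):.3f}")  # close to alpha
print(f"Power (1 - beta) for a true effect of 0.5: {rejection_rate(0.5):.3f}")
```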
P-Values Interpretation
A p-value less than or equal to alpha (typically ≤ 0.05) signifies statistical significance, leading to rejection of the null hypothesis.
A p-value greater than alpha (typically > 0.05) indicates non-significance, resulting in a failure to reject the null hypothesis.
Key Insight: p-values are often misunderstood as the probability that the null hypothesis is true. In reality, a p-value is the probability of obtaining results at least as extreme as those observed if the null hypothesis were true.
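As an illustration (with hypothetical data, not a prescribed procedure), a two-sample t-test's p-value can be compared against alpha as described above:

```python
from scipy import stats

# Hypothetical treatment and control scores (illustrative only)
treatment = [23, 25, 28, 30, 27, 26, 29]
control   = [20, 22, 21, 24, 23, 19, 22]

alpha = 0.05
result = stats.ttest_ind(treatment, control)  # independent two-sample t-test

print(f"t = {result.statistic:.2f}, p = {result.pvalue:.4f}")
if result.pvalue <= alpha:
    print("p <= alpha: reject H0 (statistically significant)")
else:
    print("p > alpha: fail to reject H0 (not statistically significant)")
```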
Pitfalls of Statistical Significance Testing
Statistical significance testing is sensitive to sample size: with very large samples, even trivially small differences can reach statistical significance.
Emphasis on statistical significance alone therefore provides limited insight; researchers should interpret p-values alongside confidence intervals and effect sizes to judge the practical or clinical relevance of the results (illustrated by the sketch at the end of this section).
Conclusion: Interpret statistical significance alongside additional metrics for a holistic understanding of findings.
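The following simulation sketch (with assumed, made-up parameters) illustrates both the pitfall and the conclusion: with half a million observations per group, a trivially small true difference of 0.02 standard deviations yields a vanishingly small p-value, while the effect size shows the difference is negligible in practice.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# A trivially small true difference of 0.02 standard deviations, huge samples
a = rng.normal(0.00, 1.0, 500_000)
b = rng.normal(0.02, 1.0, 500_000)

t_res = stats.ttest_ind(a, b)
pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
d = (b.mean() - a.mean()) / pooled_sd  # Cohen's d

print(f"p-value:   {t_res.pvalue:.2e}  (almost certainly 'significant')")
print(f"Cohen's d: {d:.3f}             (a negligible effect in practice)")
```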