Study Notes on Statistical Significance and P-Values

Overview of Statistical Significance and P-Values

  • Last episode: Discussed p-values as a determinant of statistical significance.

  • Focus of this module: Exploring the concept of significance beyond just p-values.

Importance of Significance

  • Central question: "What’s the probability that a significant p-value indicates a true effect?"

  • This quantity is the Positive Predictive Value (PPV) of a significant p-value.

    • Rephrase: Given a significant p-value, what is the probability that it stems from a real effect?

Statistical Study Example

  • Hypothetical situation: 1000 studies are conducted, each testing a different hypothesis.

    • 200 studies had a real effect.

    • 800 studies had no effect.

Type I Error (α) Consideration

  • Type I error (α) set to 0.05:

    • Of the 800 tests with no effect:

    • 95% (760 studies) = true negatives (correctly non-significant, no effect).

    • 5% (40 studies) = false positives (significant results although there is no effect).

Power of the Study

  • Assuming a power of 80% (a common target in health studies):

    • Out of the 200 real effects:

    • 160 studies = true positives (significant and show an effect).

    • 40 studies = false negatives (real effects that are not significant).

Analysis of Significant Results

  • Calculation of significant results:

    • Total significant results = 200 (160 true positives + 40 false positives).

    • 20% of the significant results (40/200) were false positives (significant despite no real effect).

  • The false discovery rate was 20%, not the 5% that α might suggest.
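
The arithmetic in this example can be checked with a short script. This is a minimal sketch that only restates the numbers given in the notes (1000 studies, 200 real effects, α = 0.05, 80% power); the variable names are illustrative.

```python
# Expected outcome counts for the hypothetical 1000 studies.
n_studies = 1000
n_real = 200                  # studies in which a real effect exists
n_null = n_studies - n_real   # studies with no effect (800)
alpha = 0.05                  # Type I error rate
power = 0.80                  # probability of detecting a real effect

true_pos = power * n_real       # real effect, significant
false_neg = n_real - true_pos   # real effect, not significant
false_pos = alpha * n_null      # no effect, significant
true_neg = n_null - false_pos   # no effect, not significant

significant = true_pos + false_pos
fdr = false_pos / significant   # false discovery rate
ppv = true_pos / significant    # positive predictive value

print(f"TP={true_pos:.0f} FN={false_neg:.0f} FP={false_pos:.0f} TN={true_neg:.0f}")
print(f"significant={significant:.0f}  FDR={fdr:.0%}  PPV={ppv:.0%}")
```

Only significant results enter the denominator, which is why the false discovery rate (20%) differs from α (5%).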

Positive Predictive Value

  • Positive Predictive Value (PPV): In this example, the chance that a significant result (p < 0.05) reflects a true effect is 80% (160/200), given 80% power and real effects in 20% of the studies.

    • Notably, average power in psychology studies is estimated at around 35%.

    • In neuroscience, it is estimated to be only about 21%.

Impact of Lower Power Assumptions
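
  • With 25% power and α = 0.05:

    • Outcome results:

    • 100 true positives

    • 80 false positives

    • These counts imply 400 real effects and 1600 null effects (the same 1:4 ratio as before); with the 1000-study example they would be 50 and 40, which gives the same rate.

    • This yields a false discovery rate of 44.4% (80/180), showing that relying on p < 0.05 alone is inadequate.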
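
How the false discovery rate depends on power can be sketched with a small helper. The function name and the `prior` parameter (the share of tested hypotheses that are truly real) are illustrative. The 20% prior is carried over from the 1000-study example, and applying it to the 35% and 21% power estimates quoted earlier is an assumption made here for illustration only.

```python
def false_discovery_rate(prior, power, alpha=0.05):
    """Share of significant results that are false positives.

    prior: assumed fraction of tested hypotheses with a real effect
    power: probability that a real effect yields a significant result
    alpha: Type I error rate
    """
    true_pos = power * prior
    false_pos = alpha * (1 - prior)
    return false_pos / (true_pos + false_pos)


# Same 20% prior as the 1000-study example, varying only the power.
for power in (0.80, 0.35, 0.25, 0.21):
    fdr = false_discovery_rate(prior=0.20, power=power)
    print(f"power={power:.2f}  FDR={fdr:.1%}")
```

Because the function works with proportions rather than counts, the result is the same whether the example is scaled to 1000 or 2000 studies.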

Issues of P-Hacking

  • P-Hacking: Questionable research practices that influence findings and push them across the p < 0.05 threshold.

    • Can occur either intentionally or unintentionally.

  • If 15% of studies are p-hacked:

    • Even with 80% power, the false discovery rate could rise to 48.1%.

    • Nearly half of the studies reporting p < 0.05 could be false positives.
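
One way to arrive at the 48.1% figure is sketched below. It assumes that p-hacking converts 15% of the studies that would otherwise be non-significant (both true negatives and false negatives) into significant ones. The notes do not spell out this model, so treat it as an assumption that happens to reproduce the number; the function name is hypothetical.

```python
def fdr_with_p_hacking(n_real, n_null, power, alpha, hacked_fraction):
    """False discovery rate when some non-significant studies are p-hacked."""
    true_pos = power * n_real
    false_neg = n_real - true_pos
    false_pos = alpha * n_null
    true_neg = n_null - false_pos

    # Assumed model: a fixed fraction of non-significant studies is pushed
    # over the threshold, whether or not a real effect exists.
    sig_real = true_pos + hacked_fraction * false_neg
    sig_null = false_pos + hacked_fraction * true_neg

    return sig_null / (sig_real + sig_null)


fdr = fdr_with_p_hacking(n_real=200, n_null=800, power=0.80,
                         alpha=0.05, hacked_fraction=0.15)
print(f"FDR with 15% p-hacking: {fdr:.1%}")  # 48.1%
```

Under this model the 114 hacked null studies far outnumber the 6 hacked real effects, which is why the rate rises so sharply.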

Validity of Research Findings

  • John Ioannidis’ seminal paper (2005) titled "Why most published research findings are false":

    • Reviews the dynamics that lead to flawed research interpretations.

Statistical Findings on Trials

  • Considering well-performed, adequately powered Randomized Controlled Trials (RCTs) with a pre-study ratio of true to null hypotheses of 1:1:

    • False discovery rate: 15% at the p < 0.05 threshold.

    • Confirmatory meta-analyses of good quality RCTs: 14.6% false discovery rate.

  • Meta-analyses of small inconclusive studies: 59.4% false discovery rate, so a significant result is more likely to be false than true.
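
These rates are consistent with the bias-adjusted formula from the Ioannidis paper, PPV = ([1 − β]R + uβR) / (R + α − βR + u − uα + uβR), where R is the pre-study odds of a true relationship and u is the proportion of analyses affected by bias. The sketch below applies it to two of the scenarios. The power, odds, and bias values are assumptions chosen for illustration and are not stated in these notes, apart from the 1:1 ratio for RCTs.

```python
def ppv_with_bias(power, pre_study_odds, bias, alpha=0.05):
    """Positive predictive value using the bias-adjusted formula."""
    beta = 1 - power            # Type II error rate
    R, u = pre_study_odds, bias
    numerator = power * R + u * beta * R
    denominator = R + alpha - beta * R + u - u * alpha + u * beta * R
    return numerator / denominator


# (power, pre-study odds R, bias u): assumed illustrative values.
scenarios = {
    "adequately powered RCT, 1:1 odds": (0.80, 1.0, 0.10),
    "meta-analysis of small inconclusive studies": (0.80, 1 / 3, 0.40),
}

for name, (power, odds, bias) in scenarios.items():
    fdr = 1 - ppv_with_bias(power, odds, bias)
    print(f"{name}: FDR = {fdr:.1%}")
```

With these inputs the script prints roughly 15.0% and 59.4%. Other scenarios depend on parameter choices that the notes do not record.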

Findings from Underpowered Studies

  • Underpowered RCTs with proper execution may still have a false discovery rate as high as 76.5%.

    • This means a significant result (p < 0.05) is about three times as likely to be wrong as right.

  • Poorly performed underpowered studies: 82.5% false discovery rate.

    • More than four times as likely to be incorrect as correct.

Exploratory Research and High Dimensional Data

  • Ioannidis criticizes exploratory research using massive databases as akin to a "fishing expedition".

    • Example: Testing 30,000 genes when only about 30 are expected to be truly associated can lead to a 99.9% false discovery rate, even though each test uses the p < 0.05 threshold.
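
The gene-screening example can be explored with a small simulation. This is a sketch under idealized assumptions that are not in the notes: each gene yields one z-statistic, the truly associated genes have a mean z-score of 2.8 (roughly 80% power), and there is no bias. Even so, the false discovery rate lands in the high 90s percent; the 99.9% figure is consistent with additionally assuming lower power and substantial bias, which this sketch does not model.

```python
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable

N_GENES = 30_000
N_TRUE = 30        # genes with a real association
ALPHA = 0.05
EFFECT = 2.8       # assumed mean z-score for real associations (~80% power)

# Two-sided critical value, about 1.96 for alpha = 0.05.
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)

true_hits = 0
false_hits = 0
for gene in range(N_GENES):
    has_effect = gene < N_TRUE
    z = random.gauss(EFFECT if has_effect else 0.0, 1.0)
    if abs(z) > z_crit:
        if has_effect:
            true_hits += 1
        else:
            false_hits += 1

fdr = false_hits / (true_hits + false_hits)
print(f"true hits={true_hits}  false hits={false_hits}  FDR={fdr:.1%}")
```

About 5% of the 29,970 null genes cross the threshold by chance, so the few real associations are swamped no matter how well they are detected.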

Implications for Reproducibility

  • Results that meet the p < 0.05 criterion need to be retested.

  • A common misunderstanding is that p < 0.05 proves a true effect. It does not.

  • The p-value represents the probability of obtaining results at least as extreme as observed under the assumption that the null hypothesis is true.
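
The definition of the p-value can be made concrete with a z-test. This is a minimal sketch, assuming the test statistic follows a standard normal distribution under the null hypothesis; the function name is illustrative.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability, under the null, of a z-statistic at least as extreme as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


print(round(two_sided_p_value(1.96), 3))  # 0.05
print(round(two_sided_p_value(0.0), 3))   # 1.0
```

The calculation uses only the null distribution, so it says nothing about how likely a real effect was before the study, which is what the false discovery rate depends on.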

Conclusion on Research Findings

  • To improve confidence in results, it is vital to:

    • Recognize the potential flaws in perceived significant findings.

    • Understand that repeated testing and validation are essential.

  • Simply accepting a p < 0.05 result without further verification is a clear path to error.