P-Values, Significance, and Study Interpretation

Key Terminology/Concepts

  • Statistical Significance (SS)

    • Determined by the p-value (probability of seeing a difference at least as large as the one observed if the null hypothesis of no true difference were correct)
    • P ≤ 0.05 → statistically significant (at most a 1/20 chance of a Type I error when the null is true)
    • P ≤ 0.001 → highly statistically significant (at most a 1/1000 chance)
    • Does not measure the magnitude or practical importance of an effect.
    • Strongly affected by sample size N
    • Large N → even trivial differences can look “significant.”
    • Small N → important differences may fail to reach significance.
  • Clinical (Practical) Significance (CS)

    • Judgment-based: “Does the change matter to patients/clinicians?”
    • Possible scenarios:
    • SS but not CS (large N, tiny effect)
    • CS but not SS (small N, sizeable effect that fails hypothesis test)
  • Type I vs. Type II Errors

    • Type I (α error): Incorrectly reject null (false positive) – “saw an effect that isn’t real.”
    • Type II (β error): Incorrectly fail to reject a false null (false negative) – “missed a real effect.”
  • Placebo & Placebo Effect

    • Inert intervention used as control in blinded trials.
    • Placebo effect reflects improvements due to context, expectation, researcher interaction, etc., not the active compound.
  • Risk‐Based Effect Measures

    • RR (Relative Risk) = risk(exposed) ⁄ risk(unexposed)
    • OR (Odds Ratio) = odds(exposed) ⁄ odds(unexposed)
    • HR (Hazard Ratio) = ratio of the instantaneous event rates (hazards) in two groups over follow-up time.
  • Confidence Interval (CI)

    • Range of plausible values for the true population value at a stated confidence level (e.g., 95% CI: intervals built this way capture the true value in about 95% of repeated samples).
    • Narrower CI → higher precision.
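
The RR, OR, and CI definitions above can be tied together with a short worked example. This is a minimal sketch, not taken from any of the studies in these notes: the 2×2 counts are hypothetical, the function name `risk_measures` is made up for illustration, and the CI uses the common log-based approximation for RR with z = 1.96 for a 95% interval.

```python
import math

def risk_measures(a, b, c, d, z=1.96):
    """Compute RR, OR, and an approximate CI for RR from a 2x2 table.

    a = exposed with event,   b = exposed without event
    c = unexposed with event, d = unexposed without event
    z = 1.96 gives an approximate 95% CI (assumption: normal approximation
    on the log scale; all four cells must be non-zero).
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    rr = risk_exposed / risk_unexposed

    # Odds = events / non-events within each group
    odds_ratio = (a / b) / (c / d)

    # Standard error of ln(RR), then back-transform the interval
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    ci_low = math.exp(math.log(rr) - z * se_log_rr)
    ci_high = math.exp(math.log(rr) + z * se_log_rr)
    return rr, odds_ratio, (ci_low, ci_high)

# Hypothetical counts: 20/100 events in the exposed group, 10/100 in the unexposed group
rr, odds_ratio, ci = risk_measures(20, 80, 10, 90)
print(f"RR = {rr:.2f}, OR = {odds_ratio:.2f}, 95% CI for RR = {ci[0]:.2f} to {ci[1]:.2f}")
```

With these made-up counts the risk doubles (RR = 2), yet the 95% CI runs from just under 1 to about 4, so it still includes "no difference." A wide interval like this signals low precision, which is the flip side of "narrower CI → higher precision."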

Visual Scale of p-Values

highly SS … 0.0005 0.0006 … 0.0009 | 0.001 | 0.002 0.003 0.004 … 0.05 | not SS >0.05
  • A p-value is smaller when further left on the scale.
  • Example: 0.004 > 0.001 → significant, but not highly significant.
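
The scale can also be written as a small lookup. The sketch below simply encodes the cut-offs used in these notes (≤ 0.001 highly SS, ≤ 0.05 SS, otherwise not SS); the function name `classify_p` is hypothetical.

```python
def classify_p(p):
    """Label a p-value using the thresholds from these notes."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must be between 0 and 1")
    if p <= 0.001:
        return "highly SS"
    if p <= 0.05:
        return "SS"
    return "not SS"

# Walk a few values from left to right along the scale
for p in (0.0005, 0.001, 0.004, 0.05, 0.051):
    print(p, "->", classify_p(p))
```

Note that 0.051 falls on the "not SS" side: the cut-off is a hard boundary, which is why wording like “trending toward significance” is discouraged.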

Interpreting Study Results (General Tips)

  • Compare between groups rather than within groups when possible.
  • Report actual numbers alongside p-values.
  • p-values are qualifiers attached to the result, not stand-alone findings.
  • Avoid phrases like “trending toward significance”; P = 0.051 is not significant.
  • Use “evidence supports,” not “proves.”

Study #1 – Rosmarinic Acid & Seasonal Allergic Rhinoconjunctivitis

  • Design: Randomized, placebo-controlled; doses: 50 mg & 200 mg.
  • Symptom relief figure (no p-values): perceptions may shift once p-values & sample size (small, n ≈ 10 per group) are revealed.
  • Inflammatory cells (Nasal lavage – Table 3)
    • Polymorphonuclear leukocytes at Day 3: 50 mg vs placebo P = 0.006 (SS); 200 mg vs placebo P = 0.010 (SS).
    • Many Day 21 values P > 0.05 → not SS.
  • Take-away:
    • Some SS changes, but small samples & variable results → unclear CS.
    • Decision to supplement could change if therapy carried side-effects (contrast to benign herb).

Study #2 – Potassium/Magnesium Citrate & Blood Pressure

  • Subjects: Pre-hypertensive/hypertensive; n = 30 per arm; 4 treatments incl. placebo.
  • Central & radial BP, pulse wave velocity → Mixed-model p-values all > 0.05 except:
    • Night-time SBP P = 0.03, DBP P = 0.05, HR P = 0.04 (statistically significant).
  • Magnitude: ~5 mmHg drop (night only) – may not meet clinically relevant thresholds.
  • Conclusion: SS ≠ CS; modest, possibly negligible benefit.
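
The SS ≠ CS point comes down to how sample size drives the p-value. The sketch below is an illustration only: the blood-pressure differences, the SD of 10 mmHg, and the group sizes are hypothetical and are not data from this study, and a simple two-sample z-test (via the standard library's `statistics.NormalDist`) stands in for the study's mixed-model analysis.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference in means between two equal-sized
    groups, using a z-test with an assumed common, known SD."""
    se = sd * sqrt(2 / n_per_group)   # standard error of the difference
    z = abs(mean_diff) / se
    return 2 * (1 - NormalDist().cdf(z))

# Hypothetical: trivial 1 mmHg difference, but 5000 participants per group
p_small_effect_big_n = two_sided_p(mean_diff=1.0, sd=10.0, n_per_group=5000)

# Hypothetical: sizeable 8 mmHg difference, but only 10 participants per group
p_big_effect_small_n = two_sided_p(mean_diff=8.0, sd=10.0, n_per_group=10)

print(f"1 mmHg, n=5000/group: P = {p_small_effect_big_n:.2g}")
print(f"8 mmHg, n=10/group:   P = {p_big_effect_small_n:.2g}")
```

Under these assumptions the 1 mmHg difference comes out highly SS while the 8 mmHg difference does not reach P ≤ 0.05, which is why the size of the effect has to be judged separately from the p-value.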

Study #3 – Cinnamon for Primary Dysmenorrhea

  • Randomized, double-blind, placebo controlled (40 per group → 1 g cinnamon TID × 3 days).
  • Baseline demographics balanced (P > 0.05 desired).
  • Outcomes via VAS (0–10 pain scale)
    • Cycle 1: ΔPain = -2.1 ± 0.2 vs -0.8 ± 0.3 (placebo); P = 0.001.
    • Cycle 2: ΔPain = -2.5 ± 0.3 vs -0.9 ± 0.3 (placebo); P = 0.002.
  • CS: ~2-point reduction on 10-point VAS may be meaningful.
  • Authors discuss sizeable placebo response; suggest longer follow-up to mitigate context effects.

Additional Abstract Snapshots

  • Mouthwash with Essential Oils + Curcumin
    • Triple-blind; 3 groups (15 each). P < 0.05 for periodontal & RA markers; several P < 0.001 (highly SS).
    • Largest improvements varied by parameter (e.g., plaque vs pocket depth).
  • Modified Alternate-Day Fasting vs Calorie Restriction (Metabolic Syndrome)
    • n = 70; 8 weeks.
    • Significant superiority of MADF for weight, waist, SBP, fasting glucose (p-values: 0.003 – 0.029).
    • No difference in lipids, insulin resistance.
    • CS depends on magnitude (not provided here).

Practice Question Highlights (AMA & Study-Design)

  • Correct AMA listing (≥ 7 authors): “Terker AS, Zhang C, McCormick JA, et al.”
  • P-value interpretation question: results significant (P < 0.05) but not highly significant (P > 0.001) → BMI, HDL, waist circumference.
  • Placebo most commonly appears in cross-over randomized trials.
  • Weed with anxiety benefit & LDL harm → begin with animal studies (safety/efficacy) before human.
  • Randomization definition: method to allocate participants to control vs intervention arms (reduces selection bias).

Summary Cheat-Sheet

  • Not SS: P > 0.05
  • SS: 0.05 ≥ P > 0.001
  • Highly SS: P ≤ 0.001
  • Baseline characteristics ideally not SS.
  • Always assess both statistical and clinical significance.
  • Large N can inflate SS; small N can mask real effects.
  • Placebo effects are real and must be considered in interpretation.
  • When reading results, look for: effect size, CI width, risk ratios, and context (patient-important outcomes).