P-Values, Significance, and Study Interpretation
Key Terminology/Concepts
Statistical Significance (SS)
- Determined by the p-value (the probability of seeing a difference at least as large as the one observed if chance alone were at work, i.e., if the null hypothesis were true)
- P < 0.05 → statistically significant (< 1/20 chance of a Type I error)
- P < 0.001 → highly statistically significant (< 1/1000 chance)
- Does not measure the magnitude or practical importance of an effect.
- Strongly affected by sample size
- Large n → even trivial differences can look “significant.”
- Small n → important differences may fail to reach significance.
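A minimal sketch of the sample-size point above, assuming a two-sample z-test with equal group sizes and a known, shared SD. The function name and the numbers (a 1-unit difference, SD 10) are hypothetical, chosen only to show the same effect crossing the significance threshold as the sample grows.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_p(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference in means (equal SD and n per group)."""
    se = sd * sqrt(2 / n_per_group)  # standard error of the difference
    z = mean_diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same hypothetical effect at increasing sample sizes per group.
for n in (20, 200, 2000):
    print(n, round(two_sample_z_p(1.0, 10.0, n), 4))
```

The effect never changes; only the sample size does, which is why a p-value alone says nothing about the magnitude of an effect.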
Clinical (Practical) Significance (CS)
- Judgment-based: “Does the change matter to patients/clinicians?”
- Possible scenarios:
- SS but not CS (large n, tiny effect)
- CS but not SS (small n, sizeable effect that fails hypothesis test)
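The scenarios above can be sketched as a small labeling helper. This is an illustration only: clinical significance is a judgment call, and reducing it to one cutoff is a simplification. The `mcid` parameter (a minimal clinically important difference) and the function name are assumptions, not terms from these notes.

```python
def interpret(p_value, effect, mcid, alpha=0.05):
    """Label a result by statistical (SS) and clinical (CS) significance.

    mcid is a hypothetical threshold for an effect that matters in practice.
    """
    ss = p_value <= alpha
    cs = abs(effect) >= mcid
    if ss and cs:
        return "SS and CS"
    if ss:
        return "SS but not CS"
    if cs:
        return "CS but not SS"
    return "neither SS nor CS"


print(interpret(0.001, 0.5, 2.0))  # tiny effect, very small p-value
print(interpret(0.12, 3.0, 2.0))   # sizeable effect, p-value above alpha
```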
Type I vs. Type II Errors
- Type I (α error): Incorrectly reject null (false positive) – “saw an effect that isn’t real.”
- Type II (β error): Incorrectly fail to reject null (false negative) – “missed a real effect.”
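As a rough illustration of how the two error rates relate, the sketch below computes them for a two-sided, two-sample z-test with a known SD. The function name and all inputs are hypothetical. The Type I rate is set by the chosen α, while the Type II rate depends on the true effect and the sample size.

```python
from math import sqrt
from statistics import NormalDist


def error_rates(true_diff, sd, n_per_group, alpha=0.05):
    """Return (type_i_rate, type_ii_rate) for a two-sided two-sample z-test."""
    z = NormalDist()
    se = sd * sqrt(2 / n_per_group)      # standard error of the difference
    z_crit = z.inv_cdf(1 - alpha / 2)    # critical value for the chosen alpha
    shift = true_diff / se               # where the test statistic is centred
    # Power: chance the statistic falls beyond either critical value.
    power = (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)
    return alpha, 1 - power


for n in (20, 100):
    a, b = error_rates(true_diff=5.0, sd=10.0, n_per_group=n)
    print(n, a, round(b, 3))
```

With the same true difference, the larger group size gives a lower β, which is the small-sample "missed a real effect" problem in numeric form.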
Placebo & Placebo Effect
- Inert intervention used as control in blinded trials.
- Placebo effect reflects improvements due to context, expectation, researcher interaction, etc., not the active compound.
Risk‐Based Effect Measures
- RR (Relative Risk) = risk(exposed) ⁄ risk(unexposed)
- OR (Odds Ratio) = odds(exposed) ⁄ odds(unexposed)
- HR (Hazard Ratio) = instantaneous risk in one group relative to another over time.
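The relative risk and odds ratio definitions above can be worked through on a 2×2 table. The counts below are hypothetical, and the hazard ratio is left out because it needs time-to-event data.

```python
def relative_risk(a, b, c, d):
    """a, b = exposed with/without the event; c, d = unexposed with/without."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Same cell layout as relative_risk."""
    return (a / b) / (c / d)


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have the event.
print(round(relative_risk(20, 80, 10, 90), 2))  # → 2.0
print(round(odds_ratio(20, 80, 10, 90), 2))     # → 2.25
```

The two measures diverge here because the event is fairly common; they get closer as the event becomes rare.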
Confidence Interval (CI)
- Range expected to contain the true population value at a specified confidence level (e.g., 95% CI).
- Narrower CI → higher precision.
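A minimal sketch of the precision point, assuming a normal-approximation interval for a mean. The function name, the default 95% level, and the example values are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(mean, sd, n, level=0.95):
    """Normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    half_width = z * sd / sqrt(n)
    return mean - half_width, mean + half_width


# Same hypothetical mean and SD; only the sample size differs.
for n in (25, 400):
    low, high = mean_ci(120.0, 15.0, n)
    print(n, round(low, 2), round(high, 2))
```

The interval width shrinks with the square root of n, so a 16-fold larger sample gives a 4-fold narrower interval.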
Visual Scale of p-Values
highly SS … 0.0005 0.0006 … 0.0009 | 0.001 | 0.002 0.003 0.004 … 0.05 | not SS >0.05
- A p-value is smaller when further left on the scale.
- Example: 0.004 > 0.001 → significant (P < 0.05) but not highly significant.
Interpreting Study Results (General Tips)
- Compare between groups rather than within groups when possible.
- Report actual numbers alongside p-values.
- p-values are qualifiers attached to the result, not stand-alone findings.
- Avoid phrases like “trending toward significance”; P > 0.05 is not significant.
- Use “evidence supports,” not “proves.”
Study #1 – Rosmarinic Acid & Seasonal Allergic Rhinoconjunctivitis
- Design: Randomized, placebo-controlled; doses: 50 mg & 200 mg.
- Symptom relief figure (no p-values): perceptions may shift once p-values & sample size (small n per group) are revealed.
- Inflammatory cells (Nasal lavage – Table 3)
- Polymorphonuclear leukocytes at Day 3: 50 mg vs placebo (SS); 200 mg vs placebo (SS).
- Many Day 21 values P > 0.05 → not SS.
- Take-away:
- Some SS changes, but small samples & variable results → unclear CS.
- Decision to supplement could change if therapy carried side-effects (contrast to benign herb).
Study #2 – Potassium/Magnesium Citrate & Blood Pressure
- Subjects: Pre-hypertensive/hypertensive; per arm; 4 treatments incl. placebo.
- Central & radial BP, pulse wave velocity → Mixed-model p-values all > 0.05 except:
- Night-time SBP, DBP, HR (statistically significant).
- Magnitude: ~5 mmHg drop (night only) – may not meet clinically relevant thresholds.
- Conclusion: SS ≠ CS; modest, possibly negligible benefit.
Study #3 – Cinnamon for Primary Dysmenorrhea
- Randomized, double-blind, placebo-controlled (40 per group → 1 g cinnamon TID × 3 days).
- Baseline demographics balanced (P > 0.05 desired).
- Outcomes via VAS (0–10 pain scale)
- Cycle 1: ΔPain = vs (placebo); .
- Cycle 2: ΔPain = vs ; .
- CS: ~2-point reduction on 10-point VAS may be meaningful.
- Authors discuss sizeable placebo response; suggest longer follow-up to mitigate context effects.
Additional Abstract Snapshots
- Mouthwash with Essential Oils + Curcumin
- Triple-blind; 3 groups (15 each). P < 0.05 for periodontal & RA markers; several P < 0.001 (highly SS).
- Largest improvements varied by parameter (e.g., plaque vs pocket depth).
- Modified Alternate-Day Fasting vs Calorie Restriction (Metabolic Syndrome)
- ; 8 weeks.
- Significant superiority of MADF for weight, waist, SBP, fasting glucose (p-values: 0.003 – 0.029).
- No difference in lipids, insulin resistance.
- CS depends on magnitude (not provided here).
Practice Question Highlights (AMA & Study-Design)
- Correct AMA listing (≥ 7 authors): “Terker AS, Zhang C, McCormick JA, et al.”
- P-value interpretation question: results significant (P < 0.05) but not highly significant (P > 0.001) → BMI, HDL, waist circumference.
- Placebo most commonly appears in cross-over randomized trials.
- Weed with anxiety benefit & LDL harm → begin with animal studies (safety/efficacy) before human.
- Randomization definition: method to allocate participants to control vs intervention arms (reduces selection bias).
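The randomization definition above can be sketched as a simple balanced allocation: shuffle the participants, then deal them into arms in turn. This is one basic scheme among several (real trials often use block or stratified randomization), and the function name and arm labels are hypothetical.

```python
import random


def randomize(participants, arms=("control", "intervention"), seed=None):
    """Shuffle participants, then deal them into arms in turn."""
    rng = random.Random(seed)            # seed only to make the example repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    allocation = {arm: [] for arm in arms}
    for i, person in enumerate(shuffled):
        allocation[arms[i % len(arms)]].append(person)
    return allocation


groups = randomize(range(1, 41), seed=1)
print({arm: len(members) for arm, members in groups.items()})
```

Because chance, not the investigator, decides who lands in which arm, neither patient characteristics nor researcher preference can steer the assignment.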
Summary Cheat-Sheet
- Not SS: P > 0.05
- SS: 0.05 ≥ P > 0.001
- Highly SS: P ≤ 0.001
- Baseline characteristics ideally not SS.
- Always assess both statistical and clinical significance.
- Large n can inflate SS; small n can mask real effects.
- Placebo effects are real and must be considered in interpretation.
- When reading results, look for: effect size, CI width, risk ratios, and context (patient-important outcomes).
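The threshold bands in this cheat-sheet can be written as a small helper. This sketch assumes the band at or below 0.001 is the "highly SS" one, as the complement of the stated SS range. The function name is hypothetical.

```python
def classify_p(p):
    """Map a p-value onto the cheat-sheet bands."""
    if not 0 <= p <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if p > 0.05:
        return "not SS"
    if p > 0.001:
        return "SS"
    return "highly SS"  # assumed band: P <= 0.001


# The range reported for the fasting abstract (0.003 to 0.029) falls in one band.
print([classify_p(p) for p in (0.003, 0.029)])  # → ['SS', 'SS']
```

The label only describes the p-value; effect size, CI width, and patient-important outcomes still have to be judged separately.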