P-Values, Significance, and Study Interpretation

Key Terminology/Concepts

Statistical Significance (SS)
- Determined by the p-value: the probability of observing a difference at least as large as the one seen if chance alone were operating (i.e., if the null hypothesis were true)
- $P \le 0.05$ → statistically significant (Type I error rate held to at most 1 in 20)
- $P \le 0.001$ → highly statistically significant (at most 1 in 1000)
- Does not measure the magnitude or practical importance of an effect.
- Strongly affected by sample size $N$
  - Large $N$ → even trivial differences can look “significant.”
  - Small $N$ → important differences may fail to reach significance.
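
The dependence on sample size can be illustrated with a small sketch. It assumes a two-sample z-test with a known, equal standard deviation and equal group sizes; the 1 mmHg difference and SD of 10 mmHg are hypothetical numbers chosen for illustration, not data from any study in these notes.

```python
import math

def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def p_value_for_mean_difference(diff, sd, n_per_group):
    """Two-sample z-test p-value (assumes known, equal SD and equal group sizes)."""
    standard_error = sd * math.sqrt(2 / n_per_group)
    return two_sided_p_from_z(diff / standard_error)

# Hypothetical: the same tiny 1 mmHg difference (SD 10 mmHg) at growing sample sizes.
for n in (10, 100, 1000, 10000):
    p = p_value_for_mean_difference(diff=1.0, sd=10.0, n_per_group=n)
    print(f"n per group = {n:>5}: p = {p:.4f}")
```

The effect size never changes here; only $N$ does, yet the p-value moves from clearly not significant to highly significant.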
Clinical (Practical) Significance (CS)
- Judgment-based: “Does the change matter to patients/clinicians?”
- Possible scenarios:
  - SS but not CS (large $N$, tiny effect)
  - CS but not SS (small $N$, sizeable effect that fails hypothesis test)
Type I vs. Type II Errors
- Type I (α error): Incorrectly reject null (false positive) – “saw an effect that isn’t real.”
- Type II (β error): Incorrectly fail to reject a false null (false negative) – “missed a real effect.”
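
Both error types can be seen in a small simulation. This is a sketch under stated assumptions: normally distributed outcomes, a z-test with known SD, equal group sizes, and hypothetical settings (20 per group, a true difference of 0.5 SD). The helper names `z_test_p` and `rejection_rate` are illustrative, not from any library.

```python
import math
import random

def z_test_p(sample_a, sample_b, sd):
    """Two-sided z-test p-value (assumes known SD and equal group sizes)."""
    n = len(sample_a)
    diff = sum(sample_a) / n - sum(sample_b) / n
    z = diff / (sd * math.sqrt(2 / n))
    return math.erfc(abs(z) / math.sqrt(2))

def rejection_rate(true_diff, n_per_group, trials, alpha=0.05, sd=1.0, seed=0):
    """Fraction of simulated trials in which the null hypothesis is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(true_diff, sd) for _ in range(n_per_group)]
        b = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        if z_test_p(a, b, sd) <= alpha:
            rejections += 1
    return rejections / trials

# No real effect: every rejection is a false positive (Type I error).
type_i_rate = rejection_rate(true_diff=0.0, n_per_group=20, trials=2000)

# Real effect of 0.5 SD: every non-rejection is a false negative (Type II error).
type_ii_rate = 1 - rejection_rate(true_diff=0.5, n_per_group=20, trials=2000)

print(f"Type I rate  ~ {type_i_rate:.3f} (should sit near alpha = 0.05)")
print(f"Type II rate ~ {type_ii_rate:.3f} (high because n is small)")
```

With only 20 participants per group, the simulated trial misses a real moderate effect more often than it detects it, which is the small-$N$ problem in concrete form.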
Placebo & Placebo Effect
- Inert intervention used as control in blinded trials.
- Placebo effect reflects improvements due to context, expectation, researcher interaction, etc., not the active compound.
Risk‐Based Effect Measures
- $RR$ (Relative Risk) = risk(exposed) ⁄ risk(unexposed)
- $OR$ (Odds Ratio) = odds(exposed) ⁄ odds(unexposed)
- $HR$ (Hazard Ratio) = instantaneous risk in one group relative to another over time.
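
As a worked illustration, $RR$ and $OR$ can be computed from a 2×2 table. The counts below (20 of 100 exposed and 10 of 100 unexposed with the outcome) are hypothetical, and the function names are illustrative only.

```python
def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    return risk_exposed / risk_unexposed

def odds_ratio(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Odds of the outcome in the exposed group divided by odds in the unexposed group."""
    odds_exposed = exposed_events / (exposed_total - exposed_events)
    odds_unexposed = unexposed_events / (unexposed_total - unexposed_events)
    return odds_exposed / odds_unexposed

# Hypothetical counts: 20/100 exposed and 10/100 unexposed develop the outcome.
rr = relative_risk(20, 100, 10, 100)
odds = odds_ratio(20, 100, 10, 100)
print(f"RR = {rr:.2f}, OR = {odds:.2f}")
```

Here the risk is doubled ($RR = 2$) while the odds ratio is somewhat larger ($OR = 2.25$); the two measures only approximate each other when the outcome is rare.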
Confidence Interval (CI)
- Range constructed so that, across repeated samples, a specified proportion of such intervals (e.g., $95\%$ for a $95\%$ CI) would contain the true population value.
- Narrower CI → higher precision.
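
The link between sample size and CI width can be sketched as follows. This assumes the normal approximation (critical value 1.96) for a single mean; the mean of 5.0 and SD of 2.0 are hypothetical, and for small samples a t-distribution critical value would give a somewhat wider interval.

```python
import math

Z_95 = 1.96  # approximate two-sided 95% critical value of the standard normal

def mean_ci_95(mean, sd, n):
    """Approximate 95% CI for a mean (normal approximation; an assumption here)."""
    margin = Z_95 * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Hypothetical summary statistics: same mean and SD, increasing sample size.
for n in (10, 40, 160):
    low, high = mean_ci_95(mean=5.0, sd=2.0, n=n)
    print(f"n = {n:>3}: 95% CI = ({low:.2f}, {high:.2f}), width = {high - low:.2f}")
```

Quadrupling $n$ halves the width, since the margin scales with $1/\sqrt{n}$.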

Visual Scale of p-Values

highly SS: … 0.0005, 0.0006, …, 0.0009, 0.001 | SS: 0.002, 0.003, 0.004, …, 0.05 | not SS: > 0.05

A p-value is smaller when further left on the scale.
Example: 0.004 > 0.001 → significant ($P \le 0.05$) but not highly significant.
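
The cutoffs can be written as a small helper. This is a sketch: the name `classify_p_value` is an assumption for illustration, and the thresholds (0.05 and 0.001) are the conventions used in these notes rather than universal rules.

```python
def classify_p_value(p):
    """Label a p-value using the cutoffs from these notes (0.05 and 0.001)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p <= 0.001:
        return "highly SS"
    if p <= 0.05:
        return "SS"
    return "not SS"

for p in (0.0005, 0.001, 0.004, 0.05, 0.051):
    print(f"P = {p}: {classify_p_value(p)}")
```

Note that the boundaries are inclusive, so $P = 0.05$ counts as SS while $P = 0.051$ does not.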

Interpreting Study Results (General Tips)

Compare between groups rather than within groups when possible.
Report actual numbers alongside p-values.
p-values are qualifiers attached to the result, not stand-alone findings.
Avoid phrases like “trending toward significance”; $P = 0.051$ is not significant.
Use “evidence supports,” not “proves.”

Study #1 – Rosmarinic Acid & Seasonal Allergic Rhinoconjunctivitis

Design: Randomized, placebo-controlled; doses: 50 mg & 200 mg.
Symptom-relief figure is presented without p-values; impressions of the result may shift once the p-values and the small sample size ($n \approx 10$ per group) are revealed.
Inflammatory cells (Nasal lavage – Table 3)
- Polymorphonuclear leukocytes at Day 3: 50 mg vs placebo $P = 0.006$ (SS); 200 mg vs placebo $P = 0.010$ (SS).
- Many Day 21 values had $P > 0.05$ → not SS.
Take-away:
- Some SS changes, but small samples & variable results → unclear CS.
- The decision to supplement could change if the therapy carried side effects (in contrast to a benign herb).

Study #2 – Potassium/Magnesium Citrate & Blood Pressure

Subjects: Pre-hypertensive/hypertensive; $n = 30$ per arm; 4 treatments incl. placebo.
Central & radial BP, pulse wave velocity → Mixed-model p-values all > 0.05 except:
- Night-time SBP $P = 0.03$, DBP $P = 0.05$, heart rate $P = 0.04$ (statistically significant).
Magnitude: ~5 mmHg drop (night only) – may not meet clinically relevant thresholds.
Conclusion: SS ≠ CS; modest, possibly negligible benefit.

Study #3 – Cinnamon for Primary Dysmenorrhea

Randomized, double-blind, placebo controlled (40 per group → 1 g cinnamon TID × 3 days).
Baseline demographics balanced (P > 0.05 desired).
Outcomes via VAS (0–10 pain scale)
- Cycle 1: ΔPain = $-2.1 \pm 0.2$ vs $-0.8 \pm 0.3$ (placebo); $P = 0.001$.
- Cycle 2: ΔPain = $-2.5 \pm 0.3$ vs $-0.9 \pm 0.3$ (placebo); $P = 0.002$.
CS: ~2-point reduction on 10-point VAS may be meaningful.
Authors discuss sizeable placebo response; suggest longer follow-up to mitigate context effects.
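
Because the placebo arm also improved, it can help to subtract the placebo change from the cinnamon change before judging CS. A minimal sketch using only the mean changes listed above (the ± values are ignored here, and the helper name is illustrative):

```python
# Mean change in VAS pain score from the notes: (cinnamon, placebo).
cycles = {
    "Cycle 1": (-2.1, -0.8),
    "Cycle 2": (-2.5, -0.9),
}

def placebo_adjusted_change(treatment_change, placebo_change):
    """Change attributable to treatment beyond the placebo response."""
    return round(treatment_change - placebo_change, 1)

adjusted = {
    cycle: placebo_adjusted_change(treatment, placebo)
    for cycle, (treatment, placebo) in cycles.items()
}

for cycle, value in adjusted.items():
    print(f"{cycle}: placebo-adjusted change = {value} VAS points")
```

The placebo-adjusted reductions (about 1.3 and 1.6 points) are smaller than the raw ~2-point drop, which is worth keeping in mind when weighing clinical significance.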

Additional Abstract Snapshots

Mouthwash with Essential Oils + Curcumin
- Triple-blind; 3 groups (15 each). P < 0.05 for periodontal & RA markers; several P < 0.001 (highly SS).
- Largest improvements varied by parameter (e.g., plaque vs pocket depth).
Modified Alternate-Day Fasting vs Calorie Restriction (Metabolic Syndrome)
- $n = 70$; 8 weeks.
- Significant superiority of MADF for weight, waist, SBP, fasting glucose (p-values: 0.003 – 0.029).
- No difference in lipids, insulin resistance.
- CS depends on magnitude (not provided here).

Practice Question Highlights (AMA & Study-Design)

Correct AMA listing (≥ 7 authors): “Terker AS, Zhang C, McCormick JA, et al.”
P-value interpretation question: results significant ($P < 0.05$) but not highly significant ($P > 0.001$) → BMI, HDL, waist circumference.
Placebo most commonly appears in cross-over randomized trials.
Weed with anxiety benefit & LDL harm → begin with animal studies (safety/efficacy) before human.
Randomization definition: method to allocate participants to control vs intervention arms (reduces selection bias).

Summary Cheat-Sheet

Not SS: $P > 0.05$
SS: $0.05 \ge P > 0.001$
Highly SS: $P \le 0.001$
Baseline characteristics ideally not SS.
Always assess both statistical and clinical significance.
Large $N$ can inflate SS; small $N$ can mask real effects.
Placebo effects are real and must be considered in interpretation.
When reading results, look for: effect size, CI width, risk ratios, and context (patient-important outcomes).