MASAI Trial: AI-assisted Mammography Screening — Clinical Safety & Workload (Bullet Notes)

Background

  • MASAI trial evaluates AI-supported screen-reading versus standard double reading in population-based mammography screening.

Study Design and Participants

  • Type: randomised, non-inferiority, single-blinded, parallel design.
  • Population: women aged 40–80 at four Swedish screening sites; covers general screening at intervals of 1.5–2 years and annual screening for women with moderate hereditary risk or a history of breast cancer.
  • Randomisation: 1:1 allocation by automated pseudo-randomisation via the PACS; participants and radiographers were masked to allocation, but the reading radiologists were not.
  • Intervention: AI triage and detection support with Transpara v1.7.0. Exams with risk scores 1–9 went to single reading and exams with risk score 10 to double reading; CAD marks were shown for scores 8–10; radiologists had the AI risk scores and CAD marks available at reading time.
  • Control: standard unassisted double reading.
  • Outcomes: prespecified safety analysis focusing on early screening performance (cancer detection rate, recall rate, false positive rate, PPV of recall, cancer type) and screen-reading workload; primary endpoint is interval cancer rate to be assessed after 2 years in a full 100,000-participant cohort.
  • Analysis populations: modified intention-to-treat (mITT); exclusions include technical recalls and non-suspicious lymphoma cases.
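
The triage rule in the Interventions bullet can be sketched as a small lookup. This is a minimal illustration, not the trial's or Transpara's software: the names `ReadingPlan` and `plan_reading` are hypothetical, and the only logic encoded is what the notes state (scores 1–9 single reading, score 10 double reading, CAD marks for scores 8–10).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadingPlan:
    readers: int      # number of radiologist reads (1 = single, 2 = double)
    cad_marks: bool   # whether CAD marks are displayed to the reader


def plan_reading(risk_score: int) -> ReadingPlan:
    """Map a 1-10 exam risk score to the reading strategy described in the notes."""
    if not 1 <= risk_score <= 10:
        raise ValueError(f"risk score must be in 1..10, got {risk_score}")
    readers = 2 if risk_score == 10 else 1   # score 10 -> double reading
    cad_marks = risk_score >= 8              # CAD marks for scores 8-10
    return ReadingPlan(readers=readers, cad_marks=cad_marks)


if __name__ == "__main__":
    for score in (3, 8, 10):
        print(score, plan_reading(score))
```

Note that a score of 8 or 9 still goes to a single reader; only the CAD marks are added at that level.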

Key Safety Findings (Clinical Safety Analysis)

  • Enrolment and analysis: 80,033 randomised (AI: 40,003; control: 40,030); 39,996 AI and 40,024 control participants included in the clinical safety analysis; 13 excluded after randomisation.
  • Cancer detection rate: AI $6.1\times10^{-3}$ per participant; control $5.1\times10^{-3}$ per participant; ratio $RR = 1.2$; $p = 0.052$.
  • Recall rate: AI 0.022; control 0.020 (i.e., 2.2% vs 2.0%).
  • False positive rate: AI 0.015; control 0.015 (1.5% in both).
  • Positive predictive value (PPV) of recall: AI 0.283; control 0.248.
  • Cancer type among detections: AI invasive 0.75 of cancers; in situ 0.25; control invasive 0.81; in situ 0.19.
  • Screen-reading workload: AI readings 46,345 vs control 83,231; reduction 44.3%; total of 36,886 fewer readings.
  • Interpretation: AI-supported screening detected numerically more cancers (ratio 1.2, p = 0.052) with a similar false-positive burden and a substantially reduced workload; the primary endpoint, interval cancer rate, is to be evaluated in the full cohort.
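
The headline figures above can be re-derived from the counts given in these notes. The sketch below is only an arithmetic check: the variable names are illustrative, and the rate ratio is computed from the rounded detection rates rather than from raw cancer counts, which the notes do not report.

```python
# Counts and rates copied from the bullets above.
randomised = 80_033
analysed_ai, analysed_control = 39_996, 40_024
ai_readings, control_readings = 46_345, 83_231
ai_cdr, control_cdr = 6.1e-3, 5.1e-3   # rounded cancer detection rates

excluded = randomised - (analysed_ai + analysed_control)
fewer_readings = control_readings - ai_readings
reduction_pct = 100 * fewer_readings / control_readings
rate_ratio = ai_cdr / control_cdr      # approximate: inputs are rounded

print(f"excluded after randomisation: {excluded}")       # 13
print(f"fewer screen readings: {fewer_readings}")        # 36886
print(f"workload reduction: {reduction_pct:.1f}%")       # 44.3%
print(f"detection rate ratio: {rate_ratio:.2f}")         # 1.20
```

All four values agree with the reported figures at the stated precision.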

Post-hoc and risk-score–related findings

  • High-risk group (risk score 10): cancer detection rate 72.3 per 1,000; cancer was detected in 208 of 2,875 participants in this subgroup.
  • Extra-high-risk group (top 1%, marked as 10H): recalls 189; cancers detected 136; PPV of recall 72.0%; detection rate 277.6 per 1,000 in this category; this group contained 55.7% of all screen-detected cancers.
  • Single-reading with AI (risk scores 1–9): 92.0% of exams; cancers detected by single-reading with AI: 36 (14.8% of all cancers detected).
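
The subgroup rates above follow directly from the counts in these bullets. The sketch below recomputes them; the helpers `per_1000` and `pct` are illustrative names. The size of the 10H group is not stated in the notes, so the last value is only a figure implied by the reported detection rate.

```python
def per_1000(events: int, n: int) -> float:
    """Events per 1,000 participants."""
    return 1000 * events / n


def pct(part: float, whole: float) -> float:
    """Part as a percentage of whole."""
    return 100 * part / whole


# Risk score 10: 208 cancers among 2,875 participants.
cdr_score_10 = per_1000(208, 2875)

# 10H group: 136 cancers among 189 recalls.
ppv_10h = pct(136, 189)

# Derived, not reported: group size implied by 136 cancers at 277.6 per 1,000.
implied_10h_size = 136 / (277.6 / 1000)

print(f"score 10 detection rate: {cdr_score_10:.1f} per 1,000")
print(f"10H PPV of recall: {ppv_10h:.1f}%")
print(f"implied 10H group size: about {implied_10h_size:.0f}")
```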

Strengths and Limitations

  • Strengths: real-world screening setting; minimal exclusions; large sample; integration with national programme.
  • Limitations: single-centre study; one mammography device and one AI system; generalisability may be limited by radiologist experience; possible automation bias; the true false-positive rate depends on follow-up for interval cancers.

Implications for Practice

  • AI-assisted triage plus detection support can be considered safe in this analysis and substantially reduces screen-reading workload, with a similar recall rate (2.2% vs 2.0%) and an identical false-positive rate (1.5%).
  • Primary efficacy endpoint (interval cancer rate) pending in full 100,000-participant cohort with 2-year follow-up.
  • Future implementation requires careful strategy (which readings use AI data, thresholds, and monitoring of algorithm performance).

Conclusion

  • AI-supported mammography screening demonstrated a similar cancer detection rate to standard double reading with a large reduction in screen-reading workload, supporting potential clinical implementation while awaiting full interval-cancer-rate data.