MASAI Trial: AI-assisted Mammography Screening — Clinical Safety & Workload (Bullet Notes)
Background
- MASAI trial evaluates AI-supported screen-reading versus standard double reading in population-based mammography screening.
Study Design and Participants
- Type: randomised, non-inferiority, single-blinded, parallel design.
- Population: women aged 40–80 at four Swedish screening sites; includes general screening intervals of 1.5–2 years and annual screening for moderate hereditary risk or history of breast cancer.
- Randomisation: 1:1 allocation by automated pseudo-randomisation within PACS; participants and radiographers were masked to allocation; the reading radiologists were not.
- Intervention: AI triage and detection support with Transpara v1.7.0. Exams with risk scores 1–9 go to single reading and exams with score 10 go to double reading; CAD marks are shown for scores 8–10; radiologists see the AI risk score and CAD marks at reading time.
- Control: standard unassisted double reading.
- Outcomes: prespecified safety analysis focusing on early screening performance (cancer detection rate, recall rate, false positive rate, PPV of recall, cancer type) and screen-reading workload; primary endpoint is interval cancer rate to be assessed after 2 years in a full 100,000-participant cohort.
- Analysis populations: modified intention-to-treat (mITT); exclusions include technical recalls and non-suspicious lymphoma cases.
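The triage rule in the intervention bullet can be written as a small lookup. This is only an illustrative sketch of the rule as stated in these notes, not the trial's or the vendor's software; the names `ReadingPlan` and `plan_reading` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadingPlan:
    """Hypothetical container for how one exam is read in the AI arm."""
    readers: int          # number of radiologists reading the exam
    show_cad_marks: bool  # whether CAD marks are displayed to the reader


def plan_reading(risk_score: int) -> ReadingPlan:
    """Map an AI risk score (1-10) to a reading plan, per the rule in the notes."""
    if not 1 <= risk_score <= 10:
        raise ValueError(f"risk score must be between 1 and 10, got {risk_score}")
    readers = 2 if risk_score == 10 else 1  # score 10: double reading; 1-9: single reading
    show_cad_marks = risk_score >= 8        # CAD marks shown for scores 8-10
    return ReadingPlan(readers=readers, show_cad_marks=show_cad_marks)


if __name__ == "__main__":
    for score in (3, 8, 10):
        print(score, plan_reading(score))
```

Under this rule, scores 8 and 9 are the only single-read exams that carry CAD marks.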
Key Safety Findings (Clinical Safety Analysis)
- Enrolment and analysis: 80,033 randomised (AI: 40,003; control: 40,030); 39,996 AI and 40,024 control included in the clinical safety analysis; 13 excluded.
- Cancer detection rate: AI 6.1 per 1000 participants; control 5.1 per 1000; rate ratio 1.2; p=0.052.
- Recall rate: AI 2.2%; control 2.0%.
- False positive rate: AI 1.5%; control 1.5%.
- Positive predictive value (PPV) of recall: AI 28.3%; control 24.8%.
- Cancer type among detected cancers: AI 75% invasive, 25% in situ; control 81% invasive, 19% in situ.
- Screen-reading workload: 46,345 readings in the AI arm vs 83,231 in the control arm, i.e. 36,886 fewer readings (a 44.3% reduction).
- Interpretation: AI-supported screening detected numerically more cancers (p=0.052, so not significant at the 0.05 level) with the same false-positive rate and a substantially lower workload; the primary endpoint, interval cancer rate, will be evaluated in the full cohort.
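The headline figures in this section can be re-derived from the counts given in the bullets. The sketch below is a plain arithmetic check, not the trial's statistical analysis; the rate ratio is recomputed from the already-rounded rates, so it only reproduces the reported value to one decimal place. Variable names are illustrative.

```python
# Counts and rates copied from the bullets above.
randomised = {"ai": 40_003, "control": 40_030}
analysed = {"ai": 39_996, "control": 40_024}
readings = {"ai": 46_345, "control": 83_231}
cdr_per_1000 = {"ai": 6.1, "control": 5.1}  # rounded cancer detection rates

# Exclusions per arm: randomised minus analysed.
excluded = {arm: randomised[arm] - analysed[arm] for arm in randomised}
total_excluded = sum(excluded.values())

# Workload: absolute and relative reduction in screen readings.
fewer_readings = readings["control"] - readings["ai"]
reduction_pct = 100 * fewer_readings / readings["control"]

# Ratio of the (rounded) detection rates, AI vs control.
rate_ratio = cdr_per_1000["ai"] / cdr_per_1000["control"]

print(f"excluded per arm: {excluded}, total: {total_excluded}")
print(f"fewer readings: {fewer_readings} ({reduction_pct:.1f}% reduction)")
print(f"detection rate ratio: {rate_ratio:.1f}")
```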
Post-hoc and risk-score–related findings
- High-risk group (risk score 10): cancer detected in 208 of 2875 participants (72.3 per 1000).
- Extra-high-risk group (top 1%, marked as 10H): 189 recalls; 136 cancers detected; PPV of recall 72.0%; detection rate 277.6 per 1000 in this category; this group accounted for 55.7% of all screen-detected cancers in the AI arm.
- Single reading with AI (risk scores 1–9): 92.0% of exams; 36 cancers detected (14.8% of all cancers detected).
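The subgroup figures above can be cross-checked against each other. The sketch below assumes that every cancer detected in the AI arm falls either in the score-10 group or in the score 1–9 group, so the arm total is taken as 208 + 36; that total and the implied size of the 10H group are inferences from these notes, not figures quoted in them.

```python
def per_1000(events: int, participants: int) -> float:
    """Events per 1000 participants."""
    return 1000 * events / participants


# Counts copied from the bullets above.
score10_cancers, score10_participants = 208, 2875
h10_recalls, h10_cancers = 189, 136
single_read_cancers = 36

# Assumption: all AI-arm cancers are in score 10 or in scores 1-9.
total_cancers = score10_cancers + single_read_cancers

score10_rate = per_1000(score10_cancers, score10_participants)
h10_ppv_pct = 100 * h10_cancers / h10_recalls
h10_share_pct = 100 * h10_cancers / total_cancers
single_read_share_pct = 100 * single_read_cancers / total_cancers

# Group size implied by 136 cancers at 277.6 per 1000 (an inference, not a quoted figure).
h10_implied_participants = h10_cancers / (277.6 / 1000)

print(f"score 10 detection rate: {score10_rate:.1f} per 1000")
print(f"10H PPV of recall: {h10_ppv_pct:.1f}%")
print(f"10H share of cancers: {h10_share_pct:.1f}%")
print(f"single-reading share of cancers: {single_read_share_pct:.1f}%")
print(f"implied 10H group size: about {h10_implied_participants:.0f} participants")
```

With these inputs the derived rates and shares agree with the reported 72.3 per 1000, 72.0%, 55.7% and 14.8% after rounding.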
Strengths and Limitations
- Strengths: real-world screening setting; minimal exclusions; large sample; integration with national programme.
- Limitations: single-centre study; one mammography device and one AI system; radiologist experience may limit generalisability; potential automation bias considerations; true false-positive rate will depend on follow-up for interval cancers.
Implications for Practice
- AI-assisted triage plus detection can be considered safe: it substantially reduces screen-reading workload without increasing the false-positive rate (1.5% in both arms), with recall rates of 2.2% vs 2.0%.
- Primary efficacy endpoint (interval cancer rate) pending in full 100,000-participant cohort with 2-year follow-up.
- Future implementation requires careful strategy (which readings use AI data, thresholds, and monitoring of algorithm performance).
Conclusion
- AI-supported mammography screening demonstrated a similar cancer detection rate to standard double reading with a large reduction in screen-reading workload, supporting potential clinical implementation while awaiting full interval-cancer-rate data.