Comparing artificial intelligence and human experts in image reporting of breast cancer screening mammograms : a systematic review

Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/145873

Title:	Comparing artificial intelligence and human experts in image reporting of breast cancer screening mammograms : a systematic review
Authors:	Aquilina, Nadine (2025)
Keywords:	Artificial intelligence -- Medical applications -- Malta; Diagnostic imaging -- Data processing; Breast -- Cancer -- Diagnosis; Systematic reviews (Medical research)
Issue Date:	2025
Citation:	Aquilina, N. (2025). Comparing artificial intelligence and human experts in image reporting of breast cancer screening mammograms : a systematic review (Bachelor’s dissertation).
Abstract:	Purpose: This study aimed to systematically review the literature comparing the diagnostic performance of standalone Artificial Intelligence (AI) systems with that of human experts in detecting breast lesions on screening mammograms, and to evaluate the methodological quality and clinical applicability of these studies.
Methodology: A comprehensive search was conducted across PubMed, Scopus, Cochrane Central Register of Controlled Trials (EBSCO) and Hydi for studies published between 2019 and 2025. Eligible studies evaluated standalone AI in detecting breast lesions on screening mammograms of asymptomatic women, reported diagnostic outcomes such as sensitivity, specificity, recall rates and area under the curve (AUC), and compared them with those of human experts. Methodological quality was assessed using the QUADAS-2 tool, and findings were summarised qualitatively.
Results: A total of 21 studies met the inclusion criteria. AI systems demonstrated comparable or superior sensitivity to human experts in several studies (90% vs 91%), with some algorithms detecting cancers missed by radiologists. Specificity, however, was often lower in standalone AI systems (91.2% vs 96%), leading to increased false positive rates (12% vs 4.2%). Studies using enriched datasets tended to report inflated performance metrics compared to those using real-world, population-based screening cohorts. Commonly identified methodological concerns included retrospective designs, inconsistent follow-up periods, anonymised AI systems, and unequal data access between AI and human readers.
Conclusions: Standalone AI has shown promising diagnostic performance in breast lesion detection, particularly in improving sensitivity. However, methodological limitations across studies restrict the generalisability of findings and prevent firm conclusions about AI’s readiness for full clinical adoption.
Implications for Practice: Current evidence shows that standalone AI achieves diagnostic performance comparable to that of human experts. However, further research should prioritise prospective, real-world studies, transparent algorithm reporting, and balanced performance evaluation to ensure the safe and effective integration of AI into breast cancer screening workflows.
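The diagnostic outcomes named in the abstract (sensitivity, specificity, false positive rate, recall rate) all derive from the same four reader-versus-reference counts. The sketch below is only an illustration of those standard definitions, not the dissertation's method: the class `ScreeningCounts`, the function names, and the example counts are hypothetical and are not taken from any of the reviewed studies. Recall rate is assumed here to mean the share of all screened women flagged for further assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """Confusion-matrix counts for one reader (AI or human). Hypothetical helper."""
    tp: int  # cancers correctly flagged
    fp: int  # healthy women incorrectly flagged
    tn: int  # healthy women correctly cleared
    fn: int  # cancers missed


def sensitivity(c: ScreeningCounts) -> float:
    # Proportion of women with cancer who were flagged.
    return c.tp / (c.tp + c.fn)


def specificity(c: ScreeningCounts) -> float:
    # Proportion of women without cancer who were cleared.
    return c.tn / (c.tn + c.fp)


def false_positive_rate(c: ScreeningCounts) -> float:
    # Complement of specificity: healthy women incorrectly flagged.
    return c.fp / (c.fp + c.tn)


def recall_rate(c: ScreeningCounts) -> float:
    # Assumed definition: all flagged women over all screened women.
    return (c.tp + c.fp) / (c.tp + c.fp + c.tn + c.fn)


if __name__ == "__main__":
    # Illustrative counts only; NOT data from the reviewed studies.
    reader = ScreeningCounts(tp=90, fp=120, tn=880, fn=10)
    print(f"sensitivity         = {sensitivity(reader):.3f}")
    print(f"specificity         = {specificity(reader):.3f}")
    print(f"false positive rate = {false_positive_rate(reader):.3f}")
    print(f"recall rate         = {recall_rate(reader):.3f}")
```

Because false positive rate is by definition one minus specificity within a single dataset, specificity and false positive figures that do not sum to 100% would normally come from different studies or cohorts.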
Description:	B.Sc. (Hons) (Melit.)
URI:	https://www.um.edu.mt/library/oar/handle/123456789/145873
Appears in Collections:	Dissertations - FacHSc - 2025; Dissertations - FacHScRad - 2025

Files in This Item:

File	Description	Size	Format
2508HSCRAD420100016267_1.PDF (Restricted Access)		1.34 MB	Adobe PDF
