Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/138579
Title: Position paper : advocating for a structured methodology in developing data-driven predictive models for healthcare - evidence from a large-scale national study
Authors: Agius, Stephen
Cassar, Vincent
Magri, Caroline
Khan, Wasiq
Topham, Luke
Keywords: Mater Dei Hospital (Msida, Malta). Emergency Department
Artificial intelligence -- Medical applications -- Malta
Decision support systems -- Malta
Medical informatics -- Malta
Medicine -- Data processing
Triage (Medicine) -- Malta
Diagnosis -- Decision making
Issue Date: 2025
Publisher: Springer
Citation: Agius, S., Cassar, V., Magri, C., Khan, W., & Topham, L. (2025). Position paper: advocating for a structured methodology in developing data-driven predictive models for healthcare–evidence from a large-scale national study. Health and Technology. Retrieved from: https://doi.org/10.1007/s12553-025-01010-5.
Abstract: Background: Despite the growing adoption of predictive models in healthcare, the development process is often inconsistent and lacks methodological rigour. Many models are created ad hoc, without transparent handling of missing data, proper validation, or alignment with clinical workflows. These shortcomings have undermined trust, reproducibility, and generalisability, especially in high-stakes environments like emergency care. Objectives: This position paper aims to advocate for the adoption of structured, transparent, and reproducible methodologies in the development of predictive models for healthcare. Drawing on a large-scale national study of emergency department (ED) visits in Malta, the paper demonstrates that methodological discipline, guided by data science principles, clinical expertise, and an understanding of human decision-making behaviour, leads to safer, more trustworthy, and clinically relevant models. Methods: Using over 32 million data points from 650,000 ED visits across six years, the study employed a structured modelling pipeline that integrated clinical and administrative data sources. The methodology included Cognitive Task Analysis (CTA) to map triage decision-making, rigorous feature engineering based on clinical workflows, handling of missing data through informed strategies, and robust model validation using XGBoost with stratified cross-validation and calibration analysis. Importantly, domain experts were involved throughout the development lifecycle to ensure clinical relevance and interpretability. Results: The structured methodology enabled the development of predictive models that reflected the real-world complexity of ED triage, achieved strong performance, and gained clinician acceptance. The models aligned with staged clinical decision-making and were interpretable, trustworthy, and feasible to scale across healthcare environments. Through transparent documentation, robust calibration, and post-deployment monitoring protocols, the models demonstrated readiness for clinical integration. Conclusions: The study confirms that structured, domain-informed methodologies are not only feasible at scale but essential for the responsible deployment of predictive models in healthcare. This approach ensures safety, fosters trust, promotes reproducibility, and increases the likelihood that the model is used and adopted in real clinical settings. The authors call on researchers, developers, and regulators to establish such methodologies as the standard for AI and data-driven approaches in healthcare, particularly in high-stakes applications where poor model performance can lead to clinical harm.
URI: https://www.um.edu.mt/library/oar/handle/123456789/138579
Appears in Collections:Scholarly Works - FacEMAMar


