Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/76858
Title: A data analytic and machine learning approach to diabetes monitoring
Authors: Cilia, Daniel Anthony (2020)
Keywords: Diabetes
Blood sugar monitoring
Machine learning
Algorithms
Issue Date: 2020
Citation: Cilia, D.A. (2020). A data analytic and machine learning approach to diabetes monitoring (Bachelor's dissertation).
Abstract: Introduction: Invasiveness is one of the most prevalent issues affecting diabetes monitoring systems. Many recent studies investigate the use of Machine Learning (ML) algorithms to predict future blood glucose levels from historical physiologic data gathered by an array of sensors [1, 2].
Research questions: i) Could physiologic parameters, gathered from non-invasive sources, be used to improve glucose predictive accuracy? ii) Would the elimination of data gathered in an invasive manner yield clinically acceptable results?
Method: Multiple data analyses were conducted using the OhioT1DM Dataset [2]. The first phase consisted of analysing the data and generating predictions with the aim of improving predictive performance. This was achieved via feature engineering techniques and by splitting the dataset into different feature combinations. Data features were organised into the following groups: Glucose (G), Insulin Pump (P), Fitness Band (B) and Self-reported (S). Different combinations were tested on Multiple Linear Regression and XGBoost models, and results were evaluated for each input combination. In the second phase of experimentation, blood glucose level data was omitted from the input features, and the resultant predictive accuracy was evaluated. Root Mean Squared Error (RMSE), Mean Absolute Relative Difference (MARD), the R² coefficient, and Surveillance Error Grid Analysis (SEGA) were used as evaluation metrics.
Results & Evaluation: Experiments with blood glucose level data showed that introducing lags to the feature set yielded very significant accuracy improvements. In addition, both Linear Regression and XGBoost displayed much higher accuracy when the training data was halved. Feature ablation experiments produced varying results: insulin features brought about consistent accuracy gains, whereas fitness band features yielded high gains in some patients and decreases in others. Glucose omission experiments showed a substantial drop in accuracy, and SEGA showed that models trained on such data were unable to produce clinically reliable predictions for any patient.
Conclusions: i) Including glucose lags has been shown to strongly benefit accuracy. ii) Predicting future glycaemia without considering present and past glycaemia has been observed to be a very difficult feat that requires further elaboration. iii) Linear Regression models have been observed to place higher weighting on "earlier" lags as the prediction horizon increases.
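The lag features and two of the evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's code: the function names (`make_lagged_rows`, `rmse`, `mard`), the sample readings, and the persistence baseline (predicting the most recent reading) are assumptions made purely for illustration, and MARD is taken in its common form, the mean of |predicted - reference| / reference expressed as a percentage.

```python
import math


def make_lagged_rows(series, n_lags, horizon):
    """Build lag features from a series of glucose readings.

    Each feature row holds the n_lags most recent readings (oldest first);
    the matching target is the reading `horizon` steps after the last lag.
    """
    features, targets = [], []
    for t in range(n_lags - 1, len(series) - horizon):
        features.append(series[t - n_lags + 1 : t + 1])
        targets.append(series[t + horizon])
    return features, targets


def rmse(actual, predicted):
    """Root Mean Squared Error, in the same units as the readings."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mard(actual, predicted):
    """Mean Absolute Relative Difference, as a percentage of the reference."""
    relative = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return 100.0 * sum(relative) / len(relative)


# Made-up readings in mg/dL (not taken from the OhioT1DM Dataset).
glucose = [100, 110, 120, 130, 140, 150]
X, y = make_lagged_rows(glucose, n_lags=3, horizon=2)

# Hypothetical persistence baseline: predict the most recent lag unchanged.
baseline = [row[-1] for row in X]

print(X)                  # lag rows, oldest reading first
print(y)                  # readings two steps ahead of each row
print(rmse(y, baseline))
print(mard(y, baseline))
```

In a setup like the one described, rows such as `X` would be passed to a regression model in place of the persistence baseline, and omitting them corresponds to the glucose-omission experiments in the second phase.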
Description: B.Sc. IT (Hons)(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/76858
Appears in Collections: Dissertations - FacICT - 2020
Dissertations - FacICTCIS - 2020

Files in This Item:
File: 20BITSD009.pdf (Restricted Access)
Size: 4.54 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.