OAR@UM Collection: https://www.um.edu.mt/library/oar/handle/123456789/35618
Thu, 11 Jun 2026 14:13:23 GMT

Customer churn prediction for an insurance company
https://www.um.edu.mt/library/oar/handle/123456789/39751
Abstract: The objective of every company is to remain profitable and to lead its industry. This is achieved by attracting new customers and keeping existing ones. Customer churn poses different challenges to a company, depending on the industry in which the company operates. In the insurance industry, churn may mean that a customer is lost for several years. Retaining customers who have purchased a motor policy is an even more challenging issue, since the policy is renewed every year and the policyholder could easily switch to a competitor if not satisfied with the service. Moreover, by Maltese law, a Third Party Only (TPO) policy is the minimum obligatory cover for every vehicle, and thus competition in this industry is quite high. The objective of this study was to implement a model to predict those policyholders at risk of switching to a competitor and to determine when this event is most likely to occur. This analysis was applied to the insurance industry, though the approach could be used for any other industry. Various data mining techniques, namely Decision Trees, Logistic Regression, Naive Bayes, Neural Networks, Random Forest and Support Vector Machine (SVM), were used to predict those customers who are likely to terminate their policies. Random Forest turned out to be the best model for forecasting customer behaviour.
Support Vector Machines and Decision Trees also proved to be powerful techniques for predicting customer churn: they not only reached sufficient accuracy but also required less computational effort to train than the other techniques. In addition to using these data mining techniques to predict whether a customer will renew the policy, this research used survival analysis to model the time until churn and to establish whether certain characteristics lead to churn more than others. It was concluded that approximately 90% of policyholders survive the first five years, while the majority of policyholders do not terminate the policy before the expiry date. In addition, it was established that the number of other motor policies held is associated with a decreased risk of churn, while TPO cover is associated with an increased risk of churn. Customers identified as being at high risk of leaving the company could thus be targeted in marketing campaigns aimed at reducing the rate of churn and, as a result, increasing profitability.
Description: M.SC.ARTIFICIAL INTELLIGENCE
Mon, 01 Jan 2018 00:00:00 GMT

Financial time series forecasting : from machine learning to deep learning
https://www.um.edu.mt/library/oar/handle/123456789/39749
Abstract: In recent years, technological advances have had an immense effect on the way a vast number of industries operate. The financial industry is one such industry: almost all financial activity, be it accounting, investing, auditing, or financial modelling, is channelled through some form of technological medium.
A vast range of systems has been developed, using an array of financial standards and expert input, enabling users with a financial background to perform their day-to-day tasks in an orderly and efficient manner. Whilst these systems are necessary, there exists another spectrum in the technology industry where, rather than developing systems with in-built specific rules, the machine itself learns from past experiences and takes actions accordingly. This branch of science is known as artificial intelligence or, more specifically, machine learning. In this thesis, we focus on the use of machine learning to forecast financial time series, in particular those relating to the stock market. The field of financial time series forecasting has been explored through various statistical and machine learning techniques, some of which achieved promising results. There exists an assumption that those who achieve the best results do not publish their findings, in order to stay ahead of other traders, which makes this field of study all the more challenging. A number of industry-leading classifiers and regressors are implemented, after which we approach this task using a novel branch of neural-network-based algorithms known as deep learning. Deep Learning is a new branch of Machine Learning, introduced with the objective of moving Machine Learning closer to one of its original goals: Artificial Intelligence. These techniques are known to excel in tasks such as image and text recognition, but have not been exploited as much in the field of finance. Through experimentation, we achieve a number of notable results, the best of which are an accuracy of 81% for long-term trend direction forecasting and an RMSE of 0.012 for next-day price forecasting, both obtained using so-called traditional Machine Learning methods. The Deep Learning methods fail to reach the levels of accuracy achieved by Logistic Regression and Support Vector Machines.
We consider this drop in performance to be mainly due to the complexity of the deep architecture setup, whereas the task at hand may favour a simpler model. Whilst one may think that a complex problem such as stock market prediction should favour a complex model, in reality the almost random nature of the fluctuations may in fact favour a more generalised model with fewer layers and less complexity.
Description: M.SC.ARTIFICIAL INTELLIGENCE
Mon, 01 Jan 2018 00:00:00 GMT

Evaluating deep learning and machine learning techniques to predict customer churn within a local retail industry
https://www.um.edu.mt/library/oar/handle/123456789/39747
Abstract: A top priority in any business is the constant need to increase revenue and profitability. Within the retail industry, the main source of revenue is customer purchases. For this reason, companies need to focus on customer retention. When a customer leaves, or churns from, a business, the opportunity for potential sales or cross-selling is lost. When a customer leaves the business without any form of explanation or notice, the company may find it hard to respond and take corrective action. Ideally, companies should be proactive and identify potential churners before they leave. Customer retention has been noted to be less costly than attracting new customers. Therefore, identifying individuals who are likely to churn is of great benefit to the company. Through data available within the Point of Sales (POS), customer transactions may be extracted and buying patterns may be identified. This project demonstrates how features can be created from transactional data and identified as significant in predicting churn. By predicting churn, companies may adopt a proactive approach to retaining customers.
The data provided within this project pertains to a local supermarket. Therefore, the results attained through the various models are based on real data. The novelty of this dissertation lies in implementing Deep Learning algorithms and comparing them to Machine Learning techniques. Convolutional Neural Networks, Deep Neural Networks and Restricted Boltzmann Machines are the selected Deep Learning techniques, whilst Random Forest and Logistic Regression are implemented as Machine Learning algorithms. Furthermore, various datasets are designed to evaluate how these algorithms perform depending on the features designed. The overall accuracy results obtained are: Random Forest 94%, Restricted Boltzmann Machine 83%, Logistic Regression 77% and Convolutional Neural Network 74%. The results are satisfactory and may assist the supermarket in retaining customers.
Description: M.SC.ARTIFICIAL INTELLIGENCE
Mon, 01 Jan 2018 00:00:00 GMT

Predictive analysis of football matches using in-play data
https://www.um.edu.mt/library/oar/handle/123456789/39745
Abstract: Sports betting has emerged as a booming industry driven by the popularity of betting on different scenarios within sporting events. Football is one of the most popular sports, followed by millions of fans around the world. Its dynamic nature, low-scoring matches and other complex variables that could influence the outcome of a game make the result of a match hard to predict. In recent years, more detailed in-game statistics have been collected and analysed by professionals of the game.
The aim of this study is to investigate the application of machine learning techniques for predicting the full-time result (Home Win/Draw/Away Win) of football matches at the half-time interval using in-play data. We collect and analyse a rich data set of temporal data from seven seasons of five major European leagues between 2009 and 2016. We focus our research on the application of random forest as the main machine learning technique for this problem. We build a genetic algorithm to perform feature selection and hyper-parameter tuning to investigate whether the initial results could be further improved. Finally, we contextualise the data set with pre-match data and analyse how this changes the results and the predictors selected. We find that after feature selection and model tuning, the random forest has a mean accuracy of 45.0% (±1.6) on unseen data across the different leagues. With the addition of pre-match data, the mean accuracy increased to 46.0% (±2.1), but the results for each league remained similar. We evaluate different models on an unseen data set from the 2016/17 season. The tuned random forest using both pre-match and in-game data achieves a mean accuracy of 44.8% across the leagues. The highest accuracy was 50.0%, on the test sample of the English Premier League. The lowest was 40.0%, on the French and Spanish leagues. We also converted the random forest classification to a probabilistic prediction based on the output of the underlying decision trees. We compare these probabilities to implied odds from a betting exchange (Betfair) on a small sample of matches from the unseen data of the English and Italian leagues. We used the Brier Score function to calculate the accuracy of the predictions. Results show that the accuracy is similar for the English Premier League and Italian Serie A for both the Random Forest and Betfair.
This comparable performance may indicate that the Machine Learning predictions are similar to those of the betting exchange markets.
Description: M.SC.ARTIFICIAL INTELLIGENCE
Mon, 01 Jan 2018 00:00:00 GMT
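
The comparison described in the football abstract (tree outputs turned into probabilities, set against implied odds and scored with the Brier Score) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function names, the use of tree vote shares as probabilities, the decimal odds format, the normalisation of implied probabilities so that they sum to 1, and the multi-class form of the Brier Score are all assumptions made for illustration.

```python
from typing import Sequence

# Outcome order used throughout this sketch (an assumption, not from the thesis).
OUTCOMES = ("home", "draw", "away")


def vote_probabilities(tree_votes: Sequence[str]) -> list[float]:
    """Share of trees voting for each outcome, in OUTCOMES order."""
    total = len(tree_votes)
    return [sum(1 for v in tree_votes if v == o) / total for o in OUTCOMES]


def implied_probabilities(decimal_odds: Sequence[float]) -> list[float]:
    """Invert decimal odds, then normalise so the probabilities sum to 1."""
    raw = [1.0 / o for o in decimal_odds]
    total = sum(raw)
    return [r / total for r in raw]


def brier_score(forecasts: Sequence[Sequence[float]],
                outcomes: Sequence[str]) -> float:
    """Mean over matches of the squared error summed across the three outcomes.

    Lower is better; 0.0 means every forecast put probability 1 on the
    outcome that actually occurred.
    """
    total = 0.0
    for probs, actual in zip(forecasts, outcomes):
        for p, label in zip(probs, OUTCOMES):
            target = 1.0 if label == actual else 0.0
            total += (p - target) ** 2
    return total / len(forecasts)


if __name__ == "__main__":
    # Hypothetical single match: four trees vote, the exchange quotes 2.0/4.0/4.0.
    model = vote_probabilities(["home", "home", "draw", "away"])
    market = implied_probabilities([2.0, 4.0, 4.0])
    print(brier_score([model], ["home"]), brier_score([market], ["home"]))
```

Scoring both forecast sources on the same set of matches gives two directly comparable numbers, which is the kind of comparison the abstract reports for the English and Italian leagues.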