Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/135523
| Title: | Artificial intelligence for team sports |
| Authors: | Saliba, Darren (2024) |
| Keywords: | Sports -- Technological innovations; Machine learning; Sports -- Statistical methods; Data sets; Artificial intelligence; Soccer; Algorithms |
| Issue Date: | 2024 |
| Citation: | Saliba, D. (2024). Artificial intelligence for team sports (Master’s dissertation). |
| Abstract: | Football is one of the world’s most popular sports, with a massive fan base and a yearly revenue of billions of euros. Accurately predicting the outcomes of football matches has therefore become a crucial task within the field of sports. Predicting the outcome of a football match has always been challenging, not only for fans but also for experts such as bookmakers. Multiple factors can significantly influence the result, including a team’s form throughout a season, weather conditions, and playing style. In this dissertation, we aim to provide a comprehensive overview of the different methods used to predict football match outcomes with machine learning algorithms applied to historical data. Machine learning models have proven to be highly effective in predicting the outcome of football matches since they take into account a wide range of factors. Furthermore, these models use historical data to uncover patterns and trends that can subsequently be used to make predictions. The goal of this dissertation is to predict the full-time result of a football match. A prediction can be classified into three possible outcomes: win, draw, or loss. The first step in predicting the outcome of a match is to collect and preprocess the data. The data collected focuses on the English Premier League, which is widely recognised as one of the most popular leagues in the world. The data is sourced from Football-Data, an open-source platform. In total, four machine learning algorithms are employed: Logistic Regression, Random Forest, Extreme Gradient Boosting, and Support Vector Machine. These algorithms are trained using an 80:20 split. Initially, a baseline model is defined, employing manual feature selection and default parameters. The baseline models achieved accuracies between 49.5% and 55.5%, with the Logistic Regression model performing best. Then, we conducted an optimisation procedure to fine-tune the parameters of these models. This resulted in a 55% accuracy for the Support Vector Machine model. In the next experiment, we introduced feature selection and dimensionality reduction techniques, such as Forward Feature Selection and Principal Component Analysis, whilst keeping the default parameters for each model. The accuracies achieved ranged between 86% and 90%, with the Random Forest model being the top performer. A further experiment combined these techniques with an exhaustive grid search to identify the optimal parameters for each model. The Extreme Gradient Boosting model achieved the best accuracy of 94%. Besides accuracy, other evaluation metrics are considered to gain a more detailed understanding of the predictive performance of each model. We concluded that implementing appropriate techniques and selecting optimal parameters can significantly enhance predictive power. |
| Description: | M.Sc.(Melit.) |
| URI: | https://www.um.edu.mt/library/oar/handle/123456789/135523 |
| Appears in Collections: | Dissertations - FacICT - 2024; Dissertations - FacICTAI - 2024 |
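The evaluation setup described in the abstract (three-class full-time result, 80:20 split, accuracy plus further metrics) can be illustrated with a minimal sketch. This is not the dissertation's code: the scorelines are synthetic, the split is taken in row order (the abstract does not say whether the split was random or chronological), and the majority-class predictor is a hypothetical reference point rather than one of the four models studied. All function names are illustrative.

```python
import random
from collections import Counter


def full_time_result(home_goals, away_goals):
    """Label a scoreline as 'H' (home win), 'D' (draw) or 'A' (away win)."""
    if home_goals > away_goals:
        return "H"
    if home_goals < away_goals:
        return "A"
    return "D"


def split_80_20(rows):
    """Return (first 80% of rows, last 20% of rows), preserving order."""
    cut = len(rows) * 8 // 10
    return rows[:cut], rows[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall for each class that occurs in y_true."""
    hits, totals = Counter(), Counter(y_true)
    for t, p in zip(y_true, y_pred):
        if t == p:
            hits[t] += 1
    return {label: hits[label] / totals[label] for label in totals}


# Synthetic scorelines (assumption: stand-in data, not real league results).
rng = random.Random(0)
scorelines = [(rng.randint(0, 4), rng.randint(0, 3)) for _ in range(380)]
labels = [full_time_result(h, a) for h, a in scorelines]

train_labels, test_labels = split_80_20(labels)

# Reference predictor: always guess the most frequent training outcome.
majority = Counter(train_labels).most_common(1)[0][0]
predictions = [majority] * len(test_labels)

baseline_accuracy = accuracy(test_labels, predictions)
recall = per_class_recall(test_labels, predictions)
print(f"majority class: {majority}")
print(f"accuracy: {baseline_accuracy:.3f}")
print(f"recall per class: {recall}")
```

The per-class recall shows why accuracy alone can mislead in a three-outcome problem: a predictor that always guesses one outcome has zero recall on the other two, whatever its accuracy.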
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| 2519ICTICS520005072487_1.PDF | | 1.25 MB | Adobe PDF | View/Open |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.
