Identifying optimal investment strategies with deep reinforcement learning

Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/141020

Title:	Identifying optimal investment strategies with deep reinforcement learning
Authors:	Gauci, Mia (2025)
Keywords:	Reinforcement learning; Deep learning (Machine learning); Investments -- Data processing; Neural networks (Computer science)
Issue Date:	2025
Citation:	Gauci, M. (2025). Identifying optimal investment strategies with deep reinforcement learning (Bachelor’s dissertation).
Abstract:	The rise of fully automated trading systems has transformed global financial markets, placing greater emphasis on intelligent, data-driven decision making. This thesis explores the development of optimal investment strategies using Deep Reinforcement Learning (DRL), with a particular focus on the Proximal Policy Optimisation (PPO) algorithm. Historical closing price data from a diversified portfolio of seven technology stocks was collected, processed and combined with market indicators to form the model inputs. A supervised learning baseline was first established using a Multilayer Perceptron (MLP) to provide a performance benchmark. Subsequently, DRL agents built on three neural network architectures, namely MLPs, Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), were implemented within a custom PPO framework designed for multi-stock portfolio management. Each model was evaluated using two state representations, normalised closing prices and normalised engineered market features, enabling a comparison of model performance under varying input dimensions. The evaluation was carried out using Monte Carlo rollouts in a custom simulated trading environment on 2023 test data. The PPO agent with an MLP architecture and engineered features achieved the most stable returns, averaging a gain of 152%, while the CNN-based agent with closing-price-only input reached a maximum return of 265% but with higher volatility. These results suggest that, when the models are appropriately structured and trained, DRL agents can outperform both traditional supervised learning approaches and passive strategies in simulated markets, offering a promising foundation for further research into adaptive algorithmic portfolio optimisation.
Description:	B.Eng. (Hons)(Melit.)
URI:	https://www.um.edu.mt/library/oar/handle/123456789/141020
Appears in Collections:	Dissertations - FacEng - 2025; Dissertations - FacEngSCE - 2025

Files in This Item:

File	Description	Size	Format
Mia Gauci.PDF Restricted Access		10.38 MB	Adobe PDF
