Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/141020
Title: Identifying optimal investment strategies with deep reinforcement learning
Authors: Gauci, Mia (2025)
Keywords: Reinforcement learning
Deep learning (Machine learning)
Investments -- Data processing
Neural networks (Computer science)
Issue Date: 2025
Citation: Gauci, M. (2025). Identifying optimal investment strategies with deep reinforcement learning (Bachelor’s dissertation).
Abstract: The rise of fully automated trading systems has transformed global financial markets, placing greater emphasis on intelligent, data-driven decision-making. This thesis explores the development of optimal investment strategies using Deep Reinforcement Learning (DRL), with a particular focus on the Proximal Policy Optimisation (PPO) algorithm. Historical closing price data for a diversified portfolio of seven technology stocks was collected, processed and combined with market indicators to form the model inputs. A supervised learning baseline was first established using a Multilayer Perceptron (MLP) to provide a performance benchmark. Subsequently, DRL agents incorporating different neural network architectures, namely MLPs, Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), were implemented within a custom PPO framework designed for multi-stock portfolio management. Each model was evaluated using two state representations, normalised closing prices and normalised engineered market features, enabling a comparison of model performance under varying input dimensions. Evaluation was carried out with Monte Carlo rollouts in a custom simulated trading environment on 2023 test data. The PPO agent with an MLP architecture and engineered features achieved the most stable returns, averaging a gain of 152%, while the CNN-based agent with closing-price-only input reached a maximum return of 265% but with higher volatility. These results suggest that, when the models are appropriately structured and trained, DRL agents can outperform both traditional supervised learning approaches and passive strategies in simulated markets, offering a promising foundation for further research into adaptive algorithmic portfolio optimisation.
Description: B.Eng. (Hons)(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/141020
Appears in Collections: Dissertations - FacEng - 2025
Dissertations - FacEngSCE - 2025
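
Illustrative note on the abstract: the dissertation itself is under restricted access, so the custom PPO framework it describes is not reproduced here. As a rough orientation only, the sketch below shows the standard clipped surrogate objective that PPO is generally built around. The function name `ppo_clip_objective`, its arguments, and the clipping value of 0.2 are assumptions chosen for illustration and are not taken from the thesis.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Average PPO clipped surrogate objective over a batch of samples.

    Hypothetical helper for illustration; clip_eps=0.2 is an assumed,
    commonly used default, not a value reported in the dissertation.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

In training, this quantity is maximised with respect to the policy parameters; clipping removes the incentive to move the policy far from the one that collected the data, which is one plausible reason a PPO-based agent can train stably on noisy market data.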

Files in This Item:
File: Mia Gauci.PDF (Restricted Access)
Size: 10.38 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.