Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/141862

| Title: | Language model guided reinforcement learning in quantitative trading : LLM-guided intelligence bridging strategy and safety in trading |
| Authors: | Darmanin, Adam (2025) |
| Keywords: | Reinforcement learning -- Malta; Machine learning; Natural language generation (Computer science) -- Malta; Artificial intelligence -- Malta; Finance -- Malta |
| Issue Date: | 2025 |
| Citation: | Darmanin, A. (2025). Language model guided reinforcement learning in quantitative trading: LLM-guided intelligence bridging strategy and safety in trading (Master's dissertation). |
| Abstract: | This research explores the application of LLMs in guiding Reinforcement Learning (RL) algorithms to address key challenges in algorithmic trading. While RL agents are effective at optimizing actions based on reward signals, they often exhibit myopic behavior, lacking the strategic foresight and economic intuition needed to operate in a complex environment such as the financial markets. To address this limitation, LLMs are introduced as strategic planners capable of synthesizing high-level trading strategies from heterogeneous sources, including market data, macroeconomic indicators, and news sentiment. By informing the policy layer of RL agents, LLMs enable the generation of trading strategies that are both risk-aware and sensitive to the prevailing market conditions. Unlike traditional approaches that retrain or control the RL agent per scenario, the LLM acts as a guidance mechanism that adapts its outputs to align with predefined high-risk and low-risk investor profiles, thereby enabling the same underlying RL agent to operate effectively across distinct risk preferences. We propose a novel framework to evaluate the potential of LLM guidance for RL agents. The research has two objectives: (i) to determine whether LLMs can reliably produce market-aware strategies that meet the standards of professional trading systems, as validated through Human-in-the-Loop (HITL) expert surveys; and (ii) to assess whether LLMs can guide a single RL agent to improve its trading performance, measured with the Sharpe Ratio (SR) in a high-risk setting, and enhance its risk management, measured through the Maximum Drawdown (MDD) in a low-risk setting, without retraining or modifying the agent itself. To achieve these objectives, LLMs generate high-level strategies using prompt templates iteratively refined through expert-provided trades and validated through back-testing against the SR metric. The performance of the LLM-guided Deep Reinforcement Learning (DRL) agent is benchmarked against a traditional RL model using a subset of securities within the technology sector. The final RL agent’s effectiveness is assessed using standard portfolio metrics, including SR and MDD. Empirical results demonstrate that LLM-enhanced RL agents can achieve superior SR and MDD relative to their benchmarks, with greater safety than black-box deep learning counterparts. This research contributes to the fields of algorithmic trading and Artificial Intelligence (AI) by demonstrating the synergistic application of LLMs to RL, highlighting their ability to adapt a single learning agent to diverse investor risk profiles and enhance performance and safety in complex decision-making environments. |
| Description: | M.Sc.(Melit.) |
| URI: | https://www.um.edu.mt/library/oar/handle/123456789/141862 |
| Appears in Collections: | Dissertations - FacICT - 2025; Dissertations - FacICTAI - 2025 |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| 2519ICTICS520005013774_1.PDF | | 5.28 MB | Adobe PDF | View/Open |
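The abstract describes the LLM as a guidance mechanism that steers a single, frozen RL agent toward either a high-risk or a low-risk investor profile without retraining it. The dissertation's actual mechanism is not detailed in this record; the following is only a minimal, hypothetical sketch of one way such policy-layer guidance could be wired in, with `llm_strategy_bias` standing in for a real LLM call and all numbers purely illustrative.

```python
import numpy as np

ACTIONS = ["sell", "hold", "buy"]

def llm_strategy_bias(risk_profile: str) -> np.ndarray:
    """Placeholder for an LLM call: returns a preference over actions for the
    given investor profile. In the dissertation this would be derived from
    market data, macro indicators and news sentiment; here it is a fixed,
    illustrative mapping."""
    if risk_profile == "low_risk":
        return np.array([0.2, 0.6, 0.2])   # favour holding / capital preservation
    return np.array([0.1, 0.2, 0.7])       # high_risk: favour taking positions

def guided_action(policy_probs: np.ndarray, risk_profile: str, weight: float = 0.5) -> str:
    """Blend the frozen RL policy's action distribution with the LLM bias,
    so the same agent behaves differently per risk profile without retraining."""
    blended = (1 - weight) * policy_probs + weight * llm_strategy_bias(risk_profile)
    blended /= blended.sum()
    return ACTIONS[int(np.argmax(blended))]

# Example: an RL policy that mildly prefers "buy", guided toward each profile.
rl_policy_output = np.array([0.25, 0.30, 0.45])
print(guided_action(rl_policy_output, "low_risk"))   # -> "hold"
print(guided_action(rl_policy_output, "high_risk"))  # -> "buy"
```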
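The abstract also evaluates trading performance with the Sharpe Ratio (SR) and risk management with the Maximum Drawdown (MDD). As a rough illustration of how these standard portfolio metrics are typically computed from a series of periodic returns (not code from the dissertation), a minimal sketch:

```python
import numpy as np

def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualised Sharpe Ratio of a series of periodic returns."""
    excess = np.asarray(returns) - risk_free_rate / periods_per_year
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

def max_drawdown(returns):
    """Maximum Drawdown: worst peak-to-trough decline of the equity curve."""
    equity = np.cumprod(1.0 + np.asarray(returns))
    running_peak = np.maximum.accumulate(equity)
    drawdowns = equity / running_peak - 1.0
    return drawdowns.min()  # most negative value, e.g. -0.25 for a 25% drawdown

# Example with hypothetical daily returns over one trading year.
daily_returns = np.random.normal(0.0005, 0.01, 252)
print(f"SR:  {sharpe_ratio(daily_returns):.2f}")
print(f"MDD: {max_drawdown(daily_returns):.2%}")
```

A higher SR indicates better risk-adjusted return, while MDD (reported here as a negative fraction) captures the worst peak-to-trough loss; per the abstract, SR is used to judge the high-risk setting and MDD the low-risk setting.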
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.
