Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/138993

| Title: | Combining off-policy and on-policy reinforcement learning for dynamic control of nonlinear systems |
| Authors: | Ahmed, Hani Hazza A.; Fabri, Simon G.; Bugeja, Marvin K.; Camilleri, Kenneth P. |
| Keywords: | Reinforcement learning; Machine learning; Algorithms -- Mathematical models; Nonlinear systems; Python (Computer program language) |
| Issue Date: | 2025-10 |
| Publisher: | SCITEVENTS |
| Citation: | Ahmed, H. H. A., Fabri, S. G., Bugeja, M. K., & Camilleri, K. P. (2025, October). Combining off-policy and on-policy reinforcement learning for dynamic control of nonlinear systems. ICINCO 2025 - 22nd International Conference on Informatics in Control, Automation and Robotics, Marbella, Spain. 387-394. |
| Abstract: | This paper introduces QARSA, a novel reinforcement learning algorithm that combines the strengths of off-policy and on-policy methods, specifically Q-learning and SARSA, for the dynamic control of nonlinear systems. Designed to leverage the sample efficiency of off-policy learning while preserving the stability and lower variance of on-policy approaches, QARSA aims to offer a balanced and robust learning framework. The algorithm is evaluated on the CartPole-v1 simulation environment using the OpenAI Gym framework, with performance compared against standalone Q-learning and SARSA implementations. The comparison is based on three critical metrics: average reward, stability, and sample efficiency. Experimental results show that QARSA outperforms both Q-learning and SARSA, achieving higher average reward, greater stability, better sample efficiency, and more consistent learned policies. These results demonstrate QARSA’s effectiveness in environments where maximizing long-term performance while maintaining learning stability is crucial. The study provides valuable insights for the design of hybrid reinforcement learning algorithms for continuous control tasks. |
| URI: | https://www.um.edu.mt/library/oar/handle/123456789/138993 |
| Appears in Collections: | Scholarly Works - FacEngSCE |
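
The abstract contrasts the off-policy Q-learning target with the on-policy SARSA target but does not state the QARSA update rule. The sketch below shows only the two standard tabular targets, plus one hypothetical way of mixing them. The function names, the mixing weight `beta`, and the tabular `Q` array indexed by a discretized state are all assumptions for illustration; this is not the rule from the paper.

```python
import numpy as np


def q_learning_target(Q, reward, next_state, gamma, done):
    """Off-policy target: bootstrap from the greedy action in next_state."""
    bootstrap = 0.0 if done else float(np.max(Q[next_state]))
    return reward + gamma * bootstrap


def sarsa_target(Q, reward, next_state, next_action, gamma, done):
    """On-policy target: bootstrap from the action actually taken next."""
    bootstrap = 0.0 if done else float(Q[next_state, next_action])
    return reward + gamma * bootstrap


def blended_update(Q, state, action, reward, next_state, next_action,
                   alpha=0.1, gamma=0.99, beta=0.5, done=False):
    """HYPOTHETICAL hybrid step (an assumption, not the paper's QARSA rule).

    Forms a convex mix of the two targets: beta=1 reduces to Q-learning,
    beta=0 reduces to SARSA. Updates Q in place and returns it.
    """
    target = (beta * q_learning_target(Q, reward, next_state, gamma, done)
              + (1.0 - beta) * sarsa_target(Q, reward, next_state,
                                            next_action, gamma, done))
    Q[state, action] += alpha * (target - Q[state, action])
    return Q


# Tiny usage example: 2 discretized states, 2 actions.
Q = np.zeros((2, 2))
Q[1] = [1.0, 3.0]
blended_update(Q, state=0, action=0, reward=1.0,
               next_state=1, next_action=0, gamma=0.5)
```

For an environment such as CartPole-v1, whose observations are continuous, a table like this would first require discretizing the state; how the paper handles that is not stated in this record.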
Files in This Item:
| File | Description | Size | Format |
|---|---|---|---|
| Combining off policy and on policy reinforcement learning for dynamic control of nonlinear systems 2025.pdf | | 501.01 kB | Adobe PDF |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.
