Date of Submission
Spring 2024
Supervisor
Dr. Syed Ali Raza, Assistant Professor, Department of Computer Science
Committee Member 1
Dr. Tariq Mahmood, Examiner – I, Institute of Business Administration (IBA), Karachi
Committee Member 2
Dr. Sajjad Haider, Examiner – II, Institute of Business Administration (IBA), Karachi
Degree
Master of Science in Data Science
Department
Department of Computer Science
Faculty/ School
School of Mathematics and Computer Science (SMCS)
Keywords
Deep Reinforcement Learning, Neural Networks, Stock Trading, Automated Stock Trading, Finance
Abstract
The equity market is characterized by inherent volatility and unpredictability, yet it is governed by underlying patterns and structures. The most successful traders are typically those who accumulate many small wins over time, rather than those who invest in a single stock that happens to rise tremendously. In today’s fast-moving economy, however, it is difficult to commit to a 20+ year investment horizon, and investors benefit more from securing multiple wins within that period.
Through this study, our aim is to leverage algorithms based on several Deep Reinforcement Learning methods to identify the dynamics of stock movement, to optimize them by varying their parameters in the hope of improving performance, and to understand how sensitive each model’s performance is to those parameters. We aim to train our models to learn a medium-term strategy rather than one spanning an investment horizon of multiple decades.
To achieve this, we trained our models on stock data from the Dow Jones 30 index, using each stock’s closing price and a handful of technical indicators. The data was used to train our agent with several algorithms: Advantage Actor-Critic (A2C), Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Soft Actor-Critic (SAC), and Twin Delayed Deep Deterministic Policy Gradient (TD3), with training curtailed to a finite period for each episode. Our objective is to identify the episode length that yields the best results.
Through our experiments, we charted the performance of the trained models against the different parameters used to train them. We observed that clipping the investment horizon to a shorter period produced weaker results, whereas longer training episodes led to far more stable and positive outcomes.
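As a minimal sketch of the episode-length idea described above (not the thesis’s actual training code, and with all names hypothetical), a trading environment can cap each training episode at `max_steps` transitions, mirroring how the agent’s investment horizon is curtailed per episode:

```python
import random

class SingleStockEnv:
    """Toy single-stock trading environment (hypothetical, for illustration).

    Actions: -1 sell one share, 0 hold, +1 buy one share.
    Each episode is capped at `max_steps`, mirroring the curtailed
    per-episode investment horizon studied in the thesis.
    """
    def __init__(self, prices, max_steps):
        self.prices = prices
        self.max_steps = max_steps

    def reset(self):
        self.t = 0
        self.shares = 0
        self.cash = 1000.0
        return self.prices[0]

    def step(self, action):
        price = self.prices[self.t]
        if action == 1 and self.cash >= price:    # buy one share
            self.shares += 1
            self.cash -= price
        elif action == -1 and self.shares > 0:    # sell one share
            self.shares -= 1
            self.cash += price
        self.t += 1
        # Episode ends when the horizon cap or the data is exhausted.
        done = self.t >= self.max_steps or self.t >= len(self.prices) - 1
        next_price = self.prices[self.t]
        reward = self.cash + self.shares * next_price  # portfolio value
        return next_price, reward, done

# Run one episode with a random policy to show the episode-length cap.
random.seed(0)
prices = [100 + random.gauss(0, 1) for _ in range(500)]
env = SingleStockEnv(prices, max_steps=50)  # a short horizon, as one setting
obs = env.reset()
steps, done = 0, False
while not done:
    obs, reward, done = env.step(random.choice([-1, 0, 1]))
    steps += 1
print(steps)  # prints 50: the episode is cut off at max_steps transitions
```

In the actual study, the random policy above would be replaced by an agent trained with A2C, PPO, DDPG, SAC, or TD3, and `max_steps` would be the swept parameter whose effect on performance is being measured.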
Document Type
Restricted Access
Submission Type
Thesis
Recommended Citation
Khan, O. (2024). Deep Reinforcement Learning Applications in Stock Trading (Unpublished graduate thesis). Retrieved from https://ir.iba.edu.pk/etd-ms-ds/7