Date of Submission
Spring 2024
Supervisor
Dr. Ali Raza, Assistant Professor, Department of Computer Science
Co-Supervisor
Dr. Saurav Sthapit
Committee Member 1
Dr. Farrukh Hasan, Examiner – I, FAST National University
Committee Member 2
Dr. Salman Zafar, Examiner – II, Institute of Business Administration (IBA), Karachi
Degree
Master of Science in Data Science
Department
Department of Computer Science
Faculty/ School
School of Mathematics and Computer Science (SMCS)
Keywords
Reinforcement Learning, Reward Shaping, Neural Network, Policy Learning, Adversarial Attacks, Data Poisoning
Abstract
This research enhances the robustness of reward-shaping-based reinforcement learning agents against adversarial attacks by investigating a critical vulnerability in the process of reward function generation and deploying a targeted defense mechanism to mitigate this weakness. Reinforcement learning agents are increasingly deployed in critical real-world scenarios where data integrity cannot be guaranteed, making their robustness against adversarial attacks essential for reliable performance. In reward shaping, a common and efficient approach is to learn a reward function from user feedback on sample data. However, this process is vulnerable to adversarial attacks, as ensuring the integrity of the feedback is challenging. Malicious actors can intentionally provide incorrect feedback to corrupt the learned policy. Existing research lacks a comprehensive understanding of the impact of such attacks, and the current methods for reward function design are not robust against data poisoning attacks.
In this work, we first explore how an effective attack mechanism can be designed by injecting noisy data into the user feedback provided to the reinforcement learning agent. Second, we develop a defense mechanism based on the K-Nearest Neighbors (KNN) algorithm, which protects the reward function learning process from noisy data. Our experiments involved generating an oracle agent that always provides correct feedback, simulating a perfect user. Subsequently, we systematically corrupted the oracle's feedback to simulate an attack. The experiments covered scenarios both with and without the learned reward function and included varying levels of noise in the training data. The Mountain Car domain was used as a testbed.
The results demonstrated that the learned reward function significantly improved the agent's performance. However, as noise levels in the training data increased, the agent's performance degraded, highlighting the importance of data quality for efficient policy learning. Furthermore, the KNN-based defense mechanism detected noisy data with high accuracy across different noise levels, as indicated by a consistently low number of noisy data points misclassified as clean. Our findings underscore the importance of analyzing potential vulnerabilities in the reinforcement learning process and show that a straightforward technique like KNN can effectively detect and mitigate noisy data, further improving the system's robustness.
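The attack-and-defense loop described in the abstract can be illustrated with a minimal sketch: clean oracle feedback is corrupted by flipping a fraction of its labels, and a k-nearest-neighbor majority vote over the state space flags feedback whose label disagrees with its neighborhood. All names, the one-dimensional state space, and parameters (`k=5`, 10% flip rate) here are illustrative assumptions, not details taken from the thesis.

```python
import random

def oracle_label(state):
    """Illustrative perfect-user feedback: approve (+1) states right of the origin."""
    return 1 if state > 0.0 else -1

def knn_flag_noisy(states, labels, k=5):
    """Return indices whose label disagrees with the majority vote of its k nearest neighbors."""
    flagged = []
    for i, (s, y) in enumerate(zip(states, labels)):
        # k nearest points (excluding the point itself) in the 1-D state space
        neighbours = sorted(
            (j for j in range(len(states)) if j != i),
            key=lambda j: abs(states[j] - s),
        )[:k]
        vote = sum(labels[j] for j in neighbours)  # odd k, +/-1 labels: never ties
        if vote * y < 0:  # neighborhood majority disagrees with this label
            flagged.append(i)
    return flagged

random.seed(0)
states = [random.uniform(-1.0, 1.0) for _ in range(200)]
labels = [oracle_label(s) for s in states]

# adversarial data poisoning: flip 10% of the feedback labels
poisoned = random.sample(range(len(labels)), 20)
for i in poisoned:
    labels[i] = -labels[i]

flagged = knn_flag_noisy(states, labels, k=5)
caught = len(set(flagged) & set(poisoned))
print(f"flipped 20 labels, flagged {len(flagged)}, of which {caught} were truly noisy")
```

Points flipped deep inside a clean region are surrounded by correctly labeled neighbors and are reliably flagged; only flips very close to the decision boundary can evade the vote, which mirrors the abstract's observation that few noisy points are misclassified as clean.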
Document Type
Restricted Access
Submission Type
Thesis
Recommended Citation
Zia Uddin, A. (2024). Resilient Reinforcement Learning with Reward Shaping (Unpublished graduate thesis). Retrieved from https://ir.iba.edu.pk/etd-ms-ds/6