A New Reinforcement Learning Algorithm Based on Counterfactual Experience Replay

Cited by: 0
Authors
Li Menglin [1 ]
Chen Jing [1 ]
Chen Shaofei [1 ]
Gao Wei [1 ]
Affiliations
[1] Natl Univ Def Technol, Changsha 410005, Peoples R China
Keywords
Reinforcement Learning; Experience Replay Mechanism; Sampling Mechanism
DOI: Not available
Chinese Library Classification (CLC): TP [Automation Technology, Computer Technology]
Discipline Classification Code: 0812
Abstract
A new algorithm based on SARSA is proposed to avoid the overestimation problem in traditional reinforcement learning. Unlike traditional approaches to this problem, the new algorithm can alleviate overestimation without significantly increasing algorithmic complexity. At the same time, to address shortcomings of traditional SARSA such as weak active exploration and unsatisfactory convergence, the structure of experience memory replay (EMR) is creatively modified in this paper. The proposed algorithm changes the traditional experience replay structure and adds counterfactual experience; the result, called Dynamic Counterfactual Experience Replay (DCER), combines on-policy and off-policy learning. Exploration performance is increased by adding to the EMR, at sampling time, experiences whose actions differ from the action actually taken. The algorithm was applied in the Gym CartPole environment and compared with the traditional algorithm in the same environment, showing that the improvement enhances the performance of SARSA. Finally, the feasibility of the algorithm in a multi-agent reinforcement learning environment is analyzed.
Pages: 1994-2001
Page count: 8
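
The abstract does not give implementation details of DCER, so the following is only a minimal Python sketch of the idea it describes: a replay buffer that stores the real SARSA transition plus counterfactual transitions for actions the agent did not actually take, sampled for a standard SARSA update. The class name DynamicCounterfactualReplay, the cf_model callback, and the tabular q_table are illustrative assumptions, not the authors' code.

# Illustrative sketch only (assumption): the abstract does not specify how DCER
# constructs counterfactual experiences, so the cf_model callback below is a
# hypothetical placeholder for whatever mechanism the paper actually uses.
import random
from collections import deque


class DynamicCounterfactualReplay:
    """Replay buffer holding real SARSA transitions plus counterfactual ones
    generated for actions the agent did not actually take."""

    def __init__(self, capacity=10000, n_actions=2):
        self.buffer = deque(maxlen=capacity)
        self.n_actions = n_actions

    def push(self, s, a, r, s_next, a_next, done, cf_model=None):
        # Real (on-policy) transition observed from the environment.
        self.buffer.append((s, a, r, s_next, a_next, done))
        # Hypothetical counterfactual transitions for the untaken actions;
        # cf_model(s, alt_a) must return (reward, next_state, next_action, done).
        if cf_model is not None:
            for alt_a in range(self.n_actions):
                if alt_a != a:
                    self.buffer.append((s, alt_a) + cf_model(s, alt_a))

    def sample(self, batch_size):
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))


def sarsa_update(q_table, batch, alpha=0.1, gamma=0.99):
    # Standard SARSA update applied to a sampled mini-batch:
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * Q(s',a') - Q(s,a))
    for s, a, r, s_next, a_next, done in batch:
        target = r if done else r + gamma * q_table[s_next][a_next]
        q_table[s][a] += alpha * (target - q_table[s][a])

The paper's CartPole experiments imply a function approximator over continuous states rather than a table; the tabular q_table here is used only to keep the SARSA target explicit.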