A New Reinforcement Learning Algorithm Based on Counterfactual Experience Replay

Cited by: 0
Authors
Li Menglin [1 ]
Chen Jing [1 ]
Chen Shaofei [1 ]
Gao Wei [1 ]
Affiliations
[1] Natl Univ Def Technol, Changsha 410005, Peoples R China
Keywords
Reinforcement Learning; Experience Replay Mechanism; Sampling Mechanism
DOI: Not available
Chinese Library Classification (CLC): TP [Automation Technology, Computer Technology]
Discipline Classification Code: 0812
Abstract
A new algorithm based on SARSA is proposed to avoid the overestimation problem in traditional reinforcement learning. Unlike traditional approaches to this problem, the new algorithm can alleviate overestimation without significantly increasing algorithmic complexity. At the same time, to address shortcomings of traditional SARSA such as weak active exploration and unsatisfactory convergence, the structure of experience memory replay (EMR) is creatively modified in this paper. The proposed algorithm changes the traditional experience replay structure and adds counterfactual experience; the result, called Dynamic Counterfactual Experience Replay (DCER), combines on-policy and off-policy learning. Exploration performance is increased by adding to the EMR, at sampling time, experiences whose actions differ from the action actually taken. The algorithm was applied in the Gym CartPole environment and compared with the traditional algorithm in the same environment, showing that the improvement enhances the performance of SARSA. Finally, the feasibility of the algorithm in a multi-agent reinforcement learning environment is analyzed.
Pages: 1994-2001
Page count: 8
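
The abstract does not give implementation details of DCER, so the following is only a minimal Python sketch of the idea it describes: a replay buffer that stores the real SARSA transition plus counterfactual transitions for actions the agent did not actually take, sampled for a standard SARSA update. The class name DynamicCounterfactualReplay, the cf_model callback, and the tabular q_table are illustrative assumptions, not the authors' code.

# Illustrative sketch only (assumption): the abstract does not specify how DCER
# constructs counterfactual experiences, so the cf_model callback below is a
# hypothetical placeholder for whatever mechanism the paper actually uses.
import random
from collections import deque


class DynamicCounterfactualReplay:
    """Replay buffer holding real SARSA transitions plus counterfactual ones
    generated for actions the agent did not actually take."""

    def __init__(self, capacity=10000, n_actions=2):
        self.buffer = deque(maxlen=capacity)
        self.n_actions = n_actions

    def push(self, s, a, r, s_next, a_next, done, cf_model=None):
        # Real (on-policy) transition observed from the environment.
        self.buffer.append((s, a, r, s_next, a_next, done))
        # Hypothetical counterfactual transitions for the untaken actions;
        # cf_model(s, alt_a) must return (reward, next_state, next_action, done).
        if cf_model is not None:
            for alt_a in range(self.n_actions):
                if alt_a != a:
                    self.buffer.append((s, alt_a) + cf_model(s, alt_a))

    def sample(self, batch_size):
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))


def sarsa_update(q_table, batch, alpha=0.1, gamma=0.99):
    # Standard SARSA update applied to a sampled mini-batch:
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * Q(s',a') - Q(s,a))
    for s, a, r, s_next, a_next, done in batch:
        target = r if done else r + gamma * q_table[s_next][a_next]
        q_table[s][a] += alpha * (target - q_table[s][a])

The paper's CartPole experiments imply a function approximator over continuous states rather than a table; the tabular q_table here is used only to keep the SARSA target explicit.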