A New Reinforcement Learning Algorithm Based on Counterfactual Experience Replay

Cited: 0
Authors
Li Menglin [1]
Chen Jing [1]
Chen Shaofei [1]
Gao Wei [1]
Affiliations
[1] National University of Defense Technology, Changsha 410005, People's Republic of China
Keywords
Reinforcement Learning; Experience Replay Mechanism; Sampling Mechanism
DOI
Not available
Chinese Library Classification (CLC)
TP [Automation technology; Computer technology]
Discipline classification code
0812
Abstract
A new algorithm based on SARSA is proposed to avoid the overestimation problem of traditional reinforcement learning. Unlike traditional remedies for this problem, the new algorithm alleviates overestimation without significantly increasing algorithmic complexity. At the same time, to address shortcomings of traditional SARSA, such as weak active exploration and unsatisfactory convergence, this paper modifies the structure of the Experience Memory Replay (EMR) buffer. The proposed algorithm changes the traditional experience replay structure by adding counterfactual experiences, and is called DCER (Dynamic Counterfactual Experience Replay); it combines on-policy and off-policy learning. Exploration is improved by adding to EMR experiences whose actions differ from the action actually taken when sampling. The algorithm was evaluated in the Gym CartPole environment and compared with the traditional algorithm in the same setting, showing that the improvement raises the performance of SARSA. Finally, the feasibility of the algorithm in a multi-agent reinforcement learning environment is analyzed.
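The abstract only outlines the idea of mixing real (on-policy) and counterfactual (off-policy) transitions in the replay memory, so the following is a minimal Python sketch of one plausible realization. The names (CounterfactualReplayBuffer, sarsa_update) and the rule for building and bootstrapping counterfactual transitions are illustrative assumptions, not the paper's exact DCER construction.

# Minimal sketch: a SARSA-style experience memory augmented with
# counterfactual transitions for actions the agent did NOT take.
# Assumed structure, for illustration only.
import random
from collections import deque

import numpy as np


class CounterfactualReplayBuffer:
    """Experience memory holding both real and counterfactual transitions."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def push_real(self, s, a, r, s_next, a_next, done):
        # Real (on-policy) transition; the next action is kept for a SARSA target.
        self.buffer.append((s, a, r, s_next, a_next, done, False))

    def push_counterfactual(self, s, a_cf, r_cf, s_next_cf, done_cf):
        # Counterfactual (off-policy) transition for an action that was not taken;
        # no next action is stored, so a greedy target is used at update time.
        self.buffer.append((s, a_cf, r_cf, s_next_cf, None, done_cf, True))

    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))


def sarsa_update(Q, batch, alpha=0.1, gamma=0.99):
    """Tabular update mixing on-policy (real) and off-policy (counterfactual) targets.

    Q maps a hashable state (e.g. a discretised CartPole observation) to a
    numpy array of action values.
    """
    for s, a, r, s_next, a_next, done, is_cf in batch:
        if done:
            target = r
        elif is_cf:
            target = r + gamma * np.max(Q[s_next])     # counterfactual: bootstrap greedily
        else:
            target = r + gamma * Q[s_next][a_next]     # real: bootstrap with the action taken
        Q[s][a] += alpha * (target - Q[s][a])

In use, each real step (s, a, r, s', a') would be stored via push_real, while some one-step model of the environment (exact for CartPole's known dynamics, or learned) would supply the reward and next state for alternative actions stored via push_counterfactual; how DCER actually generates, samples, and weights these counterfactual experiences is specified in the paper itself, not in this sketch.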
Pages: 1994-2001
Number of pages: 8
Related Papers
50 records in total
  • [1] Tractable Reinforcement Learning for Signal Temporal Logic Tasks With Counterfactual Experience Replay
    Wang, Siqi
    Yin, Xunyuan
    Li, Shaoyuan
    Yin, Xiang
    IEEE CONTROL SYSTEMS LETTERS, 2024, 8 : 616 - 621
  • [2] Deep Reinforcement Learning with Experience Replay Based on SARSA
    Zhao, Dongbin
    Wang, Haitao
    Shao, Kun
    Zhu, Yuanheng
    PROCEEDINGS OF 2016 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2016,
  • [3] Autonomous reinforcement learning with experience replay
    Wawrzynski, Pawel
    Tanwani, Ajay Kumar
    NEURAL NETWORKS, 2013, 41 : 156 - 167
  • [4] Associative Memory Based Experience Replay for Deep Reinforcement Learning
    Li, Mengyuan
    Kazemi, Arman
    Laguna, Ann Franchesca
    Hu, X. Sharon
    2022 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN, ICCAD, 2022,
  • [5] An Experience Replay Method Based on Tree Structure for Reinforcement Learning
    Jiang, Wei-Cheng
    Hwang, Kao-Shing
    Lin, Jin-Ling
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2021, 9 (02) : 972 - 982
  • [6] Trial and Error Experience Replay Based Deep Reinforcement Learning
    Zhang, Cheng
    Ma, Liang
    4TH IEEE INTERNATIONAL CONFERENCE ON SMART CLOUD (SMARTCLOUD 2019) / 3RD INTERNATIONAL SYMPOSIUM ON REINFORCEMENT LEARNING (ISRL 2019), 2019, : 221 - 226
  • [7] SELECTIVE EXPERIENCE REPLAY IN REINFORCEMENT LEARNING FOR REIDENTIFICATION
    Thakoor, Ninad
    Bhanu, Bir
    2016 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2016, : 4250 - 4254
  • [8] A sample efficient model-based deep reinforcement learning algorithm with experience replay for robot manipulation
    Zhang, Cheng
    Ma, Liang
    Schmitz, Alexander
    INTERNATIONAL JOURNAL OF INTELLIGENT ROBOTICS AND APPLICATIONS, 2020, 4 (02) : 217 - 228
  • [9] A sample efficient model-based deep reinforcement learning algorithm with experience replay for robot manipulation
    Cheng Zhang
    Liang Ma
    Alexander Schmitz
    International Journal of Intelligent Robotics and Applications, 2020, 4 : 217 - 228
  • [10] Deep Reinforcement Learning for Autonomous Driving based on Safety Experience Replay
    Huang X.
    Cheng Y.
    Yu Q.
    Wang X.
    IEEE Transactions on Cognitive and Developmental Systems, 2024, 16 (06) : 1 - 15