A New Reinforcement Learning Algorithm Based on Counterfactual Experience Replay

Cited by: 0
Authors:
Li Menglin [1]
Chen Jing [1]
Chen Shaofei [1]
Gao Wei [1]
Affiliation:
[1] Natl Univ Def Technol, Changsha 410005, Peoples R China
Keywords:
Reinforcement Learning; Experience Replay Mechanism; Sampling Mechanism
DOI:
Not available
Chinese Library Classification (CLC):
TP [Automation Technology; Computer Technology]
Discipline Code:
0812
Abstract:
A new algorithm based on SARSA is proposed to avoid the overestimation problem in traditional reinforcement learning. Unlike traditional approaches to this problem, the new algorithm alleviates overestimation without significantly increasing algorithmic complexity. At the same time, to address weaknesses of traditional SARSA, such as limited active exploration and unsatisfactory convergence, this paper modifies the structure of Experience Memory Replay (EMR). The proposed algorithm changes the traditional experience replay structure and adds counterfactual experiences, yielding DCER (Dynamic Counterfactual Experience Replay), which combines on-policy and off-policy learning. Exploration is strengthened by adding to EMR experiences whose actions differ from the action actually taken. The algorithm was evaluated in the Gym CartPole environment and compared with the traditional algorithm in the same setting, showing that the modification improves the performance of SARSA. Finally, the feasibility of the algorithm in a multi-agent reinforcement learning environment is analyzed.
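The abstract describes a replay buffer that stores, alongside each real SARSA transition, a counterfactual transition whose action differs from the one actually taken. The paper's exact construction is not given here, so the sketch below is a hypothetical illustration: the `counterfactual_model` callback (an assumption, not from the paper) supplies the imagined reward and next state for the unchosen action.

```python
import random
from collections import deque


class DCERBuffer:
    """Hypothetical sketch of a Dynamic Counterfactual Experience Replay buffer.

    For every real (s, a, r, s', a') transition, one extra transition with a
    different action is also stored. How the counterfactual reward and next
    state are obtained is an assumption here (a caller-supplied model); the
    paper may construct them differently.
    """

    def __init__(self, capacity, n_actions):
        self.buffer = deque(maxlen=capacity)
        self.n_actions = n_actions

    def add(self, s, a, r, s_next, a_next, counterfactual_model=None):
        # Store the actual on-policy (SARSA-style) transition.
        self.buffer.append((s, a, r, s_next, a_next))
        # Additionally store a counterfactual transition for an action
        # different from the one actually taken, to aid exploration.
        if counterfactual_model is not None:
            a_cf = random.choice(
                [x for x in range(self.n_actions) if x != a]
            )
            r_cf, s_next_cf = counterfactual_model(s, a_cf)
            self.buffer.append((s, a_cf, r_cf, s_next_cf, a_next))

    def sample(self, batch_size):
        # Uniform sampling over real and counterfactual experiences.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))


# Toy usage with a dummy counterfactual model that predicts zero reward
# and no state change (purely illustrative).
buf = DCERBuffer(capacity=100, n_actions=2)
dummy_model = lambda s, a: (0.0, s)
buf.add(s=0, a=1, r=1.0, s_next=1, a_next=0, counterfactual_model=dummy_model)
```

Because the counterfactual transitions were not generated by the current policy, learning from them is off-policy, while the real transitions remain on-policy, matching the abstract's claim that DCER combines both regimes.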
Pages: 1994-2001 (8 pages)