A New Reinforcement Learning Algorithm Based on Counterfactual Experience Replay

Cited by: 0
Authors
Li Menglin [1]
Chen Jing [1]
Chen Shaofei [1]
Gao Wei [1]
Affiliations
[1] Natl Univ Def Technol, Changsha 410005, Peoples R China
Keywords
Reinforcement Learning; Experience Replay Mechanism; Sampling Mechanism;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
A new algorithm based on SARSA is proposed to mitigate the overestimation problem in traditional reinforcement learning. Unlike existing methods for this problem, the new algorithm alleviates overestimation without significantly increasing algorithmic complexity. In addition, to address weaknesses of traditional SARSA, such as limited active exploration and unsatisfactory convergence, this paper modifies the structure of Experience Memory Replay (EMR). The proposed algorithm changes the traditional experience replay structure by adding counterfactual experience; the resulting method, Dynamic Counterfactual Experience Replay (DCER), combines on-policy and off-policy learning. Exploration is improved by adding to the EMR, at sampling time, experiences that differ from the action actually taken. The algorithm was applied to the Gym CartPole environment and compared with the traditional algorithm in the same environment; the results show that it improves on the performance of SARSA. Finally, the feasibility of the algorithm in a multi-agent reinforcement learning environment is analyzed.
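The abstract does not say how counterfactual experiences are generated or how DCER weights them, so the following is only a minimal sketch of the general idea, not the authors' implementation. It pairs a standard tabular SARSA update with a replay buffer that stores, alongside each real transition, transitions for the actions that were not taken. The names `CounterfactualReplay` and `simulate` are hypothetical; `simulate` stands in for some one-step model of the environment, which is an assumption of this sketch.

```python
import random
from collections import defaultdict, deque


def sarsa_update(q, s, a, r, s_next, a_next, done, alpha=0.1, gamma=0.99):
    """Tabular SARSA update.

    The target bootstraps from the action actually selected in the next
    state (a_next) rather than from a max over actions, which is the
    usual reason SARSA is less prone to overestimation than Q-learning.
    """
    target = r if done else r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (target - q[(s, a)])


class CounterfactualReplay:
    """Replay buffer that also stores untaken ("counterfactual") actions.

    Hypothetical structure: the source abstract gives no details.
    """

    def __init__(self, capacity=1000):
        self.buffer = deque(maxlen=capacity)

    def add(self, s, a, r, s_next, done, actions, simulate):
        # Store the transition that really happened.
        self.buffer.append((s, a, r, s_next, done))
        # For every action that was NOT taken, ask the assumed one-step
        # model what would have happened and store that as well.
        for other in actions:
            if other != a:
                r_cf, s_cf, done_cf = simulate(s, other)
                self.buffer.append((s, other, r_cf, s_cf, done_cf))

    def sample(self, k):
        # Uniform sampling without replacement over real and
        # counterfactual transitions alike.
        return random.sample(list(self.buffer), min(k, len(self.buffer)))


# Usage with a toy two-action problem and a made-up deterministic model.
q = defaultdict(float)
replay = CounterfactualReplay()
toy_model = lambda s, a: (0.0, s + 1, False)  # assumption, not from the paper
replay.add(0, 0, 1.0, 1, False, actions=[0, 1], simulate=toy_model)
sarsa_update(q, s=0, a=0, r=1.0, s_next=1, a_next=1, done=False)
```

A real implementation would need to decide how counterfactual outcomes are obtained (for example, from a learned model or a resettable simulator) and how the on-policy and off-policy parts are combined; the abstract leaves both open.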
Pages: 1994-2001
Page count: 8
Related papers
50 in total
  • [21] Prioritized experience replay based deep distributional reinforcement learning for battery operation in microgrids
    Panda, Deepak Kumar
    Turner, Oliver
    Das, Saptarshi
    Abusara, Mohammad
    JOURNAL OF CLEANER PRODUCTION, 2024, 434
  • [22] Knowledge Transfer for Deep Reinforcement Learning with Hierarchical Experience Replay
    Yin, Haiyan
    Pan, Sinno Jialin
    THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 1640 - 1646
  • [23] Proxy Experience Replay: Federated Distillation for Distributed Reinforcement Learning
    Cha, Han
    Park, Jihong
    Kim, Hyesung
    Bennis, Mehdi
    Kim, Seong-Lyun
    IEEE INTELLIGENT SYSTEMS, 2020, 35 (04) : 94 - 101
  • [24] Counterfactual-Based Action Evaluation Algorithm in Multi-Agent Reinforcement Learning
    Yuan, Yuyu
    Zhao, Pengqian
    Guo, Ting
    Jiang, Hongpu
    APPLIED SCIENCES-BASEL, 2022, 12 (07):
  • [25] Prioritized experience replay based reinforcement learning for adaptive tracking control of autonomous underwater vehicle
    Li, Ting
    Yang, Dongsheng
    Xie, Xiangpeng
    APPLIED MATHEMATICS AND COMPUTATION, 2023, 443
  • [26] Intrusion Detection Based on Adaptive Sample Distribution Dual-Experience Replay Reinforcement Learning
    Tan, Haonan
    Wang, Le
    Zhu, Dong
    Deng, Jianyu
    MATHEMATICS, 2024, 12 (07)
  • [27] Composite Experience Replay-Based Deep Reinforcement Learning With Application in Wind Farm Control
    Dong, Hongyang
    Zhao, Xiaowei
    IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, 2022, 30 (03) : 1281 - 1295
  • [28] Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
    Foerster, Jakob
    Nardelli, Nantas
    Farquhar, Gregory
    Afouras, Triantafyllos
    Torr, Philip H. S.
    Kohli, Pushmeet
    Whiteson, Shimon
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [29] Balanced prioritized experience replay in off-policy reinforcement learning
    Lou Z.
    Wang Y.
    Shan S.
    Zhang K.
    Wei H.
    Neural Computing and Applications, 2024, 36 (25) : 15721 - 15737
  • [30] Enhanced Off-Policy Reinforcement Learning With Focused Experience Replay
    Kong, Seung-Hyun
    Nahrendra, I. Made Aswin
    Paek, Dong-Hee
    IEEE ACCESS, 2021, 9 (09): : 93152 - 93164