Exploring a Reinforcement Learning Agent with Improved Prioritized Experience Replay for a Confrontation Game

Cited by: 1
Authors
Zhao, Tian [1]
Affiliations
[1] City Univ Hong Kong, Dept Math, Shenzhen, Peoples R China
Keywords
reinforcement learning; experience replay; Deep Q-network (DQN); Prioritized Experience Replay (PER); Hindsight Experience Replay (HER); Dynamic Hindsight Experience Replay (DHER); experience sharing
DOI
10.1109/BDICN55575.2022.00075
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Deep Q-network (DQN) has been used successfully in many reinforcement learning settings and in challenging tasks with real-world complexity. Its current limitation is the unacceptably long training time needed to reach satisfactory, human-like results. To address this obstacle, I propose a new reinforcement learning strategy. This paper focuses on a two-player confrontation game environment with sparse rewards, no direct hindsight reward function, and no fixed goals. Under the proposed strategies, the algorithm brings such games into reinforcement learning with reward functions and replay, giving the agent the ability to judge positions in the middle of a game as a reference. To demonstrate the effectiveness of the proposed strategy, a new game is designed. Fence game is a two-player confrontation game in which one player tries their best to fence the other in the flow. The custom environment of this game gives its only reward at the end: win, lose, or draw. In conclusion, performance and results show that 1) Prioritized Experience Replay with Dynamic Hindsight reward function (DH-PER) and 2) Prioritized Experience Replay with Dynamic Hindsight reward function and Sharing (DHS-PER) both let the RL agents converge more quickly.
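The abstract combines two ideas: sampling stored transitions in proportion to a priority, and turning an end-of-game result into rewards for earlier moves. The sketch below illustrates both in generic form; it is not the paper's DH-PER or DHS-PER. The class `PrioritizedReplay`, the function `relabel_with_outcome`, the parameters `alpha`, `eps`, and `gamma`, and the discounting rule are all assumptions made for illustration.

```python
import random


class PrioritizedReplay:
    """Proportional prioritized replay (list-based, O(n) sampling)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha  # assumed exponent: how strongly priorities skew sampling
        self.eps = eps      # keeps every priority strictly positive
        self.items = []
        self.priorities = []
        self.next_slot = 0

    def add(self, transition, td_error=1.0):
        priority = (abs(td_error) + self.eps) ** self.alpha
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(priority)
        else:
            # Overwrite the oldest entry once the buffer is full.
            self.items[self.next_slot] = transition
            self.priorities[self.next_slot] = priority
        self.next_slot = (self.next_slot + 1) % self.capacity

    def sample(self, batch_size, rng=random):
        # Draw indices with probability proportional to stored priority.
        idx = rng.choices(range(len(self.items)), weights=self.priorities, k=batch_size)
        return [self.items[i] for i in idx]


def relabel_with_outcome(episode, outcome, gamma=0.9):
    """Spread a terminal outcome (+1 win, 0 draw, -1 loss) back over an episode.

    Hypothetical stand-in for a hindsight reward: step t of an episode with
    last index T receives outcome * gamma ** (T - t), so later moves get
    more credit than earlier ones.
    """
    last = len(episode) - 1
    return [(s, a, outcome * gamma ** (last - t)) for t, (s, a) in enumerate(episode)]
```

In a win/lose/draw game such as the one described, each finished episode could be relabeled this way and its transitions added to the buffer, so that intermediate positions carry a learning signal even though the environment only rewards the final result.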
Pages: 373-381
Number of pages: 9