Probabilistic Reward-Based Reinforcement Learning for Multi-Agent Pursuit and Evasion

Cited by: 1
Authors
Zhang, Bo-Kun [1 ]
Hu, Bin [1 ]
Chen, Long [1 ]
Zhang, Ding-Xue [2 ]
Cheng, Xin-Ming [3 ]
Guan, Zhi-Hong [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[2] Yangtze Univ, Sch Petr Engn, Jingzhou 434023, Peoples R China
[3] Cent South Univ, Sch Automat, Changsha 430083, Peoples R China
Source
PROCEEDINGS OF THE 33RD CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2021) | 2021
Keywords
Reinforcement learning; Multi-agent; Pursuit-evasion; Probabilistic reward; SYSTEMS;
DOI
10.1109/CCDC52312.2021.9601771
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This article studies reinforcement learning for multi-agent pursuit-evasion games. The main problem with current multi-agent reinforcement learning is the low learning efficiency of the agents. An important contributing factor is that the Q function lags behind changes in the environment. To address this, a probabilistic distribution of reward values replaces the Q function in the multi-agent deep deterministic policy gradient (MADDPG) framework. The distributional Bellman equation is proved to be convergent and can be incorporated into the reinforcement learning algorithm. The algorithm updates the probabilistic reward distribution so that the reward value adapts better to complex environments. At the same time, eliminating the reward delay improves the efficiency of the learned strategy and yields better pursuit-evasion results. Simulations and experiments show that the multi-agent algorithm with distributional rewards achieves better results in the given environment settings.
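The abstract does not say how the reward distribution is represented or updated. As a minimal sketch only, the following assumes a categorical distribution over a fixed, evenly spaced set of return values (a common choice in distributional reinforcement learning, not necessarily the authors' method). It shows one distributional Bellman backup: each atom is shifted by the reward, scaled by the discount factor, and its probability mass is projected back onto the fixed support. The function name `project_categorical` and all parameter names are hypothetical.

```python
def project_categorical(probs, atoms, reward, gamma):
    """Project the distribution of reward + gamma * Z onto fixed atoms.

    probs  : probabilities of the next-state return distribution Z
    atoms  : evenly spaced support values, in ascending order
    reward : immediate reward (assumed scalar for this sketch)
    gamma  : discount factor
    """
    v_min, v_max = atoms[0], atoms[-1]
    delta = (v_max - v_min) / (len(atoms) - 1)
    projected = [0.0] * len(atoms)
    for p, z in zip(probs, atoms):
        # Apply the Bellman shift and clip the result to the support.
        tz = min(max(reward + gamma * z, v_min), v_max)
        pos = (tz - v_min) / delta
        lower = min(int(pos), len(atoms) - 1)
        upper = min(lower + 1, len(atoms) - 1)
        frac = pos - lower
        # Split the mass between the two neighbouring atoms.
        projected[lower] += p * (1.0 - frac)
        projected[upper] += p * frac
    return projected


# Hypothetical usage: three atoms at 0, 1 and 2.
target = project_categorical([0.5, 0.5, 0.0], [0.0, 1.0, 2.0], 1.0, 0.5)
print(target)
```

In a MADDPG-style setup, a target distribution of this kind would stand in for the scalar Q target when training each agent's critic. That wiring is likewise an assumption here rather than something the abstract specifies.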
Pages: 3352-3357
Number of pages: 6