Probabilistic Reward-Based Reinforcement Learning for Multi-Agent Pursuit and Evasion

Cited by: 1
Authors
Zhang, Bo-Kun [1]
Hu, Bin [1]
Chen, Long [1]
Zhang, Ding-Xue [2]
Cheng, Xin-Ming [3]
Guan, Zhi-Hong [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[2] Yangtze Univ, Sch Petr Engn, Jingzhou 434023, Peoples R China
[3] Cent South Univ, Sch Automat, Changsha 430083, Peoples R China
Keywords
Reinforcement learning; Multi-agent; Pursuit-evasion; Probabilistic reward; SYSTEMS;
DOI
10.1109/CCDC52312.2021.9601771
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
This article studies reinforcement learning as a solution to multi-agent pursuit-evasion games. The main problem with current multi-agent reinforcement learning is the agents' low learning efficiency. An important contributing factor is that the Q function lags behind changes in the environment. To address this, a probabilistic reward distribution replaces the Q function in the multi-agent deep deterministic policy gradient (MADDPG) framework. The distributional Bellman equation is proved to converge and can therefore be incorporated into the reinforcement learning algorithm. The probabilistic reward distribution is updated within the algorithm, so that the reward value adapts better to a complex environment. At the same time, eliminating the reward delay improves the efficiency of the strategy and yields better pursuit-evasion results. Simulations and experiments show that the multi-agent algorithm with distributional rewards achieves better results in the given environment settings.
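The abstract does not give the form of the distributional update. As a rough illustration only, the sketch below shows the categorical projection commonly used in distributional reinforcement learning: a return distribution over fixed atoms is shifted by the distributional Bellman operator (reward plus discounted atoms) and projected back onto the same atoms. The function name `project_distribution`, its parameters, and the evenly spaced support are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def project_distribution(probs, reward, gamma, support):
    """Project the distribution of (reward + gamma * Z) onto a fixed support.

    Hypothetical helper: `probs` are the probabilities of the next-state
    return Z on the evenly spaced atoms in `support`.
    """
    probs = np.asarray(probs, dtype=float)
    support = np.asarray(support, dtype=float)
    n = len(support)
    v_min, v_max = support[0], support[-1]
    delta = (v_max - v_min) / (n - 1)

    # Apply the Bellman shift to every atom and clip to the support range.
    tz = np.clip(reward + gamma * support, v_min, v_max)
    # Fractional index of each shifted atom, guarded against rounding error.
    b = np.clip((tz - v_min) / delta, 0, n - 1)
    lower = np.floor(b).astype(int)
    upper = np.ceil(b).astype(int)

    target = np.zeros(n)
    for j in range(n):
        if lower[j] == upper[j]:
            # Shifted atom lands exactly on a support atom.
            target[lower[j]] += probs[j]
        else:
            # Split the mass between the two neighbouring atoms.
            target[lower[j]] += probs[j] * (upper[j] - b[j])
            target[upper[j]] += probs[j] * (b[j] - lower[j])
    return target

if __name__ == "__main__":
    atoms = np.linspace(0.0, 4.0, 5)
    uniform = np.full(5, 0.2)
    print(project_distribution(uniform, reward=1.0, gamma=0.5, support=atoms))
```

In a distributional critic, the projected vector would serve as the training target for the predicted return distribution, in place of a scalar Q target.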
Pages: 3352-3357
Page count: 6
Related papers
50 in total
  • [31] Intrinsic Reward with Peer Incentives for Cooperative Multi-Agent Reinforcement Learning
    Zhang, Tianle
    Liu, Zhen
    Wu, Shiguang
    Pu, Zhiqiang
    Yi, Jianqiang
    [J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [32] Reward design for driver repositioning using multi-agent reinforcement learning
    Shou, Zhenyu
    Di, Xuan
    [J]. TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2020, 119
  • [33] Reinforcement learning based on multi-agent in RoboCup
    Zhang, W
    Li, JG
    Ruan, XG
    [J]. ADVANCES IN INTELLIGENT COMPUTING, PT 1, PROCEEDINGS, 2005, 3644 : 967 - 975
  • [34] Multi-Agent Reinforcement Learning
    Stankovic, Milos
    [J]. 2016 13TH SYMPOSIUM ON NEURAL NETWORKS AND APPLICATIONS (NEUREL), 2016, : 43 - 43
  • [35] Distributed multi-agent deep reinforcement learning for cooperative multi-robot pursuit
    Yu, Chao
    Dong, Yinzhao
    Li, Yangning
    Chen, Yatong
    [J]. JOURNAL OF ENGINEERING-JOE, 2020, 2020 (13): : 499 - 504
  • [36] Mobile User Interface Adaptation Based on Usability Reward Model and Multi-Agent Reinforcement Learning
    Vidmanov, Dmitry
    Alfimtsev, Alexander
    [J]. MULTIMODAL TECHNOLOGIES AND INTERACTION, 2024, 8 (04)
  • [37] An Improved Approach towards Multi-Agent Pursuit-Evasion Game Decision-Making Using Deep Reinforcement Learning
    Wan, Kaifang
    Wu, Dingwei
    Zhai, Yiwei
    Li, Bo
    Gao, Xiaoguang
    Hu, Zijian
    [J]. ENTROPY, 2021, 23 (11)
  • [38] Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs
    Duc Thien Nguyen
    Yeoh, William
    Lau, Hoong Chuin
    Zilberstein, Shlomo
    Zhang, Chongjie
    [J]. PROCEEDINGS OF THE TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2014, : 1447 - 1455
  • [39] Multi-Agent Deep Reinforcement Learning With Progressive Negative Reward for Cryptocurrency Trading
    Kumlungmak, Kittiwin
    Vateekul, Peerapon
    [J]. IEEE ACCESS, 2023, 11 : 66440 - 66455
  • [40] Role differentiation process by division of reward function in multi-agent reinforcement learning
    Taniguchi, Tadahiro
    Tabuchi, Kazuma
    Sawaragi, Tetsuo
    [J]. 2008 PROCEEDINGS OF SICE ANNUAL CONFERENCE, VOLS 1-7, 2008, : 358 - +