Probabilistic Reward-Based Reinforcement Learning for Multi-Agent Pursuit and Evasion

Cited by: 1
Authors
Zhang, Bo-Kun [1]
Hu, Bin [1]
Chen, Long [1]
Zhang, Ding-Xue [2]
Cheng, Xin-Ming [3]
Guan, Zhi-Hong [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[2] Yangtze Univ, Sch Petr Engn, Jingzhou 434023, Peoples R China
[3] Cent South Univ, Sch Automat, Changsha 430083, Peoples R China
Keywords
Reinforcement learning; Multi-agent; Pursuit-evasion; Probabilistic reward; SYSTEMS;
DOI
10.1109/CCDC52312.2021.9601771
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This article studies reinforcement learning for multi-agent pursuit-evasion games. The main problem with current multi-agent reinforcement learning is the low learning efficiency of the agents. An important contributing factor is the delay of the Q function relative to changes in the environment. To address this, a probabilistic distribution of the reward value replaces the Q function in the multi-agent deep deterministic policy gradient (MADDPG) framework. The distributional Bellman equation is proved to be convergent and can therefore be incorporated into the reinforcement learning algorithm. The algorithm updates the probabilistic reward distribution so that the reward value adapts better to complex environments. At the same time, eliminating the reward delay improves the efficiency of the strategy and yields better pursuit-evasion results. Simulations and experiments show that the multi-agent algorithm with distributional rewards achieves better results in the tested environment.
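The abstract does not say how the reward distribution is parameterized. As a minimal sketch of one distributional Bellman update, the example below assumes a categorical distribution over a fixed grid of return values and uses the standard categorical projection; this is a swapped-in technique, not necessarily the authors' method. The function name `project_distribution` and all parameter names are hypothetical.

```python
import numpy as np

def project_distribution(next_probs, reward, gamma, atoms):
    """Apply z -> reward + gamma * z to a categorical return distribution
    and project the result back onto the fixed, evenly spaced atoms.

    Assumption: the distribution is categorical on `atoms`; the source
    abstract does not specify a parameterization.
    """
    n = len(atoms)
    v_min, v_max = atoms[0], atoms[-1]
    delta = atoms[1] - atoms[0]
    # Shift and scale every atom, then clip to the supported range.
    tz = np.clip(reward + gamma * atoms, v_min, v_max)
    # Fractional index of each shifted atom on the original grid.
    b = np.clip((tz - v_min) / delta, 0, n - 1)
    lower = np.floor(b).astype(int)
    upper = np.ceil(b).astype(int)
    target = np.zeros(n)
    for j in range(n):
        if lower[j] == upper[j]:
            # Shifted atom lands exactly on a grid point.
            target[lower[j]] += next_probs[j]
        else:
            # Split the mass between the two neighbouring grid points.
            target[lower[j]] += next_probs[j] * (upper[j] - b[j])
            target[upper[j]] += next_probs[j] * (b[j] - lower[j])
    return target

# Usage: uniform next-state distribution on 11 atoms in [0, 10].
atoms = np.linspace(0.0, 10.0, 11)
next_probs = np.full(11, 1.0 / 11.0)
target = project_distribution(next_probs, reward=1.0, gamma=0.5, atoms=atoms)
print(target.sum(), (target * atoms).sum())
```

In a MADDPG-style critic, a target of this kind would stand in for the scalar target value, with the critic trained to match the projected distribution (for example via cross-entropy); that wiring is likewise an assumption here.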
Pages: 3352 - 3357
Number of pages: 6
Related papers
50 in total
  • [1] Multi-agent pursuit and evasion games based on improved reinforcement learning
    Xue, Ya-Li
    Ye, Jin-Ze
    Li, Han-Yan
    [J]. Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2023, 57 (08) : 1479 - 1486
  • [2] Pursuit-Evasion Games for Multi-agent Based on Reinforcement Learning with Obstacles
    Hu, Penglin
    Guo, Yaning
    Hu, Jinwen
    Pan, Quan
    [J]. PROCEEDINGS OF 2022 INTERNATIONAL CONFERENCE ON AUTONOMOUS UNMANNED SYSTEMS, ICAUS 2022, 2023, 1010 : 1015 - 1024
  • [3] Pursuit and evasion game between UVAs based on multi-agent reinforcement learning
    Xu, Guangyan
    Zhao, Yang
    Liu, Hao
    [J]. 2019 CHINESE AUTOMATION CONGRESS (CAC2019), 2019, : 1261 - 1266
  • [4] An Approach to Multi-Agent Pursuit Evasion Games Using Reinforcement Learning
    Bilgin, Ahmet Tunc
    Kadioglu-Urtis, Esra
    [J]. PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON ADVANCED ROBOTICS (ICAR), 2015, : 164 - 169
  • [5] Multi-agent Reward-Based Intruder Capture
    Grimaldi, Michele
    Herpson, Cedric
    [J]. INTELLIGENT DISTRIBUTED COMPUTING XVI, IDC 2023, 2024, 1138 : 251 - 266
  • [6] Reward-based epigenetic learning algorithm for a decentralised multi-agent system
    Mukhlish, Faqihza
    Page, John
    Bain, Michael
    [J]. INTERNATIONAL JOURNAL OF INTELLIGENT UNMANNED SYSTEMS, 2020, 8 (03) : 201 - 224
  • [7] Multi-Agent Reinforcement Learning with Reward Delays
    Zhang, Yuyang
    Zhang, Runyu
    Gu, Yuantao
    Li, Na
    [J]. LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211
  • [8] MuDE: Multi-agent decomposed reward-based exploration
    Yoo, Byunghyun
    Yi, Sungwon
    Kim, Hyunwoo
    Shin, Younghwan
    Han, Ran
    Seo, Seungwoo
    Song, Hwa Jeon
    Chung, Euisok
    Yang, Jeongmin
    [J]. NEURAL NETWORKS, 2024, 179
  • [9] Direct reward and indirect reward in multi-agent reinforcement learning
    Ohta, M
    [J]. ROBOCUP 2002: ROBOT SOCCER WORLD CUP VI, 2003, 2752 : 359 - 366
  • [10] Plan-based reward shaping for multi-agent reinforcement learning
    Devlin, Sam
    Kudenko, Daniel
    [J]. KNOWLEDGE ENGINEERING REVIEW, 2016, 31 (01) : 44 - 58