Independent Deep Deterministic Policy Gradient Reinforcement Learning in Cooperative Multiagent Pursuit Games

Authors
Zhou, Shiyang [1, 2]
Ren, Weiya [1, 2]
Ren, Xiaoguang [1, 2]
Wang, Yanzhen [1, 2]
Yi, Xiaodong [1, 2]
Affiliations
[1] Def Innovat Inst, Artificial Intelligence Res Ctr, Beijing 100072, Peoples R China
[2] Tianjin Artificial Intelligence Innovat Ctr, Tianjin 300457, Peoples R China
Keywords
Reinforcement learning; Actor-critic; Potential field; Planning and learning; Predator-prey
DOI
10.1007/978-3-030-86380-7_51
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we study a fully decentralized multi-agent pursuit problem in an environment without communication. Fully decentralized learning (decentralized training and decentralized execution) is more robust and scalable than centralized training with decentralized execution (CTDE), the currently popular paradigm in multi-agent reinforcement learning. Both centralized training and communication mechanisms require a large amount of information exchange between agents, a strong assumption that is difficult to satisfy in practice. However, traditional fully decentralized multi-agent reinforcement learning methods (e.g., IQL) struggle to converge stably because the other agents' strategies change dynamically. We therefore extend the actor-critic framework to an actor-critic-N framework and, on this basis, propose the Potential-Field-Guided Deep Deterministic Policy Gradient (PGDDPG) method. Each agent uses a unified artificial potential field to guide its strategy updates, which reduces the uncertainty of multi-agent decision making in a complex, dynamically changing environment. As a result, PGDDPG converges quickly and stably. Finally, pursuit experiments in MPE and CARLA show that our method achieves a higher success rate and more stable performance than DDPG and MADDPG.
Pages: 625-637
Page count: 13