A Deep Reinforcement Learning Approach Using Asymmetric Self-Play for Robust Multirobot Flocking

Cited by: 0
Authors
Jia, Yunjie [1]
Song, Yong [1]
Cheng, Jiyu [2]
Jin, Jiong [3]
Zhang, Wei [2]
Yang, Simon X. [4]
Kwong, Sam [5]
Affiliations
[1] Shandong Univ, Sch Mech Elect & Informat Engn, Shandong Key Lab Intelligent Elect Packaging Testi, Weihai 264209, Peoples R China
[2] Shandong Univ, Sch Control Sci & Engn, Jinan 250061, Peoples R China
[3] Swinburne Univ Technol, Sch Sci Comp & Engn Technol, Hawthorn, VIC 3122, Australia
[4] Univ Guelph, Adv Robot & Intelligent Syst Lab, Guelph, ON N1G 2W1, Canada
[5] Lingnan Univ, Sch Data Sci, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Robots; Adaptation models; Training; Collision avoidance; Navigation; Multi-robot systems; Uncertainty; Robot sensing systems; Robustness; Vehicle dynamics; Adversarial training; flocking; multiagent deep reinforcement learning (MADRL); autonomous vehicles; NONLINEAR MULTIAGENT SYSTEMS; OUTPUT REGULATION; ENHANCEMENT; UAVS;
DOI
10.1109/TII.2024.3523576
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Flocking control, as an essential approach for survivable navigation of multirobot systems, has been widely applied in fields such as logistics, service delivery, and search and rescue. However, realistic environments are typically complex, dynamic, and even aggressive, posing considerable threats to the safety of flocking robots. In this article, based on deep reinforcement learning, an Asymmetric Self-play-empowered Flocking Control framework is proposed to address this concern. Specifically, the flocking robots are trained concurrently with learnable adversarial interferers to stimulate the intelligence of the flocking strategy. A two-stage self-play training paradigm is developed to improve the robustness and generalization of the model. Furthermore, an auxiliary training module regarding the learning of transition dynamics is designed, dramatically enhancing the adaptability to environmental uncertainties. Feature-level and agent-level attention are implemented for action and value generation, respectively. Both extensive comparative experiments and real-world deployment demonstrate the superiority and practicality of the proposed framework.
Pages: 10
Related papers
50 in total
  • [21] A Sharp Analysis of Model-based Reinforcement Learning with Self-Play
    Liu, Qinghua
    Yu, Tiancheng
    Bai, Yu
    Jin, Chi
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [22] Zwei: A Self-Play Reinforcement Learning Framework for Video Transmission Services
    Huang, Tianchi
    Zhang, Rui-Xiao
    Sun, Lifeng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 1350 - 1365
  • [23] A deep reinforcement learning method for structural dominant failure modes searching based on self-play strategy
    Guan, Xiaoshu
    Sun, Huabin
    Hou, Rongrong
    Xu, Yang
    Bao, Yuequan
    Li, Hui
    RELIABILITY ENGINEERING & SYSTEM SAFETY, 2023, 233
  • [24] Air combat intelligent decision-making method based on self-play and deep reinforcement learning
    Shan, Shengzhe
    Zhang, Weiwei
    Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica, 2024, 45 (04):
  • [25] Reinforcement Learning in the Game of Othello: Learning Against a Fixed Opponent and Learning from Self-Play
    van der Ree, Michiel
    Wiering, Marco
    PROCEEDINGS OF THE 2013 IEEE SYMPOSIUM ON ADAPTIVE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING (ADPRL), 2013, : 108 - 115
  • [26] Self-Organised Swarm Flocking with Deep Reinforcement Learning
    Bezcioglu, Mehmet B.
    Lennox, Barry
    Arvin, Farshad
    2021 7TH INTERNATIONAL CONFERENCE ON AUTOMATION, ROBOTICS AND APPLICATIONS (ICARA 2021), 2021, : 226 - 230
  • [27] Flocking Control of UAV Swarms with Deep Reinforcement Learning Approach
    Yan, Peng
    Bai, Chengchao
    Zheng, Hongxing
    Guo, Jifeng
    PROCEEDINGS OF 2020 3RD INTERNATIONAL CONFERENCE ON UNMANNED SYSTEMS (ICUS), 2020, : 592 - 599
  • [28] Abalearn: A risk-sensitive approach to self-play learning in abalone
    Campos, P
    Langlois, T
    MACHINE LEARNING: ECML 2003, 2003, 2837 : 35 - 46
  • [29] Advancing Air Combat Tactics with Improved Neural Fictitious Self-play Reinforcement Learning
    He, Shaoqin
    Gao, Yang
    Zhang, Baofeng
    Chang, Hui
    Zhang, Xinchen
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, ICIC 2023, PT V, 2023, 14090 : 653 - 666
  • [30] Learning Algorithms with Self-Play: A New Approach to the Distributed Directory Problem
    Khanchandani, Pankaj
    Richter, Oliver
    Rusch, Lukas
    Wattenhofer, Roger
    2021 IEEE 33RD INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2021), 2021, : 501 - 505