A Deep Reinforcement Learning Approach Using Asymmetric Self-Play for Robust Multirobot Flocking

Cited by: 0

Authors:

Jia, Yunjie ^{[1]}

Song, Yong ^{[1]}

Cheng, Jiyu ^{[2]}

Jin, Jiong ^{[3]}

Zhang, Wei ^{[2]}

Yang, Simon X. ^{[4]}

Kwong, Sam ^{[5]}

Affiliations:

[1] Shandong Univ, Sch Mech Elect & Informat Engn, Shandong Key Lab Intelligent Elect Packaging Testi, Weihai 264209, Peoples R China

[2] Shandong Univ, Sch Control Sci & Engn, Jinan 250061, Peoples R China

[3] Swinburne Univ Technol, Sch Sci Comp & Engn Technol, Hawthorn, VIC 3122, Australia

[4] Univ Guelph, Adv Robot & Intelligent Syst Lab, Guelph, ON N1G 2W1, Canada

[5] Lingnan Univ, Sch Data Sci, Hong Kong, Peoples R China

Source:

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS | 2025

Funding:

National Natural Science Foundation of China;

Keywords:

Robots; Adaptation models; Training; Collision avoidance; Navigation; Multi-robot systems; Uncertainty; Robot sensing systems; Robustness; Vehicle dynamics; Adversarial training; flocking; multiagent deep reinforcement learning (MADRL); autonomous vehicles; NONLINEAR MULTIAGENT SYSTEMS; OUTPUT REGULATION; ENHANCEMENT; UAVS;

DOI:

10.1109/TII.2024.3523576

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Flocking control, an essential approach for survivable navigation of multirobot systems, has been widely applied in fields such as logistics, service delivery, and search and rescue. However, realistic environments are typically complex, dynamic, and even aggressive, posing considerable threats to the safety of flocking robots. In this article, an Asymmetric Self-play-empowered Flocking Control framework based on deep reinforcement learning is proposed to address this concern. Specifically, the flocking robots are trained concurrently with learnable adversarial interferers to stimulate the intelligence of the flocking strategy. A two-stage self-play training paradigm is developed to improve the robustness and generalization of the model. Furthermore, an auxiliary training module that learns the transition dynamics is designed, enhancing adaptability to environmental uncertainties. Feature-level attention is implemented for action generation and agent-level attention for value generation. Both extensive comparative experiments and real-world deployment demonstrate the superiority and practicality of the proposed framework.


Pages: 10
