Asymmetric Self-Play-Enabled Intelligent Heterogeneous Multirobot Catching System Using Deep Multiagent Reinforcement Learning
Cited by: 12
Authors:
Gao, Yuan [1]
Chen, Junfeng [1]
Chen, Xi [2]
Wang, Chongyang [3]
Hu, Junjie [1]
Deng, Fuqin [1]
Lam, Tin Lun [1,4]
Affiliations:
[1] Chinese Univ Hong Kong, Shenzhen Inst Artificial Intelligence & Robot Soc, Shenzhen 518100, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
[3] UCL, London WC1E6BT, England
[4] Chinese Univ Hong Kong, Sch Sci & Engn, Shenzhen 518100, Peoples R China
Funding:
National Key Research and Development Program of China;
Keywords:
Robots;
Task analysis;
Training;
Planning;
Multi-robot systems;
Mathematical models;
Heuristic algorithms;
Asymmetric self-play;
catching systems;
heterogeneous multirobot system (HMRS);
reinforcement learning (RL);
DECENTRALIZED CONTROL;
CONTROL POLICIES;
NEURAL-NETWORKS;
TRACKING;
ROBOTS;
GAME;
VEHICLES;
SEARCH;
GO;
DOI:
10.1109/TRO.2023.3257541
Chinese Library Classification (CLC) number:
TP24 [Robotics];
Discipline classification code:
080202 ;
1405 ;
Abstract:
Aiming to develop a more robust and intelligent heterogeneous system for adversarial catching in security and rescue tasks, in this article, we discuss the specialities of applying asymmetric self-play and curriculum learning techniques to deal with the increasing heterogeneity and number of different robots in modern heterogeneous multirobot systems (HMRS). Our method, based on actor-critic multiagent reinforcement learning, provides a framework that can enable cooperative behaviors among heterogeneous multirobot teams. This leads to the development of an HMRS for complex catching scenarios that involve several robot teams and real-world constraints. We conduct simulated experiments to evaluate different mechanisms' influence on our method's performance, and real-world experiments to assess our system's performance in complex real-world catching problems. In addition, a bridging study is conducted to compare our method with a state-of-the-art method called S2M2 in heterogeneous catching problems, and our method performs better in adversarial settings. As a result, we show that the proposed framework, through fusing asymmetric self-play and curriculum learning during training, is able to successfully complete the HMRS catching task under realistic constraints in both simulation and the real world, thus providing a direction for future large-scale intelligent security & rescue HMRS.
Pages: 2603-2622
Number of pages: 20