Asymmetric Self-Play-Enabled Intelligent Heterogeneous Multirobot Catching System Using Deep Multiagent Reinforcement Learning

Cited by: 12
Authors
Gao, Yuan [1]
Chen, Junfeng [1]
Chen, Xi [2]
Wang, Chongyang [3]
Hu, Junjie [1]
Deng, Fuqin [1]
Lam, Tin Lun [1, 4]
Affiliations
[1] Chinese Univ Hong Kong, Shenzhen Inst Artificial Intelligence & Robot Soc, Shenzhen 518100, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
[3] UCL, London WC1E6BT, England
[4] Chinese Univ Hong Kong, Sch Sci & Engn, Shenzhen 518100, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Robots; Task analysis; Training; Planning; Multi-robot systems; Mathematical models; Heuristic algorithms; Asymmetric self-play; catching systems; heterogeneous multirobot system (HMRS); reinforcement learning (RL); DECENTRALIZED CONTROL; CONTROL POLICIES; NEURAL-NETWORKS; TRACKING; ROBOTS; GAME; VEHICLES; SEARCH; GO;
DOI
10.1109/TRO.2023.3257541
Chinese Library Classification number
TP24 [Robotics];
Discipline classification code
080202 ; 1405 ;
Abstract
Aiming to develop a more robust and intelligent heterogeneous system for adversarial catching in security and rescue tasks, in this article, we discuss the specialities of applying asymmetric self-play and curriculum learning techniques to deal with the increasing heterogeneity and number of different robots in modern heterogeneous multirobot systems (HMRS). Our method, based on actor-critic multiagent reinforcement learning, provides a framework that can enable cooperative behaviors among heterogeneous multirobot teams. This leads to the development of an HMRS for complex catching scenarios that involve several robot teams and real-world constraints. We conduct simulated experiments to evaluate different mechanisms' influence on our method's performance, and real-world experiments to assess our system's performance in complex real-world catching problems. In addition, a bridging study is conducted to compare our method with a state-of-the-art method called S2M2 in heterogeneous catching problems, and our method performs better in adversarial settings. As a result, we show that the proposed framework, through fusing asymmetric self-play and curriculum learning during training, is able to successfully complete the HMRS catching task under realistic constraints in both simulation and the real world, thus providing a direction for future large-scale intelligent security & rescue HMRS.
Pages: 2603 - 2622
Page count: 20
Related papers
21 in total
  • [1] A Deep Reinforcement Learning Approach Using Asymmetric Self-Play for Robust Multirobot Flocking
    Jia, Yunjie
    Song, Yong
    Cheng, Jiyu
    Jin, Jiong
    Zhang, Wei
    Yang, Simon X.
    Kwong, Sam
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2025,
  • [2] An approach to the pursuit problem on a heterogeneous multiagent system using reinforcement learning
    Ishiwaka, Y
    Sato, T
    Kakazu, Y
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2003, 43 (04) : 245 - 256
  • [3] An IoT enabled smart healthcare system using deep reinforcement learning
    Jagannath, Duraiswamy Jothinath
    Dolly, Raveena Judie
    Let, Gunamony Shine
    Peter, James Dinesh
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2022, 34 (28):
  • [4] Mastering Fighting Game Using Deep Reinforcement Learning With Self-play
    Kim, Dae-Wook
    Park, Sungyun
    Yang, Seong-il
    2020 IEEE CONFERENCE ON GAMES (IEEE COG 2020), 2020, : 576 - 583
  • [5] Intelligent Residential Energy Management System Using Deep Reinforcement Learning
    Mathew, Alwyn
    Roy, Abhijit
    Mathew, Jimson
    IEEE SYSTEMS JOURNAL, 2020, 14 (04) : 5362 - 5372
  • [6] Age of Information Optimization in UAV-enabled Intelligent Transportation System via Deep Reinforcement Learning
    Li, Xinmin
    Li, Jiahui
    Yin, Baolin
    Yan, Jiaxin
    Fang, Yuan
    2022 IEEE 96TH VEHICULAR TECHNOLOGY CONFERENCE (VTC2022-FALL), 2022,
  • [7] Air combat intelligent decision-making method based on self-play and deep reinforcement learning
    Shan, Shengzhe
    Zhang, Weiwei
    Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica, 2024, 45 (04):
  • [8] Fuzzy Inference Enabled Deep Reinforcement Learning-Based Traffic Light Control for Intelligent Transportation System
    Kumar, Neetesh
    Rahman, Syed Shameerur
    Dhakad, Navin
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2021, 22 (08) : 4919 - 4928
  • [9] Optimal Robust Output Containment of Unknown Heterogeneous Multiagent System Using Off-Policy Reinforcement Learning
    Zuo, Shan
    Song, Yongduan
    Lewis, Frank L.
    Davoudi, Ali
    IEEE TRANSACTIONS ON CYBERNETICS, 2018, 48 (11) : 3197 - 3207
  • [10] Digital twin-enabled self-evolved optical transceiver using deep reinforcement learning
    Li, Jin
    Wang, Danshi
    Zhang, Min
    Cui, Siheng
    OPTICS LETTERS, 2020, 45 (16) : 4654 - 4657