Scalable Multi-Robot Cooperation for Multi-Goal Tasks Using Reinforcement Learning

Cited: 0
Authors
An, Tianxu [1 ]
Lee, Joonho [2 ]
Bjelonic, Marko [3 ]
De Vincenti, Flavio [4 ]
Hutter, Marco [1 ]
Affiliations
[1] Robot Syst Lab, CH-8092 Zurich, Switzerland
[2] Neuromeka Co Ltd, Seoul 04782, South Korea
[3] Swiss Mile Robot AG, CH-8092 Zurich, Switzerland
[4] Swiss Fed Inst Technol, Computat Robot Lab, CH-8092 Zurich, Switzerland
Source
IEEE ROBOTICS AND AUTOMATION LETTERS | 2025, Vol. 10, No. 2
Funding
Swiss National Science Foundation; European Research Council;
Keywords
Robots; Navigation; Training; Neural networks; Collision avoidance; Mobile robots; Reinforcement learning; Quadrupedal robots; Vectors; Scalability; Legged locomotion; multi-robot systems; reinforcement learning;
DOI
10.1109/LRA.2024.3521183
CLC Classification Number
TP24 [Robotics];
Subject Classification Number
080202 ; 1405 ;
Abstract
Coordinated navigation of an arbitrary number of robots to an arbitrary number of goals is a major challenge in robotics, often hindered by the scalability limitations of existing strategies. This letter introduces a decentralized multi-agent control system using neural network policies trained in simulation. By leveraging permutation invariant neural network architectures and model-free reinforcement learning, our policy enables robots to prioritize varying numbers of collaborating robots and goals in a zero-shot manner without being biased by ordering or limited by a fixed capacity. We validate the task performance and scalability of our policies through experiments in both simulation and real-world settings. Our approach achieves a 10.3% higher success rate in collaborative navigation tasks compared to a policy without a permutation invariant encoder. Additionally, it finds near-optimal solutions for multi-robot navigation problems while being two orders of magnitude faster than an optimization-based centralized controller. We deploy our multi-goal navigation policies on two wheeled-legged quadrupedal robots, which successfully complete a series of multi-goal navigation missions.
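The abstract's key architectural ingredient is a permutation invariant encoder, which lets one policy ingest a variable-size, unordered set of neighbor and goal observations. The record gives no architectural details, so the following is only a minimal deep-sets-style sketch of that general idea, not the authors' design; all dimensions and the random weights standing in for trained parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the paper's record does not specify them.
OBS_DIM, HID_DIM, OUT_DIM = 4, 16, 8

# Random matrices stand in for trained network parameters.
W_phi = rng.standard_normal((OBS_DIM, HID_DIM))
W_rho = rng.standard_normal((HID_DIM, OUT_DIM))

def encode(neighbors: np.ndarray) -> np.ndarray:
    """Encode a variable-size set of observations, shape (N, OBS_DIM)
    for any N >= 1, into a fixed-size vector. Each element is embedded
    independently (phi), pooled with a symmetric operation (mean), and
    projected (rho), so the result is independent of input ordering."""
    h = np.tanh(neighbors @ W_phi)   # per-element embedding
    pooled = h.mean(axis=0)          # symmetric pooling: order-free
    return np.tanh(pooled @ W_rho)   # fixed-size output code

# Any number of neighbors yields the same-size code, and shuffling
# their order leaves the code unchanged.
obs = rng.standard_normal((5, OBS_DIM))
code = encode(obs)
shuffled = encode(obs[rng.permutation(5)])
```

Because the pooling step is symmetric, the encoder is neither biased by the ordering of teammates and goals nor capped at a fixed count, which is what makes zero-shot scaling to different team sizes possible.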
Pages: 1585 - 1592
Page count: 8
Related Papers
50 items total
  • [31] Efficient Multi-Goal Reinforcement Learning via Value Consistency Prioritization
    Xu, Jiawei
    Li, Shuxing
    Yang, Rui
    Yuan, Chun
    Han, Lei
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2023, 77 : 355 - 376
  • [32] Planning multi-goal tours for robot arms
    Saha, M
    Sánchez-Ante, G
    Latombe, JC
    2003 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1-3, PROCEEDINGS, 2003, : 3797 - 3803
  • [33] Distributed scalable multi-robot learning using particle swarm optimization
    Pugh, J.
    Martinoli, A.
    SWARM INTELLIGENCE, 2009, 3 (03) : 203 - 222
  • [35] Multi-Robot Cooperation Based on Continuous Reinforcement Learning with Two State Space Representations
    Yasuda, Toshiyuki
    Ohkura, Kazuhiro
    Yamada, Kazuaki
    2013 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2013), 2013, : 4470 - 4475
  • [36] Safe multi-agent reinforcement learning for multi-robot control
    Gu, Shangding
    Kuba, Jakub Grudzien
    Chen, Yuanpei
    Du, Yali
    Yang, Long
    Knoll, Alois
    Yang, Yaodong
    ARTIFICIAL INTELLIGENCE, 2023, 319
  • [37] Multi-robot concurrent learning of fuzzy rules for cooperation
    Liu, Z
    Ang, MH
    Seah, WKG
    2005 IEEE INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE IN ROBOTICS AND AUTOMATION, PROCEEDINGS, 2005, : 713 - 719
  • [38] Counterexample Guided Abstraction Refinement with Non-Refined Abstractions for Multi-Goal Multi-Robot Path Planning
    Surynek, Pavel
    2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2023, : 7341 - 7347
  • [40] Learning Multi-Goal Dialogue Strategies Using Reinforcement Learning With Reduced State-Action Spaces
    Cuayahuitl, Heriberto
    Renals, Steve
    Lemon, Oliver
    Shimodaira, Hiroshi
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 469 - +