A multi-robot path-planning algorithm for autonomous navigation using meta-reinforcement learning based on transfer learning

Cited by: 35
Authors
Wen, Shuhuan [1, 2]
Wen, Zeteng [1, 2]
Zhang, Di [1, 2]
Zhang, Hong [3]
Wang, Tao [1, 2]
Affiliations
[1] Yanshan Univ, Engn Res Ctr, Minist Educ Intelligent Control Syst & Intelligen, Qinhuangdao 066004, Hebei, Peoples R China
[2] Yanshan Univ, Key Lab Ind Comp Control Engn Hebei Prov, Qinhuangdao 066004, Hebei, Peoples R China
[3] Southern Univ Sci & Technol, Dept Elect & Elect Engn, Shenzhen 518000, Peoples R China
Keywords
Multi-robot system; Path planning; Deep reinforcement learning; Meta learning; Transfer learning;
DOI
10.1016/j.asoc.2021.107605
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
The adaptability of multi-robot systems in complex environments is an active research topic. To handle both static and dynamic obstacles in such environments, this paper presents dynamic proximal meta policy optimization with covariance matrix adaptation evolutionary strategies (dynamic-PMPO-CMA) for obstacle avoidance and autonomous navigation. First, we propose dynamic proximal policy optimization with covariance matrix adaptation evolutionary strategies (dynamic-PPO-CMA), based on the original proximal policy optimization (PPO), to obtain a valid obstacle-avoidance policy. Simulation results show that the proposed dynamic-PPO-CMA avoids obstacles and reaches the designated target position successfully. Second, to improve the adaptability of multi-robot systems across different environments, we integrate meta-learning with dynamic-PPO-CMA to form the dynamic-PMPO-CMA algorithm. During training, the proposed dynamic-PMPO-CMA trains the robots to learn a multi-task policy. Finally, during testing, transfer learning is introduced into the dynamic-PMPO-CMA algorithm: the trained parameters of the meta policy are transferred to new environments and used as the initial parameters. Simulation results show that the proposed algorithm converges faster and reaches the destination more quickly than PPO, PMPO and dynamic-PPO-CMA. (C) 2021 Elsevier B.V. All rights reserved.
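The last step of the abstract (reusing meta-trained parameters as the starting point in a new environment) can be illustrated with a toy sketch. This is not the dynamic-PMPO-CMA algorithm: it swaps in a Reptile-style first-order meta-update, and the quadratic losses standing in for navigation environments, the function names (`adapt`, `meta_train`, `steps_to_converge`), and every hyperparameter are assumptions made purely for illustration.

```python
import random

def loss(theta, target):
    # Toy stand-in for a task: squared distance to a task-specific optimum.
    return 0.5 * sum((t - c) ** 2 for t, c in zip(theta, target))

def grad(theta, target):
    # Gradient of the quadratic loss above.
    return [t - c for t, c in zip(theta, target)]

def adapt(theta, target, lr=0.1, steps=5):
    # A few gradient steps on one task (stand-in for inner-loop policy updates).
    theta = list(theta)
    for _ in range(steps):
        theta = [t - lr * g for t, g in zip(theta, grad(theta, target))]
    return theta

def meta_train(tasks, dim, meta_lr=0.5, epochs=200, seed=0):
    # Reptile-style meta-update (an assumption, not the paper's update rule):
    # move the shared initialization toward each task's adapted parameters.
    rng = random.Random(seed)
    meta = [0.0] * dim
    for _ in range(epochs):
        adapted = adapt(meta, rng.choice(tasks))
        meta = [m + meta_lr * (a - m) for m, a in zip(meta, adapted)]
    return meta

def steps_to_converge(theta, target, lr=0.1, tol=1e-3, max_steps=10_000):
    # Count gradient steps until the loss on the new task drops below tol.
    theta = list(theta)
    for step in range(max_steps):
        if loss(theta, target) < tol:
            return step
        theta = [t - lr * g for t, g in zip(theta, grad(theta, target))]
    return max_steps

# Hypothetical family of related tasks clustered around a common optimum.
rng = random.Random(1)
center = [5.0, -3.0]
train_tasks = [[c + rng.uniform(-0.5, 0.5) for c in center] for _ in range(8)]
new_task = [c + rng.uniform(-0.5, 0.5) for c in center]

meta_init = meta_train(train_tasks, dim=2)

# Transfer: start from the meta-trained parameters vs. from scratch.
steps_meta = steps_to_converge(meta_init, new_task)
steps_scratch = steps_to_converge([0.0, 0.0], new_task)
print("steps from meta-init:", steps_meta)
print("steps from scratch:  ", steps_scratch)
```

In this toy setting the meta-trained initialization lies close to every task in the family, so fine-tuning on the unseen task needs fewer gradient steps than starting from zeros. That mirrors, in a much simplified form, the faster convergence the abstract reports for transferred meta-policy parameters.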
Pages: 15
Related papers
50 in total
  • [1] Multi-robot path planning based on a deep reinforcement learning DQN algorithm
    Yang Yang
    Li Juntao
    Peng Lingling
    [J]. CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2020, 5 (03) : 177 - 183
  • [2] Multi-Robot Path Planning Method Using Reinforcement Learning
    Bae, Hyansu
    Kim, Gidong
    Kim, Jonguk
    Qian, Dianwei
    Lee, Sukgyu
    [J]. APPLIED SCIENCES-BASEL, 2019, 9 (15):
  • [3] A complete multi-robot path-planning algorithm
    Alotaibi, Ebtehal Turki Saho
    Al-Rawi, Hisham
    [J]. AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2018, 32 (05) : 693 - 740
  • [4] A complete multi-robot path-planning algorithm
    Ebtehal Turki Saho Alotaibi
    Hisham Al-Rawi
    [J]. Autonomous Agents and Multi-Agent Systems, 2018, 32 : 693 - 740
  • [5] An Adaptive Memetic Algorithm for Multi-robot Path-Planning
    Rakshit, Pratyusha
    Banerjee, Dhrubojyoti
    Konar, Amit
    Janarthanan, Ramadoss
    [J]. SWARM, EVOLUTIONARY, AND MEMETIC COMPUTING, (SEMCCO 2012), 2012, 7677 : 248 - 258
  • [6] Mobile Robot Navigation System Using Reinforcement Learning with Path Planning Algorithm
    Andarge, E.W.
    Ordys, A.
    Abebe, Y.M.
    [J]. Acta Physica Polonica A, 2024, 146 (04) : 452 - 456
  • [7] Navigation and Path Planning Using Reinforcement Learning for a Roomba Robot
    Romero-Marti, Daniel Paul
    Nunez-Varela, Jose Ignacio
    Soubervielle-Montalvo, Carlos
    Orozco-de-la-Paz, Alfredo
    [J]. 2016 XVIII CONGRESO MEXICANO DE ROBOTICA (COMROB 2016), 2016,
  • [8] A Complete Multi-Robot Path-Planning Algorithm JAAMAS Track
    Alotaibi, Ebtehal Turki Saho
    [J]. AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019 : 158 - 160
  • [9] Robot path planning algorithm based on reinforcement learning
    Zhang, Fuhai
    Li, Ning
    Yuan, Rupeng
    Fu, Yili
    [J]. Huazhong Keji Daxue Xuebao (Ziran Kexue Ban)/Journal of Huazhong University of Science and Technology (Natural Science Edition), 2018, 46 (12) : 65 - 70
  • [10] Multi-robot path planning using learning-based Artificial Bee Colony algorithm
    Cui, Yibing
    Hu, Wei
    Rahmani, Ahmed
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 129