A multi-robot path-planning algorithm for autonomous navigation using meta-reinforcement learning based on transfer learning

Cited by: 35

Authors:

Wen, Shuhuan [1,2]

Wen, Zeteng [1,2]

Zhang, Di [1,2]

Zhang, Hong [3]

Wang, Tao [1,2]

Affiliations:

[1] Yanshan Univ, Engn Res Ctr, Minist Educ Intelligent Control Syst & Intelligen, Qinhuangdao 066004, Hebei, Peoples R China

[2] Yanshan Univ, Key Lab Ind Comp Control Engn Hebei Prov, Qinhuangdao 066004, Hebei, Peoples R China

[3] Southern Univ Sci & Technol, Dept Elect & Elect Engn, Shenzhen 518000, Peoples R China

Source:

APPLIED SOFT COMPUTING | 2021 / Volume 110 / Issue 110

Keywords:

Multi-robot system; Path planning; Deep reinforcement learning; Meta learning; Transfer learning;

DOI:

10.1016/j.asoc.2021.107605

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The adaptability of multi-robot systems in complex environments is an active research topic. To handle static and dynamic obstacles in complex environments, this paper presents dynamic proximal meta policy optimization with covariance matrix adaptation evolutionary strategies (dynamic-PMPO-CMA) for obstacle avoidance and autonomous navigation. First, we propose dynamic proximal policy optimization with covariance matrix adaptation evolutionary strategies (dynamic-PPO-CMA), based on the original proximal policy optimization (PPO), to obtain a valid obstacle-avoidance policy. Simulation results show that the proposed dynamic-PPO-CMA avoids obstacles and reaches the designated target position successfully. Second, to improve the adaptability of multi-robot systems across different environments, we integrate meta-learning with dynamic-PPO-CMA to form the dynamic-PMPO-CMA algorithm. During training, we use the proposed dynamic-PMPO-CMA to train robots to learn a multi-task policy. Finally, during testing, transfer learning is introduced into the proposed dynamic-PMPO-CMA algorithm: the trained parameters of the meta policy are transferred to new environments and used as the initial parameters. Simulation results show that the proposed algorithm converges faster and reaches the destination more quickly than PPO, PMPO, and dynamic-PPO-CMA. (C) 2021 Elsevier B.V. All rights reserved.
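The transfer step described in the abstract (reusing trained meta-policy parameters as the starting point in a new environment) can be illustrated with a toy sketch. This is not the paper's dynamic-PMPO-CMA algorithm: one-dimensional quadratic losses stand in for navigation tasks, a Reptile-style update stands in for the meta-learning step, and every function name, target value, and hyperparameter below is a hypothetical choice made for illustration only.

```python
# Toy illustration only: 1-D quadratic "tasks" stand in for navigation environments.
# Nothing here is the paper's dynamic-PMPO-CMA; all names and values are hypothetical.

def steps_to_converge(theta0, target, lr=0.1, tol=1e-3, max_steps=10_000):
    """Run gradient descent on (theta - target)^2 and count steps until |theta - target| < tol."""
    theta = theta0
    for step in range(max_steps):
        if abs(theta - target) < tol:
            return step
        theta -= lr * 2.0 * (theta - target)  # gradient of (theta - target)^2
    return max_steps


def meta_train(task_targets, lr=0.1, inner_steps=5, meta_lr=0.5, epochs=50):
    """Reptile-style stand-in for meta-training: move a shared initial
    parameter toward each task's adapted parameter."""
    meta_theta = 0.0
    for _ in range(epochs):
        for target in task_targets:
            theta = meta_theta
            for _ in range(inner_steps):           # inner adaptation on one task
                theta -= lr * 2.0 * (theta - target)
            meta_theta += meta_lr * (theta - meta_theta)  # outer (meta) update
    return meta_theta


train_targets = [4.0, 5.0, 6.0]  # hypothetical training tasks
new_target = 5.5                 # hypothetical unseen task

meta_theta = meta_train(train_targets)
cold = steps_to_converge(0.0, new_target)         # start from scratch
warm = steps_to_converge(meta_theta, new_target)  # start from transferred parameters
print(f"meta init = {meta_theta:.3f}, cold-start steps = {cold}, warm-start steps = {warm}")
```

In this toy setting the transferred initialization starts closer to the new task's optimum, so it needs fewer update steps than a cold start. That mirrors the kind of convergence comparison the abstract reports, without reproducing its actual experiments.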


Pages: 15

50 items in total

[1] Multi-robot path planning based on a deep reinforcement learning DQN algorithm
Yang Yang
Li Juntao
Peng Lingling
[J]. CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2020, 5 (03) : 177 - 183
[2] Multi-Robot Path Planning Method Using Reinforcement Learning
Bae, Hyansu
Kim, Gidong
Kim, Jonguk
Qian, Dianwei
Lee, Sukgyu
[J]. APPLIED SCIENCES-BASEL, 2019, 9 (15):
[3] A complete multi-robot path-planning algorithm
Alotaibi, Ebtehal Turki Saho
Al-Rawi, Hisham
[J]. AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2018, 32 (05) : 693 - 740
[4] A complete multi-robot path-planning algorithm
Ebtehal Turki Saho Alotaibi
Hisham Al-Rawi
[J]. Autonomous Agents and Multi-Agent Systems, 2018, 32 : 693 - 740
[5] An Adaptive Memetic Algorithm for Multi-robot Path-Planning
Rakshit, Pratyusha
Banerjee, Dhrubojyoti
Konar, Amit
Janarthanan, Ramadoss
[J]. SWARM, EVOLUTIONARY, AND MEMETIC COMPUTING, (SEMCCO 2012), 2012, 7677 : 248 - 258
[6] Mobile Robot Navigation System Using Reinforcement Learning with Path Planning Algorithm
Andarge, E.W.
Ordys, A.
Abebe, Y.M.
[J]. Acta Physica Polonica A, 2024, 146 (04) : 452 - 456
[7] Navigation and Path Planning Using Reinforcement Learning for a Roomba Robot
Romero-Marti, Daniel Paul
Nunez-Varela, Jose Ignacio
Soubervielle-Montalvo, Carlos
Orozco-de-la-Paz, Alfredo
[J]. 2016 XVIII CONGRESO MEXICANO DE ROBOTICA (COMROB 2016), 2016,
[8] A Complete Multi-Robot Path-Planning Algorithm JAAMAS Track
Alotaibi, Ebtehal Turki Saho
[J]. AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 158 - 160
[9] Robot path planning algorithm based on reinforcement learning
Zhang, Fuhai
Li, Ning
Yuan, Rupeng
Fu, Yili
[J]. Huazhong Keji Daxue Xuebao (Ziran Kexue Ban)/Journal of Huazhong University of Science and Technology (Natural Science Edition), 2018, 46 (12) : 65 - 70
[10] Multi-robot path planning using learning-based Artificial Bee Colony algorithm
Cui, Yibing
Hu, Wei
Rahmani, Ahmed
[J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 129
