A novel Q-learning algorithm based on improved whale optimization algorithm for path planning

Cited by: 6

Authors:

Li, Ying [1,2]

Wang, Hanyu [1,2]

Fan, Jiahao [3]

Geng, Yanyu [1,2]

Affiliations:

[1] Jilin Univ, Coll Comp Sci & Technol, Changchun, Peoples R China

[2] Jilin Univ, Minist Educ, Key Lab Symbol Computat & Knowledge Engn, Changchun, Peoples R China

[3] Sichuan Univ, Coll Comp Sci, Chengdu, Peoples R China

Source:

PLOS ONE | 2022 / Vol. 17 / No. 12

Keywords:

MOTH-FLAME OPTIMIZATION; MOBILE ROBOT; MEGAPTERA-NOVAEANGLIAE; HUMPBACK WHALES; NAVIGATION; SONGS; POWER;

DOI:

10.1371/journal.pone.0279438

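Chinese Library Classification (CLC):

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];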

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

Q-learning is a classical reinforcement learning algorithm and one of the most important methods for mobile robot path planning without a prior environmental model. However, standard Q-learning initializes its Q-table too simply and spends too much time on exploration, which slows convergence. This paper proposes a new Q-learning algorithm, the Paired Whale Optimization Q-learning Algorithm (PWOQLA), which includes four improvements. First, to accelerate the convergence of Q-learning, a whale optimization algorithm is used to initialize the values of the Q-table, so that a Q-table containing prior experience is learned before the exploration process begins. Second, to improve the local exploitation capability of the whale optimization algorithm, a paired whale optimization algorithm is proposed that uses a pairing strategy to speed up the search for prey. Third, to improve the exploration efficiency of Q-learning and reduce the number of useless explorations, a new selective exploration strategy is introduced that considers the relationship between the current position and the target position. Fourth, to balance exploration and exploitation so that Q-learning focuses on exploration in the early stage and on exploitation in the later stage, a nonlinear function is designed that dynamically changes the value of epsilon in epsilon-greedy Q-learning based on the number of iterations. Experimental comparisons with other path planning algorithms show that PWOQLA achieves higher accuracy and faster convergence than existing counterparts in mobile robot path planning. The code will be released at https://github.com/wanghanyu0526/improveQL.git.
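The fourth improvement, an epsilon that shrinks nonlinearly with the iteration count, can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual function, so the cosine-shaped schedule below, the names epsilon_schedule, choose_action and q_update, and all default parameter values are assumptions chosen only for illustration. The update rule is the standard tabular Q-learning update, not the PWOQLA-specific procedure.

```python
import math
import random


def epsilon_schedule(episode, total_episodes, eps_start=0.9, eps_end=0.05):
    """Hypothetical nonlinear decay (cosine-shaped), not the paper's function.

    Epsilon stays high early (exploration) and falls toward eps_end
    as training proceeds (exploitation).
    """
    progress = min(max(episode / total_episodes, 0.0), 1.0)
    return eps_end + (eps_start - eps_end) * 0.5 * (1.0 + math.cos(math.pi * progress))


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy: random action with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place to q."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


if __name__ == "__main__":
    rng = random.Random(0)
    q = [[0.0, 0.0], [1.0, 2.0]]  # toy table: 2 states x 2 actions
    eps = epsilon_schedule(episode=10, total_episodes=100)
    action = choose_action(q[0], eps, rng)
    q_update(q, state=0, action=action, reward=1.0, next_state=1)
    print(eps, action, q[0])
```

In this sketch the Q-table starts from hand-written values. In the approach described in the abstract, that starting table would instead come from the whale optimization step, which is not reproduced here.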

Pages: 30
