A novel Q-learning algorithm based on improved whale optimization algorithm for path planning

Cited by: 6
Authors
Li, Ying [1, 2]
Wang, Hanyu [1, 2]
Fan, Jiahao [3]
Geng, Yanyu [1, 2]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun, Peoples R China
[2] Jilin Univ, Minist Educ, Key Lab Symbol Computat & Knowledge Engn, Changchun, Peoples R China
[3] Sichuan Univ, Coll Comp Sci, Chengdu, Peoples R China
Source
PLOS ONE | 2022 / Vol. 17 / Issue 12
Keywords
MOTH-FLAME OPTIMIZATION; MOBILE ROBOT; MEGAPTERA-NOVAEANGLIAE; HUMPBACK WHALES; NAVIGATION; SONGS; POWER;
DOI
10.1371/journal.pone.0279438
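Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract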
Q-learning is a classical reinforcement learning algorithm and one of the most important methods for mobile robot path planning without a prior environmental model. However, Q-learning initializes its Q-table too simply and spends too much time on exploration, which slows convergence. This paper proposes a new Q-learning algorithm, the Paired Whale Optimization Q-learning Algorithm (PWOQLA), which includes four improvements. First, to accelerate the convergence of Q-learning, a whale optimization algorithm is used to initialize the values of the Q-table, so that a Q-table containing prior experience is learned before the exploration process begins, improving the algorithm's efficiency. Second, to improve the local exploitation capability of the whale optimization algorithm, a paired whale optimization algorithm is proposed that adds a pairing strategy to speed up the search for prey. Third, to improve the exploration efficiency of Q-learning and reduce the number of useless explorations, a new selective exploration strategy is introduced that considers the relationship between the current position and the target position. Fourth, to balance exploration and exploitation so that Q-learning focuses on exploration in the early stage and on exploitation in the later stage, a nonlinear function is designed that dynamically changes the value of epsilon in epsilon-greedy Q-learning based on the number of iterations. Experimental results comparing PWOQLA with other path planning algorithms show that it achieves higher accuracy and faster convergence than existing counterparts in mobile robot path planning. The code will be released at https://github.com/wanghanyu0526/improveQL.git.
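To make the third and fourth improvements concrete, the following is a minimal Python sketch of epsilon-greedy Q-learning on a small grid. It is not the authors' implementation. The abstract does not give the nonlinear epsilon function or the exact selective exploration rule, so the exponential decay in `epsilon_schedule`, the Manhattan-distance filter in `explore_action`, the reward values, and all function and parameter names are assumptions made for illustration. The whale-optimization initialization of the Q-table is omitted; the table starts at zero here.

```python
import math
import random

# Four grid moves: up, down, left, right.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def epsilon_schedule(episode, total_episodes, eps_start=0.9, eps_end=0.05, k=5.0):
    """Assumed nonlinear decay: high epsilon early, low epsilon late."""
    frac = episode / max(1, total_episodes - 1)
    return eps_end + (eps_start - eps_end) * math.exp(-k * frac)


def step(state, action, size, obstacles):
    """Apply a move; blocked moves (wall or obstacle) leave the state unchanged."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < size and 0 <= c < size) or (r, c) in obstacles:
        return state
    return (r, c)


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def explore_action(state, goal, size, obstacles, rng):
    """Assumed selective exploration: prefer moves that reduce distance to the goal."""
    here = manhattan(state, goal)
    closer = [i for i, a in enumerate(ACTIONS)
              if manhattan(step(state, a, size, obstacles), goal) < here]
    return rng.choice(closer) if closer else rng.randrange(len(ACTIONS))


def greedy_action(q, state):
    return max(range(len(ACTIONS)), key=q[state].__getitem__)


def train(size, start, goal, obstacles, episodes=300, alpha=0.5, gamma=0.9,
          max_steps=200, seed=0):
    rng = random.Random(seed)
    # Zero-initialized Q-table (PWOQLA would initialize it with whale optimization).
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(size) for c in range(size)}
    for ep in range(episodes):
        eps = epsilon_schedule(ep, episodes)
        s = start
        for _ in range(max_steps):
            if rng.random() < eps:
                a = explore_action(s, goal, size, obstacles, rng)
            else:
                a = greedy_action(q, s)
            s2 = step(s, ACTIONS[a], size, obstacles)
            reward = 100.0 if s2 == goal else -1.0  # assumed reward design
            target = reward if s2 == goal else reward + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])   # standard Q-learning update
            s = s2
            if s == goal:
                break
    return q


def greedy_path(q, start, goal, size, obstacles, max_steps=100):
    """Follow the learned greedy policy from start until the goal or a step limit."""
    path, s = [start], start
    while s != goal and len(path) <= max_steps:
        s = step(s, ACTIONS[greedy_action(q, s)], size, obstacles)
        path.append(s)
    return path


if __name__ == "__main__":
    obstacles = {(1, 1), (2, 3), (3, 1)}
    q = train(5, (0, 0), (4, 4), obstacles)
    print(greedy_path(q, (0, 0), (4, 4), 5, obstacles))
```

The exploration branch restricts random moves to those that bring the agent closer to the target, which is one simple reading of "considers the relationship between current position and target position"; the decay constant `k` controls how quickly the agent shifts from exploration to exploitation.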
Pages: 30