A novel Q-learning algorithm based on improved whale optimization algorithm for path planning

Cited by: 6
Authors
Li, Ying [1 ,2 ]
Wang, Hanyu [1 ,2 ]
Fan, Jiahao [3 ]
Geng, Yanyu [1 ,2 ]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun, Peoples R China
[2] Jilin Univ, Minist Educ, Key Lab Symbol Computat & Knowledge Engn, Changchun, Peoples R China
[3] Sichuan Univ, Coll Comp Sci, Chengdu, Peoples R China
Source
PLOS ONE | 2022 / Vol. 17 / Issue 12
Keywords
MOTH-FLAME OPTIMIZATION; MOBILE ROBOT; MEGAPTERA-NOVAEANGLIAE; HUMPBACK WHALES; NAVIGATION; SONGS; POWER;
DOI
10.1371/journal.pone.0279438
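Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract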
Q-learning is a classical reinforcement learning algorithm and one of the most important methods for mobile robot path planning without a prior environmental model. However, Q-learning initializes its Q-table naively and spends too much time on exploration, which slows convergence. This paper proposes a new Q-learning algorithm, the Paired Whale Optimization Q-learning Algorithm (PWOQLA), which includes four improvements. First, to accelerate convergence, a whale optimization algorithm is used to initialize the Q-table values, so that a Q-table containing prior experience is learned before the exploration process begins. Second, to improve the local exploitation capability of the whale optimization algorithm, a paired whale optimization algorithm is proposed that adds a pairing strategy to speed up the search for prey. Third, to improve exploration efficiency and reduce the number of useless explorations, a new selective exploration strategy is introduced that considers the relationship between the current position and the target position. Fourth, to balance exploration and exploitation so that Q-learning focuses on exploration in the early stage and on exploitation in the later stage, a nonlinear function is designed that dynamically changes the value of epsilon in epsilon-greedy Q-learning based on the number of iterations. In experiments comparing PWOQLA with other path planning algorithms, PWOQLA achieves higher accuracy and faster convergence than existing counterparts in mobile robot path planning. The code will be released at https://github.com/wanghanyu0526/improveQL.git.
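The fourth improvement, an epsilon that shrinks nonlinearly with the iteration count, can be illustrated with a small sketch. The abstract does not state the actual function, so the exponential decay below is only an assumed stand-in, and the grid, rewards, hyperparameters, and all function names are hypothetical. The sketch is plain epsilon-greedy Q-learning with a zero-initialized Q-table; it does not include the whale optimization initialization or the selective exploration strategy.

```python
import math
import random

# Hypothetical 5x5 grid world; none of these values come from the paper.
SIZE, START, GOAL = 5, (0, 0), (4, 4)
OBSTACLES = frozenset({(1, 1), (2, 3)})
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def epsilon_schedule(episode, total_episodes, eps_start=0.9, eps_end=0.05, sharpness=5.0):
    """Assumed nonlinear decay: exponential fall from eps_start toward eps_end."""
    progress = episode / max(1, total_episodes - 1)
    return eps_end + (eps_start - eps_end) * math.exp(-sharpness * progress)


def step(state, action):
    """Apply one move; blocked moves leave the robot in place with a penalty."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    nxt = (row + d_row, col + d_col)
    inside = 0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE
    if not inside or nxt in OBSTACLES:
        return state, -1.0, False
    if nxt == GOAL:
        return nxt, 10.0, True
    return nxt, -0.1, False


def train(episodes=500, alpha=0.5, gamma=0.9, max_steps=200, seed=0):
    """Tabular Q-learning with an epsilon that depends on the episode index."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(SIZE) for c in range(SIZE)}
    for episode in range(episodes):
        eps = epsilon_schedule(episode, episodes)
        state = START
        for _ in range(max_steps):
            if rng.random() < eps:
                action = rng.randrange(len(ACTIONS))  # explore
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])  # exploit
            nxt, reward, done = step(state, action)
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the highest-valued action from START until GOAL or the step limit."""
    path, state = [START], START
    for _ in range(max_steps):
        if state == GOAL:
            break
        action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state, _reward, _done = step(state, action)
        path.append(state)
    return path


if __name__ == "__main__":
    q_table = train()
    print(greedy_path(q_table))
```

With this schedule, early episodes act almost randomly (epsilon near 0.9) and late episodes are mostly greedy (epsilon near 0.05), which matches the early-exploration, late-exploitation balance the abstract describes. Swapping in a different nonlinear function only requires changing `epsilon_schedule`.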
Pages: 30