Agent Maze Path Planning Based on Simulated Annealing Q-Learning Algorithm

Cited: 0
Authors
Mao, Zhongtian [1]
Wu, Zipeng [1]
Fang, Xiaohan [1]
Cheng, Songsong [1]
Fan, Yuan [1]
Affiliations
[1] Anhui Univ, Anhui Engn Lab Human Robot Integrat Syst & Intell, Sch Elect Engn & Automat, Hefei 230601, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Reinforcement learning; Q-Learning; Simulated annealing algorithm; Maze path planning;
DOI
Not available
CLC Classification Number
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
The problem of path exploration and planning for agents in unknown environments is a popular application problem in the field of reinforcement learning. In this paper, we propose an improved reinforcement learning algorithm, the Q-Learning algorithm with adaptive exploration based on simulated annealing (AE-SAQL). We apply the algorithm to the agent path planning problem, improve the design of the reward function, and incorporate feedback information from the environment. By adopting the Metropolis criterion of the simulated annealing algorithm and adding an adaptive adjustment mechanism, the agent both explores the environment thoroughly and makes full use of the environmental information, resolving the exploration-exploitation dilemma and finally enabling the agent to reach the target location safely. Compared with the standard Q-Learning and SARSA algorithms, AE-SAQL achieves better performance.
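For context on the exploration strategy described in the abstract, the sketch below shows one common way to combine Metropolis-style action selection with tabular Q-Learning. It is a minimal illustration under stated assumptions, not the paper's AE-SAQL implementation: the geometric temperature schedule, the hyperparameters, and the hypothetical `env` interface (`reset()`, `step(action)`, `n_actions`) are placeholders, and the paper's adaptive adjustment mechanism and reward shaping are not reproduced here.

```python
# Minimal sketch of simulated-annealing-style action selection in tabular
# Q-Learning. All names, hyperparameters, and the env interface below are
# illustrative assumptions, not the paper's exact AE-SAQL formulation.
import math
import random
from collections import defaultdict

def sa_select_action(Q, state, n_actions, temperature):
    """Metropolis-style selection: a random candidate action is accepted over
    the greedy one with probability exp((Q_candidate - Q_greedy) / T)."""
    greedy = max(range(n_actions), key=lambda a: Q[(state, a)])
    candidate = random.randrange(n_actions)
    delta = Q[(state, candidate)] - Q[(state, greedy)]
    if delta >= 0 or random.random() < math.exp(delta / temperature):
        return candidate
    return greedy

def train(env, n_episodes=500, alpha=0.1, gamma=0.9,
          t_start=1.0, t_end=0.01, cooling=0.99):
    """Tabular Q-Learning with a geometrically cooled temperature, so the
    agent explores broadly early on and exploits learned values later."""
    Q = defaultdict(float)              # Q[(state, action)] -> value estimate
    temperature = t_start
    for _ in range(n_episodes):
        state, done = env.reset(), False
        while not done:
            action = sa_select_action(Q, state, env.n_actions, temperature)
            next_state, reward, done = env.step(action)
            # Standard Q-Learning update toward the greedy backup target.
            best_next = max(Q[(next_state, a)] for a in range(env.n_actions))
            Q[(state, action)] += alpha * (reward + gamma * best_next
                                           - Q[(state, action)])
            state = next_state
        temperature = max(t_end, temperature * cooling)  # anneal exploration
    return Q
```

With this scheme the acceptance probability of non-greedy actions shrinks as the temperature is cooled, which mirrors the explore-then-exploit behaviour the abstract attributes to AE-SAQL.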
Pages: 2272-2276
Number of pages: 5
Related Papers
50 records in total
  • [21] UAV path planning algorithm based on Deep Q-Learning to search for a lost in the ocean
    Boulares, Mehrez
    Fehri, Afef
    Jemni, Mohamed
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2024, 179
  • [22] Path planning for mobile robot based on improved ant colony Q-learning algorithm
    Mengru Cui
    Maowei He
    Hanning Chen
    Kunpeng Liu
    Yabao Hu
    Chen Zheng
    Xuliang Wang
    International Journal on Interactive Design and Manufacturing (IJIDeM), 2025, 19(4): 3069-3087
  • [23] Optimal path planning method based on epsilon-greedy Q-learning algorithm
    Vahide Bulut
    Journal of the Brazilian Society of Mechanical Sciences and Engineering, 2022, 44
  • [24] Simulated annealing Q-learning algorithm for ABR traffic control of ATM networks
    Li, Xin
    Zhou, Yucheng
    Dimirovski, Georgi M.
    Jing, Yuanwei
    2008 AMERICAN CONTROL CONFERENCE, VOLS 1-12, 2008: 4462+
  • [25] Path planning of UAV using guided enhancement Q-learning algorithm
    Zhou B.
    Guo Y.
    Li N.
    Zhong X.
    Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica, 2021, 42(9)
  • [26] Dynamic Path Planning of a Mobile Robot with Improved Q-Learning algorithm
    Li, Siding
    Xu, Xin
    Zuo, Lei
    2015 IEEE INTERNATIONAL CONFERENCE ON INFORMATION AND AUTOMATION, 2015: 409-414
  • [27] Extended Q-Learning Algorithm for Path-Planning of a Mobile Robot
    Goswami, Indrani
    Das, Pradipta Kumar
    Konar, Amit
    Janarthanan, R.
    SIMULATED EVOLUTION AND LEARNING, 2010, 6457: 379+
  • [28] An optimized Q-Learning algorithm for mobile robot local path planning
    Zhou, Qian
    Lian, Yang
    Wu, Jiayang
    Zhu, Mengyue
    Wang, Haiyong
    Cao, Jinli
    KNOWLEDGE-BASED SYSTEMS, 2024, 286
  • [29] Ant colony pheromone aided Q-learning path planning algorithm
    Tian X.-H.
    Huo X.
    Zhou D.-L.
    Zhao H.
    Kongzhi yu Juece/Control and Decision, 2023, 38(12): 3345-3353
  • [30] Synergism of Firefly Algorithm and Q-Learning for Robot Arm Path Planning
    Sadhu, Arup Kumar
    Konar, Amit
    Bhattacharjee, Tanuka
    Das, Swagatam
    SWARM AND EVOLUTIONARY COMPUTATION, 2018, 43: 50-68