An Adaptive Conversion Speed Q-Learning Algorithm for Search and Rescue UAV Path Planning in Unknown Environments

Cited by: 8

Authors:

Wu, Jiehong [1]

Sun, Ya'nan [1]

Li, Danyang [1]

Shi, Junling [1]

Li, Xianwei [2]

Gao, Lijun [1]

Yu, Lei [1]

Han, Guangjie [3,4]

Wu, Jinsong [5,6]

Affiliations:

[1] Shenyang Aerosp Univ, Sch Comp Sci, Shenyang 110136, Peoples R China

[2] Bengbu Univ, Sch Comp Sci & Informat Engn, Bengbu 233030, Peoples R China

[3] Hohai Univ, Dept Internet Things Engn, Changzhou 213022, Peoples R China

[4] Chinese Acad Sci, Inst Acoust, State Key Lab Acoust, Beijing 100190, Peoples R China

[5] Guilin Univ Elect Technol, Sch Artificial Intelligence, Guilin 541004, Peoples R China

[6] Univ Chile, Dept Elect Engn, Santiago 9170124, Chile

Source:

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY | 2023 / Vol. 72 / Issue 12

Funding:

National Natural Science Foundation of China;

Keywords:

Adaptive conversion speed; path planning; Q-Learning; search and rescue; unmanned aerial vehicle (UAV);

DOI:

10.1109/TVT.2023.3297837

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Subject classification codes:

0808 ; 0809 ;

Abstract:

With the wide application of unmanned aerial vehicles (UAVs), performing search and rescue missions autonomously in unknown environments has become an issue of increasing concern. In this article, we propose an adaptive conversion speed Q-Learning algorithm (ACSQL). Autonomous UAV mission execution is divided into two stages: a rescue mission search stage and an optimal path search stage. In the first stage, the UAV finds task points as quickly as possible, and exploration efficiency is increased by adaptively adjusting the speed of the UAV. In the second stage, to obtain a safe and short path, we propose a subdomain search algorithm. Based on these two stages, we improve the state space and action space in reinforcement learning, design a composite reward function, and finally obtain the path along which the UAV performs multiple search and rescue missions. To address slow training convergence and high uncertainty, we initialize the Q-table using the detection information gathered by the UAV sensors in the first stage. Simulation results show that the ACSQL algorithm can realize autonomous navigation and path planning of a UAV in an unknown environment. Compared with the traditional action space, the learning process of the UAV converges faster and is more stable, converging in about 30 episodes. Compared with the DDPG and IDWA algorithms in different scenarios, the ACSQL algorithm yields the shortest path length. Finally, the ACSQL algorithm is verified in the UAV simulator AirSim.

Pages: 15391-15404

Page count: 14
