RSMDP-BASED ROBUST Q-LEARNING FOR OPTIMAL PATH PLANNING IN A DYNAMIC ENVIRONMENT

Cited by: 4
Authors
Zhang, Yunfei [1 ]
Li, Weilin [2 ]
de Silva, Clarence W. [1 ]
Institutions
[1] Univ British Columbia, Dept Mech Engn, Vancouver, BC V5Z 1M9, Canada
[2] Northwestern Polytech Univ, Dept Elect Engn, Xian, Peoples R China
Source
International Journal of Robotics and Automation
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC); Canada Foundation for Innovation;
Keywords
Online Q-learning; optimal path planning; probabilistic roadmap; Markov decision process; unknown dynamic obstacles; OBSTACLE AVOIDANCE; ROBOT;
DOI
10.2316/Journal.206.2016.4.206-4255
CLC Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812
Abstract
This paper presents a robust Q-learning method for path planning in a dynamic environment. The method consists of three steps: first, a regime-switching Markov decision process (RSMDP) is formed to represent the dynamic environment; second, a probabilistic roadmap (PRM) is constructed, integrated with the RSMDP, and stored as a graph whose nodes correspond to collision-free world states for the robot; finally, an online Q-learning method with a dynamic step size, which facilitates robust convergence of the Q-value iteration, is integrated with the PRM to determine an optimal path for reaching the goal. In this manner, the robot is able to use past experience to improve its performance in avoiding not only static obstacles but also moving obstacles, without knowing the nature of the obstacle motion. The use of regime switching in the avoidance of obstacles with unknown motion is particularly innovative. The developed approach is applied to a homecare robot in computer simulation. The results show that the online path planner with Q-learning is able to rapidly and successfully converge to the correct path.
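The third step above can be illustrated with a minimal sketch: tabular Q-learning over a small roadmap-like graph, with a step size that decays with the visit count of each state-action pair (one common "dynamic step size" schedule satisfying the Robbins-Monro convergence conditions; the paper's exact schedule, RSMDP regimes, and PRM construction are not reproduced here). The 5-node graph, rewards, and hyperparameters are illustrative assumptions.

```python
import random

# Toy roadmap: node -> list of neighbour nodes reachable by an edge.
# Node 4 is the goal (absorbing). This stands in for a PRM graph.
GRAPH = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 3],
    3: [1, 2, 4],
    4: [],
}
GOAL, GAMMA, EPISODES = 4, 0.9, 500

def train(seed=0):
    """Online Q-learning with a per-pair visit-count step size."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in GRAPH for a in GRAPH[s]}
    visits = {sa: 0 for sa in Q}
    for _ in range(EPISODES):
        s = 0
        while s != GOAL:
            # Epsilon-greedy choice over outgoing edges of the roadmap.
            if rng.random() < 0.2:
                a = rng.choice(GRAPH[s])
            else:
                a = max(GRAPH[s], key=lambda n: Q[(s, n)])
            r = 0.0 if a == GOAL else -1.0          # unit step cost
            nxt = 0.0 if a == GOAL else max(Q[(a, n)] for n in GRAPH[a])
            visits[(s, a)] += 1
            alpha = 1.0 / visits[(s, a)]            # dynamic step size
            Q[(s, a)] += alpha * (r + GAMMA * nxt - Q[(s, a)])
            s = a
    return Q

def greedy_path(Q, start=0):
    """Extract the path implied by the learned Q-values."""
    path, s = [start], start
    while s != GOAL:
        s = max(GRAPH[s], key=lambda n: Q[(s, n)])
        path.append(s)
    return path
```

With the 1/visits schedule, early updates move the estimates quickly while later updates average noise away, which is the practical motivation for a dynamic rather than constant step size. On this toy graph, `greedy_path(train())` recovers a shortest 4-node route from node 0 to the goal.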
Pages: 290-300
Page count: 11
Related Papers (50 total)
  • [1] RSMDP-based robust Q-learning for optimal path planning in a dynamic environment
    Zhang, Yunfei
    Li, Weilin
    De Silva, Clarence W.
    [J]. International Journal of Robotics and Automation, 2016, 31 (04) : 290 - 300
  • [2] A Path-Planning Approach Based on Potential and Dynamic Q-Learning for Mobile Robots in Unknown Environment
    Hao, Bing
    Du, He
    Zhao, Jianshuo
    Zhang, Jiamin
    Wang, Qi
    [J]. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2022, 2022
  • [3] A novel Q-Learning Algorithm Based on the Stochastic Environment Path Planning Problem
    Jian, Li
    Rong, Fei
    Yu, Tang
    [J]. 2020 IEEE 19TH INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (TRUSTCOM 2020), 2020, : 1977 - 1982
  • [4] Optimal path planning approach based on Q-learning algorithm for mobile robots
    Maoudj, Abderraouf
    Hentout, Abdelfetah
    [J]. APPLIED SOFT COMPUTING, 2020, 97
  • [5] Model based path planning using Q-Learning
    Sharma, Avinash
    Gupta, Kanika
    Kumar, Anirudha
    Sharma, Aishwarya
    Kumar, Rajesh
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT), 2017, : 837 - 842
  • [6] Path Planning Using Wasserstein Distributionally Robust Deep Q-learning
    Alpturk, Cem
    Renganathan, Venkatraman
    [J]. 2023 EUROPEAN CONTROL CONFERENCE, ECC, 2023,
  • [7] Optimal path planning method based on epsilon-greedy Q-learning algorithm
    Bulut, Vahide
    [J]. JOURNAL OF THE BRAZILIAN SOCIETY OF MECHANICAL SCIENCES AND ENGINEERING, 2022, 44 (03)
  • [8] Optimal path planning method based on epsilon-greedy Q-learning algorithm
    Vahide Bulut
    [J]. Journal of the Brazilian Society of Mechanical Sciences and Engineering, 2022, 44
  • [9] A path planning approach for unmanned surface vehicles based on dynamic and fast Q-learning
    Hao, Bing
    Du, He
    Yan, Zheping
    [J]. OCEAN ENGINEERING, 2023, 270
  • [10] Dynamic Path Planning of a Mobile Robot with Improved Q-Learning algorithm
    Li, Siding
    Xu, Xin
    Zuo, Lei
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON INFORMATION AND AUTOMATION, 2015, : 409 - 414