RSMDP-BASED ROBUST Q-LEARNING FOR OPTIMAL PATH PLANNING IN A DYNAMIC ENVIRONMENT

Cited by: 4
Authors
Zhang, Yunfei [1 ]
Li, Weilin [2 ]
de Silva, Clarence W. [1 ]
Institutions
[1] Univ British Columbia, Dept Mech Engn, Vancouver, BC V5Z 1M9, Canada
[2] Northwestern Polytech Univ, Dept Elect Engn, Xian, Peoples R China
Source
International Journal of Robotics and Automation
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC); Canada Foundation for Innovation;
Keywords
Online Q-learning; optimal path planning; probabilistic roadmap; Markov decision process; unknown dynamic obstacles; OBSTACLE AVOIDANCE; ROBOT;
DOI
10.2316/Journal.206.2016.4.206-4255
CLC Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812
Abstract
This paper presents a robust Q-learning method for path planning in a dynamic environment. The method consists of three steps: first, a regime-switching Markov decision process (RSMDP) is formed to represent the dynamic environment; second, a probabilistic roadmap (PRM) is constructed, integrated with the RSMDP, and stored as a graph whose nodes correspond to collision-free world states for the robot; finally, an online Q-learning method with a dynamic step size, which facilitates robust convergence of the Q-value iteration, is integrated with the PRM to determine an optimal path for reaching the goal. In this manner, the robot is able to use past experience to improve its performance in avoiding not only static obstacles but also moving obstacles, without knowing the nature of the obstacle motion. The use of regime switching in the avoidance of obstacles with unknown motion is particularly innovative. The developed approach is applied to a homecare robot in computer simulation. The results show that the online path planner with Q-learning is able to rapidly and successfully converge to the correct path.
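The third step above can be illustrated with a minimal sketch: tabular Q-learning over a small roadmap-like graph, with a step size that decays with the visit count of each state-action pair (one common "dynamic step size" schedule satisfying the Robbins-Monro convergence conditions; the paper's exact schedule, RSMDP regimes, and PRM construction are not reproduced here). The 5-node graph, rewards, and hyperparameters are illustrative assumptions.

```python
import random

# Toy roadmap: node -> list of neighbour nodes reachable by an edge.
# Node 4 is the goal (absorbing). This stands in for a PRM graph.
GRAPH = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 3],
    3: [1, 2, 4],
    4: [],
}
GOAL, GAMMA, EPISODES = 4, 0.9, 500

def train(seed=0):
    """Online Q-learning with a per-pair visit-count step size."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in GRAPH for a in GRAPH[s]}
    visits = {sa: 0 for sa in Q}
    for _ in range(EPISODES):
        s = 0
        while s != GOAL:
            # Epsilon-greedy choice over outgoing edges of the roadmap.
            if rng.random() < 0.2:
                a = rng.choice(GRAPH[s])
            else:
                a = max(GRAPH[s], key=lambda n: Q[(s, n)])
            r = 0.0 if a == GOAL else -1.0          # unit step cost
            nxt = 0.0 if a == GOAL else max(Q[(a, n)] for n in GRAPH[a])
            visits[(s, a)] += 1
            alpha = 1.0 / visits[(s, a)]            # dynamic step size
            Q[(s, a)] += alpha * (r + GAMMA * nxt - Q[(s, a)])
            s = a
    return Q

def greedy_path(Q, start=0):
    """Extract the path implied by the learned Q-values."""
    path, s = [start], start
    while s != GOAL:
        s = max(GRAPH[s], key=lambda n: Q[(s, n)])
        path.append(s)
    return path
```

With the 1/visits schedule, early updates move the estimates quickly while later updates average noise away, which is the practical motivation for a dynamic rather than constant step size. On this toy graph, `greedy_path(train())` recovers a shortest 4-node route from node 0 to the goal.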
Pages: 290-300
Page count: 11
Related Papers (50 total)
  • [1] RSMDP-based robust Q-learning for optimal path planning in a dynamic environment
    Zhang, Yunfei
    Li, Weilin
    De Silva, Clarence W.
    [J]. International Journal of Robotics and Automation, 2016, 31 (04) : 290 - 300
  • [2] A Path-Planning Approach Based on Potential and Dynamic Q-Learning for Mobile Robots in Unknown Environment
    Hao, Bing
    Du, He
    Zhao, Jianshuo
    Zhang, Jiamin
    Wang, Qi
    [J]. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2022, 2022
  • [3] A novel Q-Learning Algorithm Based on the Stochastic Environment Path Planning Problem
    Jian, Li
    Rong, Fei
    Yu, Tang
    [J]. 2020 IEEE 19TH INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (TRUSTCOM 2020), 2020, : 1977 - 1982
  • [4] Optimal path planning approach based on Q-learning algorithm for mobile robots
    Maoudj, Abderraouf
    Hentout, Abdelfetah
    [J]. APPLIED SOFT COMPUTING, 2020, 97
  • [5] Model based path planning using Q-Learning
    Sharma, Avinash
    Gupta, Kanika
    Kumar, Anirudha
    Sharma, Aishwarya
    Kumar, Rajesh
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT), 2017, : 837 - 842
  • [6] Path Planning Using Wasserstein Distributionally Robust Deep Q-learning
    Alpturk, Cem
    Renganathan, Venkatraman
    [J]. 2023 EUROPEAN CONTROL CONFERENCE, ECC, 2023,
  • [7] Optimal path planning method based on epsilon-greedy Q-learning algorithm
    Bulut, Vahide
    [J]. JOURNAL OF THE BRAZILIAN SOCIETY OF MECHANICAL SCIENCES AND ENGINEERING, 2022, 44 (03)
  • [8] Optimal path planning method based on epsilon-greedy Q-learning algorithm
    Vahide Bulut
    [J]. Journal of the Brazilian Society of Mechanical Sciences and Engineering, 2022, 44
  • [9] A path planning approach for unmanned surface vehicles based on dynamic and fast Q-learning
    Hao, Bing
    Du, He
    Yan, Zheping
    [J]. OCEAN ENGINEERING, 2023, 270
  • [10] Dynamic Path Planning of a Mobile Robot with Improved Q-Learning algorithm
    Li, Siding
    Xu, Xin
    Zuo, Lei
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON INFORMATION AND AUTOMATION, 2015, : 409 - 414