A Partially Observable Monte Carlo Planning Algorithm Based on Path Modification

Cited by: 0
Authors
Wang, Qingya [1]
Liu, Feng [1]
Luo, Bin [1]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Software Inst, Nanjing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
POMDP; POMCP-PM; Value Updating;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Balancing exploration and exploitation has long been recognized as an important theme in online planning algorithms for POMDP problems. On the one hand, exploratory actions keep the planner from getting trapped in suboptimal solutions; on the other hand, they slow the convergence of the planning procedure. It is therefore worthwhile to preserve exploration while taking a step toward exploitation. The action-selection criterion used during planning deviates from the one used during execution, which motivates building a bridge between the two criteria to accelerate convergence. This paper presents a Partially Observable Monte Carlo Planning algorithm based on Path Modification (POMCP-PM), which modifies the backtracing paths by considering both criteria simultaneously when updating the values of parent nodes. The algorithm is general, since the Upper Confidence Bounds applied to Trees (UCT) algorithm used to select actions can easily be replaced by other criteria. Experimental results show that POMCP-PM outperforms POMCP across varying numbers of simulations on several scenarios of different scales.
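The deviation between the two selection criteria mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard UCB1 rule for planning-time selection and a purely greedy rule for execution-time selection; the function names, the exploration constant `c`, and the action statistics are hypothetical and chosen only for illustration. It does not reproduce the POMCP-PM update itself, which the abstract does not specify.

```python
import math

def ucb1_action(stats, c=1.0):
    """Planning-time criterion (assumed UCB1): mean value plus exploration bonus.

    stats maps each action to a (visit_count, mean_value) pair.
    """
    total = sum(n for n, _ in stats.values())

    def score(item):
        _, (n, q) = item
        if n == 0:
            return math.inf  # untried actions are selected first
        return q + c * math.sqrt(math.log(total) / n)

    return max(stats.items(), key=score)[0]

def greedy_action(stats):
    """Execution-time criterion (assumed greedy): highest mean value only."""
    return max(stats.items(), key=lambda item: item[1][1])[0]

# Hypothetical statistics at one node: (visit_count, mean_value) per action.
stats = {"listen": (90, 1.0), "open_left": (10, 0.8)}

planning_choice = ucb1_action(stats, c=1.0)  # favours the rarely tried action
execution_choice = greedy_action(stats)      # favours the best current estimate
```

With these made-up numbers the two criteria disagree at the same node: the exploration bonus steers planning toward the less-visited action, while execution would pick the action with the higher estimated value. Setting `c=0.0` removes the bonus and makes the two choices coincide.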
Pages: 14