A Partially Observable Monte Carlo Planning Algorithm Based on Path Modification

Cited by: 0
Authors
Wang, Qingya [1]
Liu, Feng [1]
Luo, Bin [1]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Software Inst, Nanjing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
POMDP; POMCP-PM; Value Updating;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Balancing exploration and exploitation has long been recognized as a central theme in online planning algorithms for POMDP problems. Exploratory actions prevent the planner from becoming trapped in suboptimal solutions, but they also slow the convergence of the planning procedure. It is therefore worthwhile to maintain exploration while taking a step toward exploitation. The action selection criterion used during planning differs from the one used during execution, which motivates building a bridge between the two criteria to accelerate convergence. This paper presents a Partially Observable Monte Carlo Planning algorithm based on Path Modification (POMCP-PM), which modifies the backtracing paths by considering the two criteria simultaneously when updating the values of parent nodes. The algorithm is general, since the Upper Confidence Bounds applied to Trees (UCT) algorithm used to select actions can easily be replaced by other criteria. Experimental results show that POMCP-PM outperforms POMCP across varying numbers of simulations in several scenarios of different scales.
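The deviation between the two action selection criteria mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard POMCP setup, in which planning selects actions with a UCB1 bonus and execution picks the action with the highest estimated value. It does not reproduce the paper's path modification rule, which the abstract does not specify. The function names, the exploration constant `c`, and the action statistics are hypothetical.

```python
import math

def uct_action(stats, c=1.0):
    """Planning-time choice: mean value plus a UCB1 exploration bonus.

    stats maps an action name to a (visit_count, mean_value) pair.
    """
    total = sum(n for n, _ in stats.values())

    def score(item):
        n, q = item[1]
        if n == 0:
            return math.inf  # untried actions are always explored first
        return q + c * math.sqrt(math.log(total) / n)

    return max(stats.items(), key=score)[0]

def greedy_action(stats):
    """Execution-time choice: highest mean value, no exploration bonus."""
    return max(stats.items(), key=lambda item: item[1][1])[0]

# Hypothetical statistics at one node of the search tree.
stats = {"listen": (50, 1.0), "open_left": (2, 0.8)}

print(uct_action(stats))     # rarely tried action wins through its bonus
print(greedy_action(stats))  # best mean value wins
```

With these numbers the two criteria disagree: the rarely visited action is preferred during planning, while the action with the best mean value would be executed. This is the kind of gap the abstract proposes to bridge when values are backed up to parent nodes.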
Pages: 14
Related papers
50 in total
  • [21] Active Semantic Localization of Mobile Robot Using Partial Observable Monte Carlo Planning
    Li, Shen
    Xiong, Rong
    Wang, Yue
    2013 10TH IEEE INTERNATIONAL CONFERENCE ON CONTROL AND AUTOMATION (ICCA), 2013, : 1409 - 1414
  • [22] UAV Path Planning in a Dynamic Environment via Partially Observable Markov Decision Process
    Ragi, Shankarachary
    Chong, Edwin K. P.
    IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS, 2013, 49 (04) : 2397 - 2412
  • [23] A Monte Carlo-based algorithm for the quickest path flow network reliability problem
    Huang, Cheng-Fu
    ANNALS OF OPERATIONS RESEARCH, 2024,
  • [24] Hybrid Monte Carlo implementation of the Fourier path integral algorithm
    Chakravarty, C
    JOURNAL OF CHEMICAL PHYSICS, 2005, 123 (02):
  • [25] A Novel Hybrid Monte Carlo Algorithm for Sampling Path Space
    Pinski, Francis J.
    ENTROPY, 2021, 23 (05)
  • [26] A Path Planning Method Based on Improved Single Player-Monte Carlo Tree Search
    Xia, Yu-Wei
    Yang, Chao
    Chen, Bing-Qiu
    IEEE ACCESS, 2020, 8 : 163694 - 163702
  • [27] Monte Carlo-based improved ant colony optimization for path planning of welding robot
    Wang, Tiancheng
    Wang, Lei
    Li, Dongdong
    Cai, Jingcao
    Wang, Yixuan
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2023, 35 (07)
  • [28] Bandit based Monte-Carlo planning
    Kocsis, Levente
    Szepesvari, Csaba
    MACHINE LEARNING: ECML 2006, PROCEEDINGS, 2006, 4212 : 282 - 293
  • [29] A radiosurgery Monte Carlo based treatment planning
    Chave, A
    Lopes, MC
    Oliveira, C
    Peralta, L
    RADIOTHERAPY AND ONCOLOGY, 2004, 73 : S374 - S374
  • [30] Optimization Algorithm Based on Monte Carlo Tree Search for Single Satellite Task Planning
    Zhao, Jie
    Liu, Ruixia
    Han, Yang
    PROCEEDINGS OF THE 2024 3RD INTERNATIONAL SYMPOSIUM ON INTELLIGENT UNMANNED SYSTEMS AND ARTIFICIAL INTELLIGENCE, SIUSAI 2024, 2024, : 264 - 267