A Partially Observable Monte Carlo Planning Algorithm Based on Path Modification

Times cited: 0
Authors
Wang, Qingya [1]
Liu, Feng [1]
Luo, Bin [1]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Software Inst, Nanjing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
POMDP; POMCP-PM; Value Updating;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Balancing exploration and exploitation has long been recognized as an important theme in online planning algorithms for POMDP problems. On the one hand, exploratory actions keep the planner from getting stuck in a suboptimal solution; on the other hand, they slow the convergence of the planning procedure. It is therefore worthwhile to maintain exploration while also taking a step towards exploitation. The action selection criterion used in the planning procedure differs from the one used in the execution procedure, which motivates building a bridge between the two criteria to accelerate convergence. This paper presents a Partially Observable Monte Carlo Planning algorithm based on Path Modification (POMCP-PM), which modifies the backtracing paths by considering both criteria simultaneously when updating the values of parent nodes. The algorithm is general, since the Upper Confidence Bounds applied to Trees (UCT) algorithm used to select actions can easily be replaced by other criteria. Experimental results demonstrate that POMCP-PM outperforms POMCP across varying numbers of simulations on several scenarios of different scales.
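The gap between the two selection criteria mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the POMCP-PM update rule, which the abstract does not specify. It only assumes the standard UCB1 rule for planning-time selection and a greedy rule for execution-time selection; the function names, the action names, and the value and visit-count numbers are hypothetical.

```python
import math

def ucb1_action(q, n, c=1.0):
    """Planning-time choice (standard UCB1): argmax of Q + c * sqrt(ln N / n).

    q maps action -> estimated value, n maps action -> visit count.
    Untried actions are selected first.
    """
    total = sum(n.values())
    best, best_score = None, -math.inf
    for a in q:
        if n[a] == 0:
            return a
        score = q[a] + c * math.sqrt(math.log(total) / n[a])
        if score > best_score:
            best, best_score = a, score
    return best

def greedy_action(q):
    """Execution-time choice: the action with the highest estimated value."""
    return max(q, key=q.get)

# Hypothetical statistics at one node of the search tree.
q = {"listen": 1.0, "open_left": 0.8}
n = {"listen": 100, "open_left": 2}

planning_choice = ucb1_action(q, n)   # exploration bonus favours the rarely tried action
execution_choice = greedy_action(q)   # ignores visit counts entirely
```

With these numbers the two criteria disagree: the exploration bonus makes UCB1 pick the rarely visited action, while the greedy rule picks the action with the higher estimate. Setting `c=0.0` removes the bonus and makes both rules agree.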
Pages: 14
Related papers
  • [11] An algorithm to create model file for Partially Observable Markov Decision Process for mobile robot path planning
    Deshpande, Shripad, V
    Harikrishnan, R.
    Sampe, Jahariah
    Patwa, Abhimanyu
    METHODSX, 2024, 12
  • [12] Reducing the Computational Cost of a Monte Carlo Based Planning Algorithm
    Wang, Hao
    Julier, Simon J.
    2013 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2013), 2013, : 3663 - 3668
  • [14] A self-learning Monte Carlo tree search algorithm for robot path planning
    Li, Wei
    Liu, Yi
    Ma, Yan
    Xu, Kang
    Qiu, Jiang
    Gan, Zhongxue
    FRONTIERS IN NEUROROBOTICS, 2023, 17
  • [15] Mobile sensor patrol path planning in partially observable border regions
    Pawgasame, Wichai
    Wipusitwarakun, Komwut
    APPLIED INTELLIGENCE, 2021, 51 (08) : 5453 - 5473
  • [16] Decentralized Monte Carlo Tree Search for Partially Observable Multi-Agent Pathfinding
    Skrynnik, Alexey
    Andreychuk, Anton
    Yakovlev, Konstantin
    Panov, Aleksandr
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 17531 - 17540
  • [17] Research on Surface Target Strike Process Planning Based on Monte Carlo Algorithm
    Liu, Wei
    Lv, Mingwei
    Zhang, Shaoqing
    Song, Yike
    Hu, Jinwen
    2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021, : 3662 - 3666
  • [18] An efficient Monte Carlo algorithm in probabilistic operational planning
    Hermans, A.
    Dogan, G.
    Labeau, P. E.
    Bastiaensen, C.
    2018 IEEE INTERNATIONAL ENERGY CONFERENCE (ENERGYCON), 2018,
  • [19] A Monte Carlo EM approach for partially observable diffusion processes: Theory and applications to neural networks
    Movellan, JR
    Mineiro, P
    Williams, RJ
    NEURAL COMPUTATION, 2002, 14 (07) : 1507 - 1544
  • [20] Reinforcement Learning in Partially Observable Multiagent Settings: Monte Carlo Exploring Policies with PAC Bounds
    Ceren, Roi
    Doshi, Prashant
    Banerjee, Bikramjit
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 530 - 538