Partially Observable Markov Decision Processes: A Geometric Technique and Analysis

Cited: 26
Authors
Zhang, Hao [1]
Affiliations
[1] University of Southern California, Marshall School of Business, Los Angeles, CA 90089, USA
Keywords
Minkowski addition; infinite horizon; quality control; policy; complexity
DOI
10.1287/opre.1090.0697
Chinese Library Classification
C93 [Management]
Discipline codes
12; 1201; 1202; 120202
Abstract
This paper presents a novel framework for studying partially observable Markov decision processes (POMDPs) with finite state, action, and observation sets and discounted rewards. The new framework is based solely on future-reward vectors associated with future policies, which is more parsimonious than the traditional framework based on belief vectors. It reveals the connection between the POMDP problem and two computational geometry problems, namely finding the vertices of a convex hull and finding the Minkowski sum of convex polytopes, which can help solve the POMDP problem more efficiently. The new framework clarifies and sheds new light on several existing algorithms over both finite and infinite horizons. It also facilitates the comparison of POMDPs by their degree of observability, a useful structural result.
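To make the abstract's objects concrete, the sketch below shows a single dynamic-programming backup of a POMDP value function represented by alpha-vectors, i.e., the "future-reward vectors" the abstract refers to: the cross-sum over observations is a Minkowski sum of finite vector sets, and pruning discards vectors that never attain the upper surface of the value function. This is a generic textbook-style backup on an illustrative toy model, not the paper's geometric algorithm; the function names (backup, prune) and all model data are assumptions made for the example.

```python
# Minimal sketch (not the paper's algorithm): one exact value-iteration backup
# of a POMDP value function represented by alpha-vectors (future-reward
# vectors).  The cross-sum over observations is a Minkowski sum of finite
# vector sets; the prune here is a simple pointwise-dominance filter.
import itertools
import numpy as np

def backup(Gamma, P, O, R, gamma):
    """One backup of the alpha-vector set Gamma.

    Gamma : list of length-|S| arrays (current future-reward vectors)
    P     : P[a][s, s']  transition probabilities
    O     : O[a][s', o]  observation probabilities
    R     : R[a][s]      immediate rewards
    gamma : discount factor in (0, 1)
    """
    new_Gamma = []
    for a in range(len(P)):
        n_obs = O[a].shape[1]
        # One candidate vector per (observation, old alpha-vector) pair:
        # alpha_ao(s) = gamma * sum_{s'} P(s'|s,a) O(o|s',a) alpha(s').
        Gamma_ao = [[gamma * P[a] @ (O[a][:, o] * alpha) for alpha in Gamma]
                    for o in range(n_obs)]
        # Cross-sum (Minkowski sum) over observations, plus immediate reward.
        for combo in itertools.product(*Gamma_ao):
            new_Gamma.append(R[a] + sum(combo))
    return prune(new_Gamma)

def prune(vectors, tol=1e-9):
    """Keep only vectors not pointwise dominated by some other vector."""
    kept = []
    for i, v in enumerate(vectors):
        if not any(i != j and np.all(w >= v - tol) and np.any(w > v + tol)
                   for j, w in enumerate(vectors)):
            kept.append(v)
    return kept

if __name__ == "__main__":
    # Tiny illustrative model: 2 states, 2 actions, 2 observations.
    P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
         np.array([[0.5, 0.5], [0.5, 0.5]])]
    O = [np.array([[0.8, 0.2], [0.3, 0.7]]),
         np.array([[0.6, 0.4], [0.4, 0.6]])]
    R = [np.array([1.0, 0.0]), np.array([0.0, 0.5])]
    Gamma = [np.zeros(2)]          # start from the all-zero value function
    for _ in range(3):             # a few backups of exact value iteration
        Gamma = backup(Gamma, P, O, R, gamma=0.95)
    print(len(Gamma), "undominated future-reward vectors after 3 backups")
```

An exact prune would solve a small linear program per vector or, in the geometric view advocated by the abstract, extract the relevant convex-hull vertices; the pointwise-dominance filter above is only a cheap stand-in for that step.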
Pages: 214-228 (15 pages)
Related papers (50 records in total)
  • [32] Stochastic optimization of controlled partially observable Markov decision processes
    Bartlett, PL
    Baxter, J
    [J]. PROCEEDINGS OF THE 39TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5, 2000, : 124 - 129
  • [33] Active Chemical Sensing With Partially Observable Markov Decision Processes
    Gosangi, Rakesh
    Gutierrez-Osuna, Ricardo
    [J]. OLFACTION AND ELECTRONIC NOSE, PROCEEDINGS, 2009, 1137 : 562 - 565
  • [34] Learning factored representations for partially observable Markov decision processes
    Sallans, B
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 12, 2000, 12 : 1050 - 1056
  • [35] Partially Observable Risk-Sensitive Markov Decision Processes
    Baeuerle, Nicole
    Rieder, Ulrich
    [J]. MATHEMATICS OF OPERATIONS RESEARCH, 2017, 42 (04) : 1180 - 1196
  • [36] A Fast Approximation Method for Partially Observable Markov Decision Processes
    Liu Bingbing
    Kang Yu
    Jiang Xiaofeng
    Qin Jiahu
    [J]. JOURNAL OF SYSTEMS SCIENCE & COMPLEXITY, 2018, 31 (06) : 1423 - 1436
  • [37] Quasi-Deterministic Partially Observable Markov Decision Processes
    Besse, Camille
    Chaib-draa, Brahim
    [J]. NEURAL INFORMATION PROCESSING, PT 1, PROCEEDINGS, 2009, 5863 : 237 - 246
  • [38] Position Validation Strategies using Partially Observable Markov Decision Processes
    Kochenderfer, Mykel J.
    Shih, Kevin J.
    Chryssanthacopoulos, James P.
    Rose, Charles E.
    Elder, Tomas R.
    [J]. 2011 IEEE/AIAA 30TH DIGITAL AVIONICS SYSTEMS CONFERENCE (DASC), 2011,
  • [39] Human-in-the-Loop Synthesis for Partially Observable Markov Decision Processes
    Carr, Steven
    Jansen, Nils
    Wimmer, Ralf
    Fu, Jie
    Topcu, Ufuk
    [J]. 2018 ANNUAL AMERICAN CONTROL CONFERENCE (ACC), 2018, : 762 - 769
  • [40] Approximate Linear Programming for Constrained Partially Observable Markov Decision Processes
    Poupart, Pascal
    Malhotra, Aarti
    Pei, Pei
    Kim, Kee-Eung
    Goh, Bongseok
    Bowling, Michael
    [J]. PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015, : 3342 - 3348