Partially Observable Markov Decision Processes: A Geometric Technique and Analysis

Cited by: 26
Author
Zhang, Hao [1]
Affiliation
[1] Univ So Calif, Marshall Sch Business, Los Angeles, CA 90089 USA
Keywords
MINKOWSKI ADDITION; INFINITE-HORIZON; QUALITY-CONTROL; POLICY; COMPLEXITY;
DOI
10.1287/opre.1090.0697
Chinese Library Classification
C93 [Management Science];
Discipline classification codes
12 ; 1201 ; 1202 ; 120202 ;
Abstract
This paper presents a novel framework for studying partially observable Markov decision processes (POMDPs) with finite state, action, observation sets, and discounted rewards. The new framework is solely based on future-reward vectors associated with future policies, which is more parsimonious than the traditional framework based on belief vectors. It reveals the connection between the POMDP problem and two computational geometry problems, i.e., finding the vertices of a convex hull and finding the Minkowski sum of convex polytopes, which can help solve the POMDP problem more efficiently. The new framework can clarify some existing algorithms over both finite and infinite horizons and shed new light on them. It also facilitates the comparison of POMDPs with respect to their degree of observability, as a useful structural result.
Pages: 214-228
Page count: 15
Related papers
50 in total
  • [1] Qualitative Analysis of Partially-Observable Markov Decision Processes
    Chatterjee, Krishnendu
    Doyen, Laurent
    Henzinger, Thomas A.
    [J]. MATHEMATICAL FOUNDATIONS OF COMPUTER SCIENCE 2010, 2010, 6281 : 258 - 269
  • [2] Partially Observable Markov Decision Processes and Performance Sensitivity Analysis
    Li, Yanjie
    Yin, Baoqun
    Xi, Hongsheng
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2008, 38 (06) : 1645 - 1651
  • [3] Partially Observable Markov Decision Processes and Robotics
    Kurniawati, Hanna
    [J]. ANNUAL REVIEW OF CONTROL ROBOTICS AND AUTONOMOUS SYSTEMS, 2022, 5 : 253 - 277
  • [4] A tutorial on partially observable Markov decision processes
    Littman, Michael L.
    [J]. JOURNAL OF MATHEMATICAL PSYCHOLOGY, 2009, 53 (03) : 119 - 125
  • [5] Quantum partially observable Markov decision processes
    Barry, Jennifer
    Barry, Daniel T.
    Aaronson, Scott
    [J]. PHYSICAL REVIEW A, 2014, 90 (03):
  • [6] PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES WITH PARTIALLY OBSERVABLE RANDOM DISCOUNT FACTORS
    Martinez-Garcia, E. Everardo
    Minjarez-Sosa, J. Adolfo
    Vega-Amaya, Oscar
    [J]. KYBERNETIKA, 2022, 58 (06) : 960 - 983
  • [7] Active learning in partially observable Markov decision processes
    Jaulmes, R
    Pineau, J
    Precup, D
    [J]. MACHINE LEARNING: ECML 2005, PROCEEDINGS, 2005, 3720 : 601 - 608
  • [8] Structural Estimation of Partially Observable Markov Decision Processes
    Chang, Yanling
    Garcia, Alfredo
    Wang, Zhide
    Sun, Lu
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (08) : 5135 - 5141
  • [9] Entropy Maximization for Partially Observable Markov Decision Processes
    Savas, Yagiz
    Hibbard, Michael
    Wu, Bo
    Tanaka, Takashi
    Topcu, Ufuk
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2022, 67 (12) : 6948 - 6955
  • [10] Nonapproximability results for partially observable Markov decision processes
    Lusena, C
    Goldsmith, J
    Mundhenk, M
    [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2001, 14 : 83 - 113