A primer on partially observable Markov decision processes (POMDPs)

Cited by: 16
Authors
Chades, Iadine [1]
Pascal, Luz V. [2]
Nicol, Sam [1]
Fletcher, Cameron S. [3]
Ferrer-Mestres, Jonathan [1]
Affiliations
[1] CSIRO, Dutton Pk, Qld, Australia
[2] ENSTA, Palaiseau, France
[3] CSIRO, Atherton, Qld, Australia
Source
METHODS IN ECOLOGY AND EVOLUTION | 2021 / Volume 12 / Issue 11
Keywords
AI; decisions; partially observable Markov decision processes; stochastic dynamic programming; uncertainty; ADAPTIVE MANAGEMENT; ECOLOGICAL-SYSTEMS; TRANSITION MODELS; VALUE-ITERATION; STATE; UNCERTAINTY
DOI
10.1111/2041-210X.13692
Chinese Library Classification number
Q14 [Ecology (bioecology)]
Discipline classification code
071012; 0713
Abstract
Partially observable Markov decision processes (POMDPs) are a convenient mathematical model to solve sequential decision-making problems under imperfect observations. Most notably for ecologists, POMDPs have helped solve the trade-offs between investing in management or surveillance and, more recently, to optimise adaptive management problems. Despite an increasing number of applications in ecology and natural resources, POMDPs are still poorly understood. The complexity of the mathematics, the inaccessibility of POMDP solvers developed by the Artificial Intelligence (AI) community, and the lack of introductory material are likely reasons for this. We propose to bridge this gap by providing a primer on POMDPs, a typology of case studies drawn from the literature, and a repository of POMDP problems. We explain the steps required to define a POMDP when the state of the system is imperfectly detected (state uncertainty) and when the dynamics of the system are unknown (model uncertainty). We provide input files and solutions to a selected number of problems, reflect on lessons learned applying these models over the last 10 years and discuss future research required on interpretable AI. Partially observable Markov decision processes are powerful decision models that allow users to make decisions under imperfect observations over time. This primer will provide a much-needed entry point to ecologists.
Pages: 2058-2072
Page count: 15