A primer on partially observable Markov decision processes (POMDPs)

Cited by: 16

Authors:

Chades, Iadine [1]

Pascal, Luz V. [2]

Nicol, Sam [1]

Fletcher, Cameron S. [3]

Ferrer-Mestres, Jonathan [1]

Affiliations:

[1] CSIRO, Dutton Pk, Qld, Australia

[2] ENSTA, Palaiseau, France

[3] CSIRO, Atherton, Qld, Australia

Source:

METHODS IN ECOLOGY AND EVOLUTION | 2021 / Vol. 12 / Issue 11

Keywords:

AI; decisions; partially observable Markov decision processes; stochastic dynamic programming; uncertainty; ADAPTIVE MANAGEMENT; ECOLOGICAL-SYSTEMS; TRANSITION MODELS; VALUE-ITERATION; STATE; UNCERTAINTY

DOI:

10.1111/2041-210X.13692

Chinese Library Classification (CLC) number:

Q14 [Ecology (bioecology)];

Subject classification codes:

071012 ; 0713 ;

Abstract:

Partially observable Markov decision processes (POMDPs) are a convenient mathematical model to solve sequential decision-making problems under imperfect observations. Most notably for ecologists, POMDPs have helped solve the trade-offs between investing in management or surveillance and, more recently, to optimise adaptive management problems. Despite an increasing number of applications in ecology and natural resources, POMDPs are still poorly understood. The complexity of the mathematics, the inaccessibility of POMDP solvers developed by the Artificial Intelligence (AI) community, and the lack of introductory material are likely reasons for this. We propose to bridge this gap by providing a primer on POMDPs, a typology of case studies drawn from the literature, and a repository of POMDP problems. We explain the steps required to define a POMDP when the state of the system is imperfectly detected (state uncertainty) and when the dynamics of the system are unknown (model uncertainty). We provide input files and solutions to a selected number of problems, reflect on lessons learned applying these models over the last 10 years and discuss future research required on interpretable AI. Partially observable Markov decision processes are powerful decision models that allow users to make decisions under imperfect observations over time. This primer will provide a much-needed entry point to ecologists.


Pages: 2058-2072

Page count: 15

