Active learning in partially observable Markov decision processes

Cited: 0
Authors
Jaulmes, R [1 ]
Pineau, J [1 ]
Precup, D [1 ]
Affiliation
[1] McGill Univ, Sch Comp Sci, Montreal, PQ H3A 2A7, Canada
DOI: not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
This paper examines the problem of finding an optimal policy for a Partially Observable Markov Decision Process (POMDP) when the model is unknown or only poorly specified. We propose two approaches to this problem. The first adds a model of the uncertainty directly into the POMDP planning problem; it has theoretical guarantees but becomes impractical when many parameters are uncertain. The second, called MEDUSA, incrementally improves the POMDP model through selected queries while still optimizing reward. Results show that the algorithm performs well even on large problems: the most useful model parameters are learned quickly, and the agent accumulates high reward throughout the learning process.
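To make the query-based idea in the abstract concrete, here is a minimal sketch (not the authors' implementation) of active model learning on a toy two-state POMDP: the learner keeps Dirichlet counts over the unknown transition probabilities, acts under a model sampled from that posterior, and spends a small query budget to reveal hidden-state transitions and update the counts. The toy dynamics, the one-step greedy stand-in for a real POMDP planner, the assumption that the observation model is known, and the count-threshold query trigger are all illustrative assumptions rather than details taken from the paper.

# Sketch of query-based POMDP model learning on a toy problem (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

N_S, N_A, N_Z = 2, 2, 2  # states, actions, observations

# Ground-truth dynamics, hidden from the learner (illustrative numbers).
T_true = np.array([[[0.9, 0.1], [0.2, 0.8]],   # T_true[a, s, s'] = P(s' | s, a)
                   [[0.5, 0.5], [0.6, 0.4]]])
O_true = np.array([[0.8, 0.2],                 # O_true[s', z] = P(z | s')
                   [0.3, 0.7]])
R = np.array([[1.0, 0.0],                      # R[s, a] = immediate reward
              [0.0, 1.0]])

# Dirichlet counts over the unknown transition model (uniform prior).
alpha_T = np.ones((N_A, N_S, N_S))

def sample_model(alpha):
    """Draw one transition model from the current Dirichlet posterior."""
    return np.array([[rng.dirichlet(alpha[a, s]) for s in range(N_S)]
                     for a in range(N_A)])

def env_step(s, a):
    """True environment step; the learner only sees the observation and reward."""
    s_next = rng.choice(N_S, p=T_true[a, s])
    z = rng.choice(N_Z, p=O_true[s_next])
    return s_next, z, R[s, a]

belief = np.full(N_S, 1.0 / N_S)
s = rng.integers(N_S)
total_reward, query_budget = 0.0, 20

for t in range(500):
    T_hat = sample_model(alpha_T)        # model sampled from the posterior
    a = int(np.argmax(belief @ R))       # one-step greedy stand-in for a planner
    s_prev = s
    s, z, r = env_step(s, a)
    total_reward += r

    # Active-learning query: when the Dirichlet row for the visited (s, a) pair is
    # still weak, pay for an oracle query that reveals the hidden states at both
    # steps (an illustrative simplification) and update the counts.
    if query_budget > 0 and alpha_T[a, s_prev].sum() < 10:
        alpha_T[a, s_prev, s] += 1.0
        query_budget -= 1

    # Belief update under the sampled transition model (observation model assumed known).
    belief = O_true[:, z] * (T_hat[a].T @ belief)
    belief /= belief.sum()

print("return:", total_reward)
print("learned P(. | s=0, a=0):", alpha_T[0, 0] / alpha_T[0, 0].sum(), "true:", T_true[0, 0])

In the actual algorithm, a POMDP solver would be run on the sampled models and queries would be chosen by a more principled uncertainty criterion; the fixed count threshold above only stands in for such a criterion.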
Pages: 601-608 (8 pages)
Related papers (50 in total; entries [21]-[30] shown)
  • [21] Ermis, Melike; Park, Mingyu; Yang, Insoon. On Anderson Acceleration for Partially Observable Markov Decision Processes. 2021 60th IEEE Conference on Decision and Control (CDC), 2021: 4478-4485.
  • [22] Lauri, Mikko; Hsu, David; Pajarinen, Joni. Partially Observable Markov Decision Processes in Robotics: A Survey. IEEE Transactions on Robotics, 2023, 39(1): 21-40.
  • [23] Chades, Iadine; Pascal, Luz V.; Nicol, Sam; Fletcher, Cameron S.; Ferrer-Mestres, Jonathan. A primer on partially observable Markov decision processes (POMDPs). Methods in Ecology and Evolution, 2021, 12(11): 2058-2072.
  • [24] Bertrand, Nathalie; Genest, Blaise. Minimal Disclosure in Partially Observable Markov Decision Processes. IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2011), 2011, 13: 411-422.
  • [25] Itoh, Hideaki; Nakamura, Kiyohiko. Partially observable Markov decision processes with imprecise parameters. Artificial Intelligence, 2007, 171(8-9): 453-490.
  • [26] Lusena, Cristopher; Goldsmith, Judy; Mundhenk, Martin. Nonapproximability results for partially observable Markov decision processes. Morgan Kaufmann Publishers, (14).
  • [27] Wu, Bo; Ahmadi, Mohamadreza; Bharadwaj, Suda; Topcu, Ufuk. Cost-Bounded Active Classification Using Partially Observable Markov Decision Processes. 2019 American Control Conference (ACC), 2019: 1216-1223.
  • [28] Li, Weiyu; Denton, Brian T.; Morgan, Todd M. Optimizing active surveillance for prostate cancer using partially observable Markov decision processes. European Journal of Operational Research, 2023, 305(1): 386-399.
  • [29] Goulionis, John E.; Stengos, Dimitrios I. The Partially Observable Markov Decision Processes Framework in Medical Decision Making. Advances and Applications in Statistics, 2008, 9(2): 205-232.
  • [30] Huang, Kejun; Yang, Zhuoran; Wang, Zhaoran; Hong, Mingyi. Learning Partially Observable Markov Decision Processes Using Coupled Canonical Polyadic Decomposition. 2019 IEEE Data Science Workshop (DSW), 2019: 295-299.