Active learning in partially observable Markov decision processes

Cited by: 0
Authors
Jaulmes, R [1]
Pineau, J [1]
Precup, D [1]
Affiliations
[1] McGill Univ, Sch Comp Sci, Montreal, PQ H3A 2A7, Canada
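Source
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract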
This paper examines the problem of finding an optimal policy for a Partially Observable Markov Decision Process (POMDP) when the model is not known or is only poorly specified. We propose two approaches to this problem. The first relies on a model of the uncertainty that is added directly into the POMDP planning problem. This has theoretical guarantees, but is impractical when many of the parameters are uncertain. The second, called MEDUSA, incrementally improves the POMDP model using selected queries, while still optimizing reward. Results show good performance of the algorithm even in large problems: the most useful parameters of the model are learned quickly and the agent still accumulates high reward throughout the process.
Pages: 601 - 608
Page count: 8
Related papers
50 in total
  • [41] Equivalence Relations in Fully and Partially Observable Markov Decision Processes
    Castro, Pablo Samuel
    Panangaden, Prakash
    Precup, Doina
    [J]. 21ST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-09), PROCEEDINGS, 2009, : 1653 - 1658
  • [42] A Fast Approximation Method for Partially Observable Markov Decision Processes
    LIU Bingbing
    KANG Yu
    JIANG Xiaofeng
    QIN Jiahu
    [J]. Journal of Systems Science & Complexity, 2018, 31 (06) : 1423 - 1436
  • [43] Stochastic optimization of controlled partially observable Markov decision processes
    Bartlett, PL
    Baxter, J
    [J]. PROCEEDINGS OF THE 39TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5, 2000, : 124 - 129
  • [44] Partially Observable Markov Decision Processes and Performance Sensitivity Analysis
    Li, Yanjie
    Yin, Baoqun
    Xi, Hongsheng
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2008, 38 (06) : 1645 - 1651
  • [45] Partially Observable Markov Decision Processes: A Geometric Technique and Analysis
    Zhang, Hao
    [J]. OPERATIONS RESEARCH, 2010, 58 (01) : 214 - 228
  • [46] Partially Observable Risk-Sensitive Markov Decision Processes
    Baeuerle, Nicole
    Rieder, Ulrich
    [J]. MATHEMATICS OF OPERATIONS RESEARCH, 2017, 42 (04) : 1180 - 1196
  • [47] A Fast Approximation Method for Partially Observable Markov Decision Processes
    Liu Bingbing
    Kang Yu
    Jiang Xiaofeng
    Qin Jiahu
    [J]. JOURNAL OF SYSTEMS SCIENCE & COMPLEXITY, 2018, 31 (06) : 1423 - 1436
  • [48] Quasi-Deterministic Partially Observable Markov Decision Processes
    Besse, Camille
    Chaib-draa, Brahim
    [J]. NEURAL INFORMATION PROCESSING, PT 1, PROCEEDINGS, 2009, 5863 : 237 - 246
  • [49] Learning Partially Observable Markov Decision Model with EM Algorithm
    Tan, Hui
    Ma, Shaohui
    [J]. 2013 7TH INTERNATIONAL CONFERENCE ON APPLICATION OF INFORMATION AND COMMUNICATION TECHNOLOGIES (AICT), 2013, : 54 - 57
  • [50] Mixed reinforcement learning for partially observable Markov decision process
    Dung, Le Tien
    Komeda, Takashi
    Takagi, Motoki
    [J]. 2007 INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE IN ROBOTICS AND AUTOMATION, 2007, : 436 - +