Learning Factored Markov Decision Processes with Unawareness

Cited by: 0
Authors
Innes, Craig [1 ]
Lascarides, Alex [1 ]
Affiliation
[1] Univ Edinburgh, Sch Informat, Edinburgh EH8 9AB, Midlothian, Scotland
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Methods for learning and planning in sequential decision problems often assume the learner is aware of all possible states and actions in advance. This assumption is sometimes untenable. In this paper, we give a method to learn factored Markov decision problems from both domain exploration and expert assistance, which guarantees convergence to near-optimal behaviour, even when the agent begins unaware of factors critical to success. Our experiments show our agent learns optimal behaviour on small and large problems, and that conserving information on discovering new possibilities results in faster convergence.
Pages: 123-133
Page count: 11
Related papers
50 in total
  • [1] Learning Factored Markov Decision Processes with Unawareness
    Innes, Craig
    Lascarides, Alex
    [J]. AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 2030 - 2032
  • [2] Learning factored representations for partially observable Markov decision processes
    Sallans, B
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 12, 2000, 12 : 1050 - 1056
  • [3] Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes
    Tian, Yi
    Qian, Jian
    Sra, Suvrit
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [4] Finding good stochastic factored policies for factored Markov decision processes
    Radoszycki, Julia
    Peyrard, Nathalie
    Sabbadin, Regis
    [J]. 21ST EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2014), 2014, 263 : 1083 - 1084
  • [5] Symbolic heuristic search for factored Markov decision processes
    Feng, ZZ
    Hansen, EA
    [J]. EIGHTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-02)/FOURTEENTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE (IAAI-02), PROCEEDINGS, 2002, : 455 - 460
  • [6] On Querying for Safe Optimality in Factored Markov Decision Processes
    Zhang, Shun
    Durfee, Edmund H.
    Singh, Satinder
    [J]. PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS (AAMAS' 18), 2018, : 2168 - 2170
  • [7] An Analytic Characterization of Model Minimization in Factored Markov Decision Processes
    Guo, Wenyuan
    Leong, Tze-Yun
    [J]. PROCEEDINGS OF THE TWENTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-10), 2010, : 1077 - 1082
  • [8] Representing and Solving Factored Markov Decision Processes with Imprecise Probabilities
    Delgado, Karina Valdivia
    de Barros, Leliane Nunes
    Cozman, Fabio Gagliardi
    Shirota, Ricardo
    [J]. ISIPTA '09: PROCEEDINGS OF THE SIXTH INTERNATIONAL SYMPOSIUM ON IMPRECISE PROBABILITY: THEORIES AND APPLICATIONS, 2009, : 169 - +
  • [9] Reconfigurable Digital Channelizer Design Using Factored Markov Decision Processes
    Adrian Sapio
    Lin Li
    Jiahao Wu
    Marilyn Wolf
    Shuvra S. Bhattacharyya
    [J]. Journal of Signal Processing Systems, 2018, 90 : 1329 - 1343
  • [10] On Confident Policy Evaluation for Factored Markov Decision Processes with Node Dropouts
    Fiscko, Carmel
    Kar, Soummya
    Sinopoli, Bruno
    [J]. 2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC), 2022, : 2857 - 2863