Reinforcement learning with expectation and action augmented states in partially observable environment

Cited by: 0
Authors
Guirnaldo, SA [1 ]
Watanabe, K [1 ]
Izumi, K [1 ]
Kiguchi, K [1 ]
Affiliation
[1] Saga Univ, Fac Engn Syst & Technol, Grad Sch Sci & Engn, Saga 8408502, Japan
Keywords
partially observable Markov decision processes; expectation; reinforcement learning; perception; perceptual aliasing
DOI
Not available
Chinese Library Classification
TP [Automation Technology; Computer Technology]
Subject Classification Code
0812
Abstract
The problem of developing good or optimal policies for partially observable Markov decision processes (POMDPs) remains one of the most alluring areas of research in artificial intelligence. Encouraged by the way we (humans) form expectations from past experiences, and by how our decisions and behaviour are shaped by those expectations, this paper proposes a reinforcement learning method called expectation and action augmented states (EAAS), aimed at discovering good or near-optimal policies in partially observable environments. The method uses the concept of expectation to distinguish between aliased states: it augments the agent's observation with its expectation of that observation. Two problems from the literature were used to test the proposed method. The results show promising characteristics of the method compared with methods currently used in this domain.
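The abstract describes EAAS only at a high level. The following is a minimal, hypothetical Python sketch of the core idea as stated there (augmenting each observation with the agent's expectation of it), assuming a discrete observation space, tabular Q-learning, and a simple deterministic one-step model as the source of expectations; the class and method names are illustrative, and none of these implementation choices come from the paper itself.

from collections import defaultdict
import random


class EAASAgent:
    # Tabular Q-learning over augmented states: each state pairs the raw
    # (possibly aliased) observation with the agent's expectation of that
    # observation, so aliased observations reached under different
    # expectations become distinct entries in the Q-table.

    def __init__(self, n_actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration rate
        self.q = defaultdict(float)  # Q[(augmented_state, action)]
        self.model = {}              # (augmented_state, action) -> last seen next observation

    def augment(self, observation, expectation):
        # The augmented state distinguishes observations that look identical
        # but arrive under different expectations.
        return (observation, expectation)

    def expect(self, aug_state, action):
        # Expectation of the next observation, taken here from a simple
        # deterministic one-step model learned online (an assumption; the
        # paper's exact expectation mechanism is not given in the abstract).
        return self.model.get((aug_state, action))

    def act(self, aug_state):
        # Epsilon-greedy action selection over the augmented state.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda a: self.q[(aug_state, a)])

    def update(self, aug_state, action, reward, next_obs, next_aug_state):
        self.model[(aug_state, action)] = next_obs  # refine the expectation model
        best_next = max(self.q[(next_aug_state, a)] for a in range(self.n_actions))
        td_error = reward + self.gamma * best_next - self.q[(aug_state, action)]
        self.q[(aug_state, action)] += self.alpha * td_error


# Interaction loop against a hypothetical environment with discrete
# observations and a Gym-like reset()/step() interface:
#
#     agent = EAASAgent(n_actions=4)
#     state = agent.augment(env.reset(), None)  # no expectation yet
#     for _ in range(10_000):
#         action = agent.act(state)
#         expectation = agent.expect(state, action)  # what the agent thinks it will see
#         obs, reward, done = env.step(action)
#         next_state = agent.augment(obs, expectation)
#         agent.update(state, action, reward, obs, next_state)
#         state = agent.augment(env.reset(), None) if done else next_state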
Pages: 823 - 828
Page count: 6
Related Papers
50 results in total
  • [21] Bayesian Nonparametric Methods for Partially-Observable Reinforcement Learning
    Doshi-Velez, Finale
    Pfau, David
    Wood, Frank
    Roy, Nicholas
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2015, 37 (02) : 394 - 407
  • [22] Sequential Generative Exploration Model for Partially Observable Reinforcement Learning
    Yin, Haiyan
    Chen, Jianda
    Pan, Sinno Jialin
    Tschiatschek, Sebastian
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 10700 - 10708
  • [23] Reinforcement learning algorithm for partially observable Markov decision processes
    Wang, Xue-Ning
    He, Han-Gen
    Xu, Xin
    Kongzhi yu Juece/Control and Decision, 2004, 19 (11): 1263 - 1266
  • [24] Disturbance Observable Reinforcement Learning that Compensates for Changes in Environment
    Kim, SeongIn
    Shibuya, Takeshi
    2022 61ST ANNUAL CONFERENCE OF THE SOCIETY OF INSTRUMENT AND CONTROL ENGINEERS (SICE), 2022: 141 - 145
  • [25] Partially Observable Hierarchical Reinforcement Learning with AI Planning (Student Abstract)
    Rozek, Brandon
    Lee, Junkyu
    Kokel, Harsha
    Katz, Michael
    Sohrabi, Shirin
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 21, 2024: 23635 - 23636
  • [26] Global Linear Convergence of Online Reinforcement Learning for Partially Observable Systems
    Hirai, Takumi
    Sadamoto, Tomonori
    2022 EUROPEAN CONTROL CONFERENCE (ECC), 2022, : 1566 - 1571
  • [27] Multi-task Reinforcement Learning in Partially Observable Stochastic Environments
    Li, Hui
    Liao, Xuejun
    Carin, Lawrence
    JOURNAL OF MACHINE LEARNING RESEARCH, 2009, 10 : 1131 - 1186
  • [28] Reinforcement Learning based on MPC/MHE for Unmodeled and Partially Observable Dynamics
    Esfahani, Hossein Nejatbakhsh
    Kordabad, Arash Bahari
    Gros, Sebastien
    2021 AMERICAN CONTROL CONFERENCE (ACC), 2021: 2121 - 2126
  • [29] Toward Generalization of Automated Temporal Abstraction to Partially Observable Reinforcement Learning
    Cilden, Erkin
    Polat, Faruk
    IEEE TRANSACTIONS ON CYBERNETICS, 2015, 45 (08) : 1414 - 1425