Reinforcement learning with augmented states in partially expectation and action observable environment

Cited by: 0
Authors
Guirnaldo, SA [1 ]
Watanabe, K [1 ]
Izumi, K [1 ]
Kiguchi, K [1 ]
Affiliations
[1] Saga Univ, Fac Engn Syst & Technol, Grad Sch Sci & Engn, Saga 8408502, Japan
Keywords
partially observable Markov decision processes; expectation; reinforcement learning; perception; perceptual aliasing;
DOI
Not available
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
The problem of developing good or optimal policies for partially observable Markov decision processes (POMDPs) remains one of the most alluring areas of research in artificial intelligence. Encouraged by the way we (humans) form expectations from past experiences and by how our decisions and behaviour are shaped by those expectations, this paper proposes a method called expectation and action augmented states (EAAS) for reinforcement learning, aimed at discovering good or near-optimal policies in partially observable environments. The method uses the concept of expectation to distinguish between aliased states: it augments the agent's observation with its expectation of that observation. Two problems from the literature were used to test the proposed method. The results show promising characteristics of the method compared to some methods currently used in this domain.
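The core idea of augmenting aliased observations with a learned expectation can be sketched in a few lines. The toy corridor task, the variable names, and the one-step expectation model below are illustrative assumptions, not details from the paper; they only demonstrate how pairing an observation with the agent's prediction of it lets tabular Q-learning separate states that look identical.

```python
import random
from collections import defaultdict

# Corridor of 4 cells; cells 1 and 2 both emit "wall" (perceptually aliased).
OBS = ['start', 'wall', 'wall', 'goal']
ACTIONS = [-1, +1]  # move left / right

def step(state, action):
    nxt = max(0, min(3, state + action))
    return nxt, OBS[nxt], (1.0 if nxt == 3 else 0.0), nxt == 3

def run(episodes=500, alpha=0.2, gamma=0.95, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)                 # keyed by (augmented_state, action)
    expect = defaultdict(lambda: None)     # (obs, action) -> predicted next obs

    for _ in range(episodes):
        state, obs = 0, OBS[0]
        aug = (obs, None)                  # augmented state: obs + expectation
        for _ in range(20):
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q[(aug, x)])
            nxt_state, nxt_obs, r, done = step(state, a)
            # Augment the raw observation with the expectation formed
            # from the previous observation and action.
            nxt_aug = (nxt_obs, expect[(obs, a)])
            best = max(Q[(nxt_aug, x)] for x in ACTIONS)
            Q[(aug, a)] += alpha * (r + gamma * best * (not done) - Q[(aug, a)])
            expect[(obs, a)] = nxt_obs     # update the one-step expectation model
            state, obs, aug = nxt_state, nxt_obs, nxt_aug
            if done:
                break
    return Q, expect

Q, expect = run()
```

A plain Q-learner keyed on the raw observation alone would merge cells 1 and 2 into one "wall" state; the `(observation, expectation)` pair gives those cells distinct keys whenever the preceding context differs, which is the distinction the EAAS idea exploits.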
Pages
823 - 828
Number of pages: 6