A state space filter for reinforcement learning in POMDPs - Application to a continuous state space -

Cited: 0
Authors
Nagayoshi, Masato [1 ,2 ]
Murao, Hajime [3 ]
Tamaki, Hisashi [1 ]
Affiliations
[1] Kobe Univ, Grad Sch Sci & Technol, Nada Ku, Kobe, Hyogo 6578501, Japan
[2] Hyogo Assistive Technol Res & Design Inst, Kobe 6512181, Japan
[3] Kobe Univ, Fac Cross Cultural Studies, Kobe 6578501, Japan
Keywords
reinforcement learning; state space design; POMDPs; state space filtering; continuous state space; entropy;
DOI
None
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812 ;
Abstract
This paper presents a technique for handling both discrete and continuous state spaces in reinforcement learning for POMDPs while keeping the agent's state space compact. First, our computational model for MDP environments, in which the concept of "state space filtering" was introduced to appropriately reduce the agent's state space by referring to an "entropy" calculated from the state-action mapping, is extended to POMDP environments by adding a mechanism for effectively utilizing history information. This extension also makes it possible to handle continuous state spaces in addition to discrete ones. A mechanism for adjusting the amount of history information is likewise introduced so that the agent's state space remains compact. Finally, computational experiments on a robot navigation problem with a continuous state space were carried out, and they confirm the potential and effectiveness of the extended approach.
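As a rough illustration of the entropy idea the abstract refers to (this is not the authors' implementation; the softmax policy, the normalized-entropy score, and the filtering threshold are all assumptions for the sketch), each discretized state can be scored by the entropy of its learned action distribution: a near-uniform, high-entropy distribution suggests the state contributes little to action selection and is a candidate for merging, while a low-entropy state should stay distinct.

```python
import math

def action_entropy(q_values, temperature=1.0):
    """Entropy of the softmax action distribution derived from one state's Q-values."""
    exps = [math.exp(q / temperature) for q in q_values]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def filter_states(q_table, threshold=0.9):
    """Keep only states whose action distribution is informative.

    The entropy is normalized by its maximum, log(|A|), so the
    threshold is a fraction in [0, 1] independent of the action count.
    """
    kept = {}
    for state, q_values in q_table.items():
        h = action_entropy(q_values)
        h_max = math.log(len(q_values))
        if h / h_max < threshold:  # decisive state: keep it distinct
            kept[state] = q_values
    return kept

# Example: one decisive state and one uninformative state.
q_table = {
    "s0": [5.0, 0.0, 0.0],   # strong preference for one action -> low entropy
    "s1": [1.0, 1.0, 1.0],   # uniform preferences -> maximal entropy
}
compact = filter_states(q_table, threshold=0.9)  # "s1" is filtered out
```

In the paper's POMDP setting the state would additionally encode a (length-adjusted) window of history, but the filtering criterion itself works the same way.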
Pages: 3098 / +
Page count: 2
Related Papers
50 records total
  • [1] Budgeted Reinforcement Learning in Continuous State Space
    Carrara, Nicolas
    Leurent, Edouard
    Laroche, Romain
    Urvoy, Tanguy
    Maillard, Odalric-Ambrym
    Pietquin, Olivier
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [2] Tree based discretization for continuous state space reinforcement learning
    Uther, WTB
    Veloso, MM
    [J]. FIFTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-98) AND TENTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICAL INTELLIGENCE (IAAI-98) - PROCEEDINGS, 1998, : 769 - 774
  • [3] Inverse Reinforcement Learning in a Continuous State Space with Formal Guarantees
    Dexter, Gregory
    Bello, Kevin
    Honorio, Jean
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [4] On the Convergence of Reinforcement Learning in Nonlinear Continuous State Space Problems
    Goyal, Raman
    Chakravorty, Suman
    Wang, Ran
    Mohamed, Mohamed Naveed Gul
    [J]. 2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2021, : 2969 - 2975
  • [5] Application of Reinforcement Learning with Continuous State Space to Ramp Metering in Real-world Conditions
    Rezaee, Kasra
    Abdulhai, Baher
    Abdelgawad, Hossam
    [J]. 2012 15TH INTERNATIONAL IEEE CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC), 2012, : 1590 - 1595
  • [6] Adaptive state space formation in reinforcement learning
    Samejima, K
    Omori, T
    [J]. ICONIP'98: THE FIFTH INTERNATIONAL CONFERENCE ON NEURAL INFORMATION PROCESSING JOINTLY WITH JNNS'98: THE 1998 ANNUAL CONFERENCE OF THE JAPANESE NEURAL NETWORK SOCIETY - PROCEEDINGS, VOLS 1-3, 1998, : 251 - 255
  • [7] A reinforcement learning accelerated by state space reduction
    Senda, K
    Mano, S
    Fujii, S
    [J]. SICE 2003 ANNUAL CONFERENCE, VOLS 1-3, 2003, : 1992 - 1997
  • [8] Adaptive state space partitioning for reinforcement learning
    Lee, ISK
    Lau, HYK
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2004, 17 (06) : 577 - 588
  • [9] BEHAVIOR ACQUISITION ON A MOBILE ROBOT USING REINFORCEMENT LEARNING WITH CONTINUOUS STATE SPACE
    Arai, Tomoyuki
    Toda, Yuichiro
    Kubota, Naoyuki
    [J]. PROCEEDINGS OF 2019 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), 2019, : 458 - 461
  • [10] Reinforcement Learning Method for Continuous State Space Based on Dynamic Neural Network
    Sun, Wei
    Wang, Xuesong
    Cheng, Yuhu
    [J]. 2008 7TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-23, 2008, : 750 - 754