A temporal-difference learning method using Gaussian state representation for continuous state space problems

Cited by: 0
Authors / Affiliations
[1] Fujii, Natsuko
[2] Ueno, Atsushi
[3] Takubo, Tomohito
Source
Transactions of the Japanese Society for Artificial Intelligence, 2014, Vol. 29
Keywords
Learning algorithms; Gaussian distribution
DOI
10.1527/tjsai.29.157
Abstract
In this paper, we tackle the problem of reinforcement learning (RL) in a continuous state space. An appropriate discretization of the space can make many learning tasks tractable. A method using a Gaussian state representation and the Rational Policy Making algorithm (RPM) has been proposed for this problem. That method discretizes the space by constructing a chain of states representing a path to the agent's goal, exploiting past experiences of reaching it. Because it relies heavily on successful experiences, it can find a rational solution quickly in an environment with little noise; in a noisy environment, however, it creates many unnecessary and distractive states and performs the task poorly. For learning in such an environment, we introduce the concept of the value of a state into the above method and develop a new method. The new method learns the values of states with a temporal-difference (TD) learning algorithm and uses the value of a state to determine the size of that state. It can therefore trim and eliminate unnecessary and distractive states quickly and learn the task well even in a noisy environment. We show the effectiveness of our method in computer simulations of a path-finding task and a cart-pole swing-up task. © The Japanese Society for Artificial Intelligence 2014.
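To make the mechanism concrete, below is a minimal Python sketch of TD(0) value learning over a Gaussian state representation. It is an illustrative reconstruction from the abstract only, not the authors' algorithm: the names (GaussianState, nearest_state, td_update, resize_state), the width-update rule, and all constants are assumptions.

```python
import numpy as np

class GaussianState:
    """A region of continuous space represented by a Gaussian (center, width)."""
    def __init__(self, center, sigma):
        self.center = np.asarray(center, dtype=float)
        self.sigma = float(sigma)   # the "size" of the state
        self.value = 0.0            # V(s), learned by TD

    def membership(self, x):
        """Gaussian activation of observation x for this state."""
        d2 = np.sum((np.asarray(x, dtype=float) - self.center) ** 2)
        return np.exp(-d2 / (2.0 * self.sigma ** 2))

def nearest_state(states, x):
    """Map a continuous observation to the most strongly activated state."""
    return max(states, key=lambda s: s.membership(x))

def td_update(s, s_next, reward, alpha=0.1, gamma=0.95):
    """TD(0) update: V(s) += alpha * (r + gamma * V(s') - V(s))."""
    delta = reward + gamma * s_next.value - s.value
    s.value += alpha * delta
    return delta

def resize_state(s, v_ref, shrink=0.9, grow=1.05, sigma_min=1e-3):
    """Use the learned value to set the state's size (an assumed placeholder
    rule): shrink low-value, distractive states toward elimination and let
    high-value states cover more of the space."""
    if s.value < v_ref:
        s.sigma = max(s.sigma * shrink, sigma_min)
    else:
        s.sigma *= grow

# Example: two states, one observed transition with reward 1.
states = [GaussianState([0.0, 0.0], 0.5), GaussianState([1.0, 1.0], 0.5)]
s = nearest_state(states, [0.1, -0.2])
s_next = nearest_state(states, [0.9, 1.1])
td_update(s, s_next, reward=1.0)
resize_state(s, v_ref=0.0)
```

In this sketch, a state whose sigma falls to sigma_min would be a candidate for elimination, mirroring the paper's trimming of unnecessary and distractive states; the paper's actual resizing and elimination criteria are not the thresholds above.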
Related papers (50 in total)
  • [31] Learning Driver Behaviors Using A Gaussian Process Augmented State-Space Model
    Kullberg, Anton
    Skog, Isaac
    Hendeby, Gustaf
    PROCEEDINGS OF 2020 23RD INTERNATIONAL CONFERENCE ON INFORMATION FUSION (FUSION 2020), 2020, : 530 - 536
  • [32] Localized active learning of Gaussian process state space models
    Capone, Alexandre
    Noske, Gerrit
    Umlauft, Jonas
    Beckers, Thomas
    Lederer, Armin
    Hirche, Sandra
    LEARNING FOR DYNAMICS AND CONTROL, VOL 120, 2020, 120 : 490 - 499
  • [33] BEHAVIOR ACQUISITION ON A MOBILE ROBOT USING REINFORCEMENT LEARNING WITH CONTINUOUS STATE SPACE
    Arai, Tomoyuki
    Toda, Yuichiro
    Kubota, Naoyuki
    PROCEEDINGS OF 2019 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), 2019, : 458 - 461
  • [34] An optimal strategy learning for RoboCup in continuous state space
    Tao Junyuan
    Li Desheng
IEEE ICMA 2006: PROCEEDINGS OF THE 2006 IEEE INTERNATIONAL CONFERENCE ON MECHATRONICS AND AUTOMATION, VOLS 1-3, 2006, : 301 - +
  • [35] Continuous valued Q-learning method able to incrementally refine state space
    Takeda, M
    Nakamura, T
    Ogasawara, T
IROS 2001: PROCEEDINGS OF THE 2001 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-4: EXPANDING THE SOCIETAL ROLE OF ROBOTICS IN THE NEXT MILLENNIUM, 2001, : 265 - 271
  • [36] Robust Reinforcement Learning Technique with Bigeminal Representation of Continuous State Space for Multi-Robot Systems
    Yasuda, Toshiyuki
    Kage, Koki
    Ohkura, Kazuhiro
    2012 PROCEEDINGS OF SICE ANNUAL CONFERENCE (SICE), 2012, : 1552 - 1557
  • [37] Equations of State for Simple Liquids from the Gaussian Equivalent Representation Method
    Bolmatov, Dima
    JOURNAL OF STATISTICAL PHYSICS, 2009, 137 (04) : 765 - 773
  • [39] A state space compression method based on multivariate analysis for reinforcement learning in high-dimensional continuous state spaces
    Satoh, Hideki
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2006, E89A (08): : 2181 - 2191
  • [40] Policy Evaluation and Temporal-Difference Learning in Continuous Time and Space: A Martingale Approach
    Jia, Yanwei
    Zhou, Xun Yu
    Journal of Machine Learning Research, 2022, 23