A temporal-difference learning method using Gaussian state representation for continuous state space problems

Cited by: 0
Authors: Fujii, Natsuko; Ueno, Atsushi; Takubo, Tomohito
Source: Transactions of the Japanese Society for Artificial Intelligence, Vol. 29 (2014)
Keywords: Learning algorithms; Gaussian distribution
DOI: 10.1527/tjsai.29.157
Abstract
In this paper, we tackle the problem of reinforcement learning (RL) in a continuous state space. An appropriate discretization of the space can make many learning tasks tractable. A method using Gaussian state representation and the Rational Policy Making algorithm (RPM) has been proposed for this problem. It discretizes the space by constructing a chain of states that represents a path to the goal, exploiting the agent's past experiences of reaching it. Because this method relies heavily on successful experiences, it can find a rational solution quickly in an environment with little noise. In a noisy environment, however, it creates many unnecessary and distracting states and performs the task poorly. For learning in such an environment, we introduce the concept of the value of a state into the above method and develop a new method. The new method uses a temporal-difference (TD) learning algorithm to learn the values of states, and the value of a state is used to determine the size of the state. Thus, our method can quickly trim and eliminate unnecessary and distracting states and learn the task well even in a noisy environment. We show the effectiveness of our method through computer simulations of a path-finding task and a cart-pole swing-up task. © The Japanese Society for Artificial Intelligence 2014.
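As a rough illustration of the idea summarized above, the sketch below shows a TD(0)-style value update for Gaussian-represented states, where the learned value rescales the width (the "size") of a state so that low-value states shrink and can later be trimmed. The class and function names, the value-to-width mapping, and all parameter values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class GaussianState:
    """A discretized state represented by a Gaussian centered at mu with width sigma."""
    def __init__(self, mu, sigma, value=0.0):
        self.mu = np.asarray(mu, dtype=float)
        self.sigma = float(sigma)   # interpreted as the "size" of the state
        self.value = float(value)   # state value learned by TD

    def activation(self, x):
        """Gaussian membership of observation x in this state."""
        d2 = np.sum((np.asarray(x, dtype=float) - self.mu) ** 2)
        return np.exp(-d2 / (2.0 * self.sigma ** 2))

def td0_update(prev, curr, reward, alpha=0.1, gamma=0.95,
               sigma_min=0.05, sigma_max=1.0):
    """One TD(0) step: update the value of the previously visited state and
    rescale its width from the new value (low-value states shrink, so they
    can later be trimmed or eliminated)."""
    td_error = reward + gamma * curr.value - prev.value
    prev.value += alpha * td_error
    # Illustrative rule: map the clipped value onto [sigma_min, sigma_max].
    v = float(np.clip(prev.value, 0.0, 1.0))
    prev.sigma = sigma_min + v * (sigma_max - sigma_min)
    return td_error
```

For example, if repeated calls such as `td0_update(s_prev, s_curr, reward=0.0)` keep producing negative TD errors for `s_prev`, its width shrinks toward `sigma_min`, at which point a pruning rule could remove it from the chain of states.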