A temporal-difference learning method using Gaussian state representation for continuous state space problems

Cited by: 0
Authors:
[1] Fujii, Natsuko
[2] Ueno, Atsushi
[3] Takubo, Tomohito
Source: Transactions of the Japanese Society for Artificial Intelligence, Vol. 29, 2014
Keywords: Learning algorithms; Gaussian distribution
DOI: 10.1527/tjsai.29.157
Abstract:
In this paper, we tackle the problem of reinforcement learning (RL) in a continuous state space. An appropriate discretization of the space can make many learning tasks tractable. A method using Gaussian state representation and the Rational Policy Making algorithm (RPM) has been proposed for this problem. It discretizes the space by constructing a chain of states that represents a path to the agent's goal, exploiting past experiences of reaching it. Because the method relies heavily on successful experiences, it can find a rational solution quickly in an environment with little noise; in a noisy environment, however, it creates many unnecessary and distracting states and performs the task poorly. For learning in such an environment, we introduce the concept of the value of a state into the above method and develop a new method that uses a temporal-difference (TD) learning algorithm to learn the values of states. The value of a state is used to determine the size of the state. Thus, our method can quickly trim and eliminate unnecessary and distracting states and learn the task well even in a noisy environment. We show the effectiveness of our method through computer simulations of a path-finding task and a cart-pole swing-up task. © The Japanese Society for Artificial Intelligence 2014.
Related papers (50 in total):
  • [21] Particle swarm optimization based on temporal-difference learning for solving multi-objective optimization problems
    Desong Zhang
    Guangyu Zhu
    Computing, 2023, 105 : 1795 - 1820
  • [22] Budgeted Reinforcement Learning in Continuous State Space
    Carrara, Nicolas
    Leurent, Edouard
    Laroche, Romain
    Urvoy, Tanguy
    Maillard, Odalric-Ambrym
    Pietquin, Olivier
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [23] Adaptive Learning through Temporal Dynamics of State Representation
    Razmi, Niloufar
    Nassar, Matthew R.
    JOURNAL OF NEUROSCIENCE, 2022, 42 (12): 2524 - 2538
  • [24] A MATRIX-FREE METHOD FOR SPATIAL-TEMPORAL GAUSSIAN STATE-SPACE MODELS
    Mondal, Debashis
    Wang, Chunxiao
    STATISTICA SINICA, 2019, 29 (04) : 2205 - 2227
  • [25] A Temporal Difference GNG-based Approach for the State Space Quantization in Reinforcement Learning Environments
    Vieira, Davi C. L.
    Adeodato, Paulo J. L.
    Goncalves, Paulo M., Jr.
    2013 IEEE 25TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2013, : 561 - 568
  • [26] Reinforcement Learning Method for Continuous State Space Based on Dynamic Neural Network
    Sun, Wei
    Wang, Xuesong
    Cheng, Yuhu
    2008 7TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-23, 2008, : 750 - 754
  • [27] Online maximization of extracted energy in sea wave energy converters using temporal-difference learning
    Khaleghi, Sadegh
    Moghaddam, Reihaneh Kardehi
    Sistani, Mohammadbagher Naghibi
    PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART M-JOURNAL OF ENGINEERING FOR THE MARITIME ENVIRONMENT, 2023, 237 (03) : 565 - 578
  • [28] A globally continuous state-space representation of switched networks
    Jatskevich, J
    Wasynczuk, O
    Walters, EA
    Lucas, CE
    2000 CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, CONFERENCE PROCEEDINGS, VOLS 1 AND 2: NAVIGATING TO A NEW ERA, 2000, : 559 - 563
  • [29] Testing General Game Players Against a Simplified Boardgames Player Using Temporal-difference Learning
    Kowalski, Jakub
    Kisielewicz, Andrzej
    2015 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2015, : 1466 - 1473
  • [30] State space method for inverse spectral problems
    Alpay, D
    Gohberg, I
    SYSTEMS AND CONTROL IN THE TWENTY-FIRST CENTURY, 1997, 22 : 1 - 16