A temporal-difference learning method using Gaussian state representation for continuous state space problems

Cited by: 0
Authors: Fujii, Natsuko; Ueno, Atsushi; Takubo, Tomohito
Source: Transactions of the Japanese Society for Artificial Intelligence, Vol. 29 (2014)
Keywords: Learning algorithms; Gaussian distribution
DOI: 10.1527/tjsai.29.157
Abstract
In this paper, we tackle the problem of reinforcement learning (RL) in a continuous state space. An appropriate discretization of the space can make many learning tasks tractable. A method using Gaussian state representation and the Rational Policy Making algorithm (RPM) has been proposed for this problem. It discretizes the space by constructing a chain of states that represents a path to the goal, exploiting the agent's past experiences of reaching it. Because this method relies heavily on successful experiences, it can find a rational solution quickly in an environment with little noise. In a noisy environment, however, it creates many unnecessary and distracting states and performs the task poorly. For learning in such an environment, we introduce the concept of the value of a state into the above method and develop a new method. The new method uses a temporal-difference (TD) learning algorithm to learn the values of states, and the value of a state is used to determine the size of the state. Thus, our method can quickly trim and eliminate unnecessary and distracting states and learn the task well even in a noisy environment. We show the effectiveness of our method through computer simulations of a path-finding task and a cart-pole swing-up task. © The Japanese Society for Artificial Intelligence 2014.
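As a rough illustration of the idea summarized above, the sketch below shows a TD(0)-style value update for Gaussian-represented states, where the learned value rescales the width (the "size") of a state so that low-value states shrink and can later be trimmed. The class and function names, the value-to-width mapping, and all parameter values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class GaussianState:
    """A discretized state represented by a Gaussian centered at mu with width sigma."""
    def __init__(self, mu, sigma, value=0.0):
        self.mu = np.asarray(mu, dtype=float)
        self.sigma = float(sigma)   # interpreted as the "size" of the state
        self.value = float(value)   # state value learned by TD

    def activation(self, x):
        """Gaussian membership of observation x in this state."""
        d2 = np.sum((np.asarray(x, dtype=float) - self.mu) ** 2)
        return np.exp(-d2 / (2.0 * self.sigma ** 2))

def td0_update(prev, curr, reward, alpha=0.1, gamma=0.95,
               sigma_min=0.05, sigma_max=1.0):
    """One TD(0) step: update the value of the previously visited state and
    rescale its width from the new value (low-value states shrink, so they
    can later be trimmed or eliminated)."""
    td_error = reward + gamma * curr.value - prev.value
    prev.value += alpha * td_error
    # Illustrative rule: map the clipped value onto [sigma_min, sigma_max].
    v = float(np.clip(prev.value, 0.0, 1.0))
    prev.sigma = sigma_min + v * (sigma_max - sigma_min)
    return td_error
```

For example, if repeated calls such as `td0_update(s_prev, s_curr, reward=0.0)` keep producing negative TD errors for `s_prev`, its width shrinks toward `sigma_min`, at which point a pruning rule could remove it from the chain of states.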