Competitive reinforcement learning in continuous control tasks

Cited by: 0
Authors
Abramson, M [1 ]
Pachowicz, P [1 ]
Wechsler, H [1 ]
Affiliation
[1] George Mason Univ, Fairfax, VA 22030 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes a novel hybrid reinforcement learning algorithm, Sarsa Learning Vector Quantization (SLVQ), that leaves the reinforcement part intact but employs a more effective representation of the policy function: a piecewise constant function based upon "policy prototypes." The prototypes correspond to the pattern classes induced by the Voronoi tessellation generated by self-organizing methods such as Learning Vector Quantization (LVQ). The determination of the optimal policy function can now be viewed as a pattern recognition problem, in the sense that the assignment of an action to a point in the phase space is similar to the assignment of a pattern class to a point in the phase space. The distributed LVQ representation of the policy function automatically generates a piecewise constant tessellation of the state space and yields a major simplification of the learning task relative to standard reinforcement learning algorithms, for which a discontinuous table lookup function has to be learned. The feasibility and comparative advantages of the new algorithm are shown on the cart-centering and mountain-car problems, two control problems of increasing difficulty.
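To make the representation concrete, here is a minimal Python sketch of the SLVQ idea as the abstract describes it: each prototype is a point in state space with one Q-value per discrete action, the nearest prototype's Voronoi cell determines the (piecewise constant) policy, and the Sarsa TD update is applied to the winning prototype. The class name, hyperparameters, and the exact prototype-update rule are illustrative assumptions, not the authors' published algorithm.

```python
import numpy as np

class SLVQ:
    """Illustrative Sarsa + LVQ sketch (details are assumptions, not the
    paper's exact rules). Prototypes tessellate the state space; the policy
    is piecewise constant over the resulting Voronoi cells."""

    def __init__(self, n_prototypes, state_dim, n_actions,
                 alpha=0.1, gamma=0.99, eta=0.01, epsilon=0.1, seed=0):
        self.rng = np.random.default_rng(seed)
        # Prototype locations in state space, plus one Q-value per action each.
        self.protos = self.rng.uniform(-1.0, 1.0, (n_prototypes, state_dim))
        self.q = np.zeros((n_prototypes, n_actions))
        self.alpha, self.gamma = alpha, gamma    # Sarsa step size, discount
        self.eta, self.epsilon = eta, epsilon    # LVQ rate, exploration rate

    def nearest(self, s):
        # Index of the prototype whose Voronoi cell contains state s.
        return int(np.argmin(np.linalg.norm(self.protos - np.asarray(s), axis=1)))

    def act(self, s):
        # Epsilon-greedy action from the Q-values of the winning prototype.
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(self.q.shape[1]))
        return int(np.argmax(self.q[self.nearest(s)]))

    def update(self, s, a, r, s_next, a_next, done):
        k, k_next = self.nearest(s), self.nearest(s_next)
        # Standard Sarsa TD update, applied to the winning prototype's entry.
        target = r if done else r + self.gamma * self.q[k_next, a_next]
        self.q[k, a] += self.alpha * (target - self.q[k, a])
        # LVQ-style move of the winner toward the visited state, so the
        # tessellation refines itself in regions the agent actually visits.
        self.protos[k] += self.eta * (np.asarray(s) - self.protos[k])
```

On a task such as cart centering, a training loop would call act to choose an action, step the environment, then call update with the resulting transition; because the policy is constant within each Voronoi cell, a handful of well-placed prototypes can stand in for a full lookup table over a discretized state space.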
Pages: 1909-1914
Page count: 6
Related Papers
50 in total; entries [31]-[40] shown below
  • [31] Cao, Jingyu; Dong, Lu; Sun, Changyin. Hierarchical reinforcement learning for kinematic control tasks with parameterized action spaces. Neural Computing and Applications, 2024, 36: 323-336.
  • [32] Richter, David J.; Calix, Ricardo A.; Kim, Kyungbaek. A review of reinforcement learning for fixed-wing aircraft control tasks. IEEE Access, 2024, 12: 103026-103048.
  • [33] Zambrano, Davide; Roelfsema, Pieter R.; Bohte, Sander M. Continuous-time on-policy neural reinforcement learning of working memory tasks. 2015 International Joint Conference on Neural Networks (IJCNN), 2015.
  • [34] Karamanis, Marios; Zambrano, Davide; Bohte, Sander. Continuous-time spike-based reinforcement learning for working memory tasks. Artificial Neural Networks and Machine Learning - ICANN 2018, Pt II, 2018, 11140: 250-262.
  • [35] Backman, Sofi; Lindmark, Daniel; Bodin, Kenneth; Servin, Martin; Mork, Joakim; Lofgren, Hakan. Continuous control of an underground loader using deep reinforcement learning. Machines, 2021, 9(10).
  • [36] Wang, Haoran; Zariphopoulou, Thaleia; Zhou, Xun Yu. Reinforcement learning in continuous time and space: a stochastic control approach. Journal of Machine Learning Research, 2020, 21.
  • [37] Howell, MN; Frost, GP; Gordon, TJ; Wu, QH. Continuous action reinforcement learning applied to vehicle suspension control. Mechatronics, 1997, 7(03): 263-276.
  • [38] Alhazmi, Khalid; Sarathy, S. Mani. Continuous control of complex chemical reaction network with reinforcement learning. 2020 European Control Conference (ECC 2020), 2020: 1066-1068.
  • [39] Neunert, Michael; Abdolmaleki, Abbas; Lampe, Thomas; Springenberg, Tobias; Hafner, Roland; Romano, Francesco; Buchli, Jonas; Heess, Nicolas; Riedmiller, Martin. Continuous-discrete reinforcement learning for hybrid control in robotics. Conference on Robot Learning, Vol 100, 2019.
  • [40] Hall, Joseph; Rasmussen, Carl Edward; Maciejowski, Jan. Reinforcement learning with reference tracking control in continuous state spaces. 2011 50th IEEE Conference on Decision and Control and European Control Conference (CDC-ECC), 2011: 6019-6024.