Continuous Control with a Combination of Supervised and Reinforcement Learning

Cited by: 0
Authors
Kangin, Dmitry [1 ]
Pugeault, Nicolas [1 ]
Affiliations
[1] Univ Exeter, Comp Sci Dept, Exeter EX4 4QF, Devon, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
DOI
None available
CLC Classification
TP18 [Artificial intelligence theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement learning methods have recently achieved impressive results on a wide range of control problems. However, especially with complex inputs, they still require an extensive amount of training data in order to converge to a meaningful solution. This limits their applicability to complex input spaces such as video signals, and makes them impractical for many complex real-world problems, including video-based control. Supervised learning, by contrast, can learn from a relatively limited number of samples, but relies on arbitrary hand-labelling of data rather than task-derived reward functions, and hence does not yield independent control policies. In this article we propose a novel, model-free approach, which uses a combination of reinforcement and supervised learning for autonomous control and paves the way towards policy-based control in real-world environments. Using the SpeedDreams/TORCS video game, we demonstrate that our approach requires far fewer samples (hundreds of thousands versus millions or tens of millions) compared to state-of-the-art reinforcement learning techniques on similar data, while at the same time outperforming both purely supervised and purely reinforcement learning approaches in terms of quality. Additionally, we demonstrate the applicability of the method to MuJoCo control problems.
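The general scheme the abstract describes (pretrain a policy with supervised learning on demonstrations, then refine it with a task-derived reward) can be illustrated with a minimal sketch. This is not the authors' actual architecture; the linear policy, Gaussian exploration, one-dimensional toy task, and REINFORCE-style update below are assumptions chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D continuous-control task: in state s the ideal action is a* = 2*s.
# The policy is linear, a = w*s; the reward penalises distance to the target.
def reward(s, a):
    return -(a - 2.0 * s) ** 2

# --- Stage 1: supervised pretraining on a few noisy "demonstrations" ---
states = rng.uniform(-1.0, 1.0, size=50)
demo_actions = 2.0 * states + rng.normal(0.0, 0.1, size=50)  # noisy expert
# Least-squares fit of w (closed form for this linear, 1-D case).
w = float(np.dot(states, demo_actions) / np.dot(states, states))

# --- Stage 2: reward-driven fine-tuning with a Gaussian exploration policy ---
sigma, lr = 0.2, 0.05
for _ in range(200):
    s = rng.uniform(-1.0, 1.0)
    a = w * s + sigma * rng.normal()  # sample an exploratory action
    r = reward(s, a)
    # REINFORCE: grad of log pi(a|s) w.r.t. w for a Gaussian policy
    # is (a - w*s) * s / sigma^2; ascend the expected reward.
    w += lr * r * (a - w * s) * s / sigma**2

# The supervised stage puts w near the expert's gain of 2.0; the RL stage
# then only has to adjust it using the reward, needing far fewer samples
# than learning from reward alone.
```

The point of the two-stage structure is that the supervised fit gives the reward-driven stage a good initialisation, so the high-variance policy-gradient updates only fine-tune rather than search from scratch.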
Pages: 163-170
Page count: 8
Related Papers
50 records in total
  • [1] Personalized vital signs control based on continuous action-space reinforcement learning with supervised experience
    Sun, Chenxi
    Hong, Shenda
    Song, Moxian
    Shang, Junyuan
    Li, Hongyan
    [J]. BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2021, 69
  • [2] Reinforcement learning for continuous stochastic control problems
    Munos, R
    Bourgine, P
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 10, 1998, 10 : 1029 - 1035
  • [3] Competitive reinforcement learning in continuous control tasks
    Abramson, M
    Pachowicz, P
    Wechsler, H
    [J]. PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS 2003, VOLS 1-4, 2003, : 1909 - 1914
  • [4] Benchmarking Deep Reinforcement Learning for Continuous Control
    Duan, Yan
    Chen, Xi
    Houthooft, Rein
    Schulman, John
    Abbeel, Pieter
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [5] Learning Continuous Control Actions for Robotic Grasping with Reinforcement Learning
    Shahid, Asad Ali
    Roveda, Loris
    Piga, Dario
    Braghin, Francesco
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020, : 4066 - 4072
  • [6] A study of the combination of LQR control and PILCO reinforcement learning
    Yoo, Jae Hyun
    [J]. Journal of Institute of Control, Robotics and Systems, 2019, 25 (10) : 891 - 895
  • [7] Autonomous Surface Craft Continuous Control with Reinforcement Learning
    Andrey, Sorokin
    Ogli, Farkhadov Mais Pasha
    [J]. 2021 IEEE 15TH INTERNATIONAL CONFERENCE ON APPLICATION OF INFORMATION AND COMMUNICATION TECHNOLOGIES (AICT2021), 2021,
  • [8] A Tour of Reinforcement Learning: The View from Continuous Control
    Recht, Benjamin
    [J]. ANNUAL REVIEW OF CONTROL, ROBOTICS, AND AUTONOMOUS SYSTEMS, VOL 2, 2019, 2 : 253 - 279
  • [9] Continuous control of a polymerization system with deep reinforcement learning
    Ma, Yan
    Zhu, Wenbo
    Benton, Michael G.
    Romagnoli, Jose
    [J]. JOURNAL OF PROCESS CONTROL, 2019, 75 : 40 - 47
  • [10] Hierarchical Deep Reinforcement Learning for Continuous Action Control
    Yang, Zhaoyang
    Merrick, Kathryn
    Jin, Lianwen
    Abbass, Hussein A.
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (11) : 5174 - 5184