Discrete-to-deep reinforcement learning methods

Cited by: 1

Authors:

Kurniawan, Budi [1]

Vamplew, Peter [1]

Papasimeon, Michael [2]

Dazeley, Richard [3]

Foale, Cameron [1]

Affiliations:

[1] Federat Univ, Mt Helen, Vic 3350, Australia

[2] Def Sci & Technol Grp, Fishermans Bend, Vic 3207, Australia

[3] Deakin Univ, Sch Informat Technol, Geelong, Vic 3220, Australia

Source:

NEURAL COMPUTING & APPLICATIONS | 2022 / Vol. 34 / No. 03

Keywords:

Reinforcement learning; Neural network; Actor-critic; Supervised learning; DQN;

DOI:

10.1007/s00521-021-06270-6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Neural networks are effective function approximators, but hard to train in the reinforcement learning (RL) context mainly because samples are correlated. In complex problems, a neural RL approach is often able to learn a better solution than tabular RL, but generally takes longer. This paper proposes two methods, Discrete-to-Deep Supervised Policy Learning (D2D-SPL) and Discrete-to-Deep Supervised Q-value Learning (D2D-SQL), whose objective is to acquire the generalisability of a neural network at a cost nearer to that of a tabular method. Both methods combine RL and supervised learning (SL) and are based on the idea that a fast-learning tabular method can generate off-policy data to accelerate learning in neural RL. D2D-SPL uses the data to train a classifier which is then used as a controller for the RL problem. D2D-SQL uses the data to initialise a neural network which is then allowed to continue learning using another RL method. We demonstrate the viability of our algorithms with Cartpole, Lunar Lander and an aircraft manoeuvring problem, three continuous-space environments with low-dimensional state variables. Both methods learn at least 38% faster than baseline methods and yield policies that outperform them.
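The D2D-SPL idea summarised in the abstract (a tabular learner produces data, and a classifier trained on that data becomes the controller) can be illustrated with a toy sketch. This is not the authors' implementation: the 1-D environment, the discretisation into `N_BINS` bins, the hyperparameters, the use of tabular Q-learning with a random behaviour policy, and the logistic-regression classifier are all assumptions made purely for illustration.

```python
import math
import random

# All constants below are illustrative assumptions, not values from the paper.
N_BINS, STEP, GAMMA, ALPHA = 10, 0.2, 0.9, 0.1

def to_bin(x):
    """Map a continuous state in [-1, 1] to one of N_BINS table rows."""
    return min(N_BINS - 1, max(0, int((x + 1.0) / 2.0 * N_BINS)))

def env_step(x, action):
    """Hypothetical toy task: action 0 moves left, 1 moves right; reward is -|x'|."""
    nxt = max(-1.0, min(1.0, x + (STEP if action == 1 else -STEP)))
    return nxt, -abs(nxt)

def train_tabular(episodes=3000, horizon=10, seed=0):
    """Tabular Q-learning driven by a uniformly random behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_BINS)]
    for _ in range(episodes):
        x = rng.uniform(-1.0, 1.0)
        for _ in range(horizon):
            a = rng.randrange(2)
            nxt, r = env_step(x, a)
            target = r + GAMMA * max(q[to_bin(nxt)])
            q[to_bin(x)][a] += ALPHA * (target - q[to_bin(x)][a])
            x = nxt
    return q

def make_dataset(q, n=400, seed=1):
    """Label continuous states with the greedy action of the tabular policy."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [1 if q[to_bin(x)][1] > q[to_bin(x)][0] else 0 for x in xs]
    return xs, ys

def fit_classifier(xs, ys, epochs=300, lr=1.0):
    """Logistic regression on the raw state, fitted by batch gradient descent."""
    w, b, n = 0.0, 0.0, len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x
            gb += p - y
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def controller(w, b, x):
    """Use the trained classifier as the policy for a continuous state."""
    return 1 if w * x + b > 0 else 0

q_table = train_tabular()
xs, ys = make_dataset(q_table)
w, b = fit_classifier(xs, ys)
```

In this sketch the classifier sees the continuous state directly, so it can generalise across the bin boundaries that the table treats as hard edges, which is the kind of benefit the abstract attributes to moving from a tabular to a neural representation.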


Pages: 1713 - 1733

Page count: 21

Related records: 50 in total

[41] A Survey on Deep Reinforcement Learning
Liu Q.
Zhai J.-W.
Zhang Z.-Z.
Zhong S.
Zhou Q.
Zhang P.
Xu J.
[J]. 2018, Science Press (41): 1 - 27
[42] Implementation of Deep Reinforcement Learning
Li, Meng-Jhe
Li, An-Hong
Huang, Yu-Jung
Chu, Shao-I
[J]. PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND SYSTEMS (ICISS 2019), 2019, : 232 - 236
[43] Deep reinforcement learning: a survey
Wang, Hao-nan
Liu, Ning
Zhang, Yi-yun
Feng, Da-wei
Huang, Feng
Li, Dong-sheng
Zhang, Yi-ming
[J]. FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2020, 21 (12) : 1726 - 1744
[44] Deep reinforcement learning: a survey
Hao-nan Wang
Ning Liu
Yi-yun Zhang
Da-wei Feng
Feng Huang
Dong-sheng Li
Yi-ming Zhang
[J]. Frontiers of Information Technology & Electronic Engineering, 2020, 21 : 1726 - 1744
[45] Deep Reinforcement Learning and Games
Zhao, Dongbin
Lucas, Simon
Togelius, Julian
[J]. IEEE COMPUTATIONAL INTELLIGENCE MAGAZINE, 2019, 14 (03) : 7 - 7
[46] Deep Ordinal Reinforcement Learning
Zap, Alexander
Joppen, Tobias
Furnkranz, Johannes
[J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT III, 2020, 11908 : 3 - 18
[47] Deep Reinforcement Learning that Matters
Henderson, Peter
Islam, Riashat
Bachman, Philip
Pineau, Joelle
Precup, Doina
Meger, David
[J]. THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 3207 - 3214
[48] Double Deep Reinforcement Learning
Kiefer, Josue
Dorer, Klaus
[J]. 2023 IEEE INTERNATIONAL CONFERENCE ON AUTONOMOUS ROBOT SYSTEMS AND COMPETITIONS, ICARSC, 2023, : 17 - 22
[49] The neurobiology of deep reinforcement learning
Gershman, Samuel J.
Olveczky, Bence P.
[J]. CURRENT BIOLOGY, 2020, 30 (11) : R629 - R632
[50] Coevolutionary Deep Reinforcement Learning
Cotton, David
Traish, Jason
Chaczko, Zenon
[J]. 2020 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2020, : 2600 - 2607
