Discrete-to-deep reinforcement learning methods

Cited by: 1
Authors
Kurniawan, Budi [1]
Vamplew, Peter [1]
Papasimeon, Michael [2]
Dazeley, Richard [3]
Foale, Cameron [1]
Affiliations
[1] Federat Univ, Mt Helen, Vic 3350, Australia
[2] Def Sci & Technol Grp, Fishermans Bend, Vic 3207, Australia
[3] Deakin Univ, Sch Informat Technol, Geelong, Vic 3220, Australia
Source
NEURAL COMPUTING & APPLICATIONS | 2022, Vol. 34, Issue 03
Keywords
Reinforcement learning; Neural network; Actor-critic; Supervised learning; DQN;
DOI
10.1007/s00521-021-06270-6
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural networks are effective function approximators, but hard to train in the reinforcement learning (RL) context mainly because samples are correlated. In complex problems, a neural RL approach is often able to learn a better solution than tabular RL, but generally takes longer. This paper proposes two methods, Discrete-to-Deep Supervised Policy Learning (D2D-SPL) and Discrete-to-Deep Supervised Q-value Learning (D2D-SQL), whose objective is to acquire the generalisability of a neural network at a cost nearer to that of a tabular method. Both methods combine RL and supervised learning (SL) and are based on the idea that a fast-learning tabular method can generate off-policy data to accelerate learning in neural RL. D2D-SPL uses the data to train a classifier which is then used as a controller for the RL problem. D2D-SQL uses the data to initialise a neural network which is then allowed to continue learning using another RL method. We demonstrate the viability of our algorithms with Cartpole, Lunar Lander and an aircraft manoeuvring problem, three continuous-space environments with low-dimensional state variables. Both methods learn at least 38% faster than baseline methods and yield policies that outperform them.
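The D2D-SPL idea described in the abstract (a tabular learner produces state-action data, and a classifier trained on that data becomes the controller) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the 1-D toy task, the bin count, the hyperparameters, and all function names are hypothetical, and a one-feature logistic regression stands in for the neural network classifier. The abstract does not say how the paper selects or filters the tabular data, so this sketch simply records greedy rollouts.

```python
import math
import random

N_BINS, N_ACTIONS, STEPS = 10, 2, 20  # hypothetical settings, not from the paper

def step(x, action, rng):
    """Toy 1-D task (assumed): action 0 moves left, 1 moves right; reward is -|x|."""
    move = 0.1 if action == 1 else -0.1
    x = max(-1.0, min(1.0, x + move + rng.uniform(-0.02, 0.02)))
    return x, -abs(x)

def to_bin(x):
    """Discretise a continuous state in [-1, 1] into one of N_BINS cells."""
    return min(int((x + 1.0) / 2.0 * N_BINS), N_BINS - 1)

def greedy(q_row):
    """Index of the highest-valued action (ties go to the lower index)."""
    return max(range(N_ACTIONS), key=lambda a: q_row[a])

def tabular_q_learning(rng, episodes=2000, alpha=0.1, gamma=0.9, eps=0.2):
    """Stage 1: epsilon-greedy tabular Q-learning on the discretised state."""
    q = [[0.0] * N_ACTIONS for _ in range(N_BINS)]
    for _ in range(episodes):
        x = rng.uniform(-1.0, 1.0)
        for _ in range(STEPS):
            s = to_bin(x)
            a = rng.randrange(N_ACTIONS) if rng.random() < eps else greedy(q[s])
            x, r = step(x, a, rng)
            q[s][a] += alpha * (r + gamma * max(q[to_bin(x)]) - q[s][a])
    return q

def collect_dataset(q, rng, episodes=50):
    """Stage 2: record (continuous state, action) pairs from the greedy tabular policy."""
    data = []
    for _ in range(episodes):
        x = rng.uniform(-1.0, 1.0)
        for _ in range(STEPS):
            a = greedy(q[to_bin(x)])
            data.append((x, a))
            x, _reward = step(x, a, rng)
    return data

def train_classifier(data, epochs=300, lr=1.0):
    """Stage 3: supervised learning; logistic regression stands in for a neural network."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, a in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # predicted P(action = 1 | x)
            grad_w += (p - a) * x
            grad_b += p - a
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def classifier_policy(w, b, x):
    """The trained classifier used directly as the controller on continuous states."""
    return 1 if w * x + b > 0 else 0

rng = random.Random(0)
q_table = tabular_q_learning(rng)
dataset = collect_dataset(q_table, rng)
w, b = train_classifier(dataset)
print(classifier_policy(w, b, 0.8), classifier_policy(w, b, -0.8))
```

The resulting controller acts on the continuous state directly, so it no longer depends on the bin boundaries used during tabular learning. D2D-SQL, as the abstract describes it, would instead use the tabular data to initialise a Q-value network and then continue training it with another RL method; that stage is not sketched here.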
Pages: 1713-1733
Page count: 21