Knowledge Transfer for Deep Reinforcement Learning with Hierarchical Experience Replay

Authors
Yin, Haiyan [1]
Pan, Sinno Jialin [1]
Affiliations
[1] Nanyang Technological University, School of Computer Science and Engineering, Singapore, Singapore
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Policy distillation is the process of transferring the knowledge of multiple reinforcement learning policies into a single multi-task policy via a distillation technique. In a deep reinforcement learning setting, the large number of parameters and the huge state space of each task domain make training the multi-task policy network computationally expensive. In this paper, we propose a new policy distillation architecture for deep reinforcement learning, where we assume that each task uses its task-specific high-level convolutional features as the inputs to the multi-task policy network. Furthermore, we propose a new sampling framework, termed hierarchical prioritized experience replay, to selectively choose experiences from the replay memories of each task domain to perform learning on the network. With these two contributions, we aim to accelerate the learning of the multi-task policy network while still ensuring good performance. We use Atari 2600 games as the testing environment to demonstrate the efficiency and effectiveness of our proposed solution for policy distillation.
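As a rough illustration of the distillation objective the abstract refers to, the sketch below computes a KL divergence between a temperature-softened teacher distribution and a student distribution for a single state. This is one common formulation from the broader policy distillation literature and is an assumption here; the abstract does not state which loss the authors use. The function names, the example values, and the default temperature are all hypothetical.

```python
import math


def log_softmax(values, temperature=1.0):
    """Numerically stable log-softmax of values / temperature."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_total for s in scaled]


def distillation_loss(teacher_q, student_logits, temperature=0.01):
    """KL(teacher || student) for one state.

    teacher_q: action values from a task-specific teacher (assumed input).
    student_logits: outputs of the multi-task student for the same state.
    temperature: hypothetical default; values below 1 sharpen the teacher.
    """
    log_p = log_softmax(teacher_q, temperature)   # softened teacher
    log_q = log_softmax(student_logits)           # student, temperature 1
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


# Identical distributions give (near) zero loss; mismatched ones give a positive loss.
same = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], temperature=1.0)
different = distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

In a multi-task setup, a loss of this kind would typically be averaged over states sampled from each task's replay memory; how those states are chosen is what the hierarchical prioritized sampling in the abstract addresses, and it is not sketched here.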
Pages: 1640-1646
Number of pages: 7