Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning

Cited by: 0

Authors:

Nagabandi, Anusha [1]

Kahn, Gregory [1]

Fearing, Ronald S. [1]

Levine, Sergey [1]

Affiliation:

[1] Univ Calif Berkeley, Berkeley, CA 94720 USA

Source:

2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA) | 2018

Funding:

National Science Foundation (USA)

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Model-free deep reinforcement learning algorithms have been shown to be capable of learning a wide range of robotic skills, but typically require a very large number of samples to achieve good performance. Model-based algorithms, in principle, can provide for much more efficient learning, but have proven difficult to extend to expressive, high-capacity models such as deep neural networks. In this work, we demonstrate that neural network dynamics models can in fact be combined with model predictive control (MPC) to achieve excellent sample complexity in a model-based reinforcement learning algorithm, producing stable and plausible gaits that accomplish various complex locomotion tasks. We further propose using deep neural network dynamics models to initialize a model-free learner, in order to combine the sample efficiency of model-based approaches with the high task-specific performance of model-free methods. We empirically demonstrate on MuJoCo locomotion tasks that our pure model-based approach trained on just random action data can follow arbitrary trajectories with excellent sample efficiency, and that our hybrid algorithm can accelerate model-free learning on high-speed benchmark tasks, achieving sample efficiency gains of 3-5x on swimmer, cheetah, hopper, and ant agents. Videos can be found at https://sites.google.com/view/mbmf
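
To make the idea of pairing a dynamics model with MPC more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simple random-shooting planner, a hypothetical `toy_dynamics` function standing in for a trained neural network, and a hypothetical distance-to-goal `reward`; none of these names or parameter values come from the abstract.

```python
import numpy as np


def toy_dynamics(states, actions):
    """Hypothetical stand-in for a learned model f(s, a) -> next state.

    A real system would call a trained neural network here. In this toy,
    the action has the same dimension as the state and nudges it directly.
    """
    return states + 0.1 * actions


def reward(states, goal):
    """Hypothetical reward: negative Euclidean distance to a goal state."""
    return -np.linalg.norm(states - goal, axis=-1)


def mpc_random_shooting(state, goal, dynamics, horizon=10,
                        n_candidates=500, rng=None):
    """Pick an action by sampling action sequences and rolling out the model."""
    rng = np.random.default_rng() if rng is None else rng
    act_dim = state.shape[0]  # assumption of the toy: action dim == state dim
    # Sample candidate action sequences uniformly in [-1, 1].
    actions = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, act_dim))
    # Roll every candidate forward through the model in parallel.
    states = np.repeat(state[None, :], n_candidates, axis=0)
    returns = np.zeros(n_candidates)
    for t in range(horizon):
        states = dynamics(states, actions[:, t])
        returns += reward(states, goal)
    # Execute only the first action of the best sequence, then replan.
    best = np.argmax(returns)
    return actions[best, 0]


def run_episode(steps=50, seed=0):
    """Closed-loop control: replan at every step from the current state."""
    rng = np.random.default_rng(seed)
    state = np.zeros(2)
    goal = np.array([1.0, -1.0])
    for _ in range(steps):
        action = mpc_random_shooting(state, goal, toy_dynamics, rng=rng)
        # Here the "real" environment is the same toy function, for simplicity.
        state = toy_dynamics(state[None, :], action[None, :])[0]
    return state, goal
```

Calling `run_episode()` returns the final state and the goal. Replanning at every step is what makes this MPC rather than open-loop planning: errors in the model's long-horizon predictions matter less because only the first planned action is ever executed.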

Pages: 7579-7586

Page count: 8
