Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning

Cited by: 0
Authors
Nagabandi, Anusha [1]
Kahn, Gregory [1]
Fearing, Ronald S. [1]
Levine, Sergey [1]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
Funding
National Science Foundation (USA);
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Model-free deep reinforcement learning algorithms have been shown to be capable of learning a wide range of robotic skills, but typically require a very large number of samples to achieve good performance. Model-based algorithms, in principle, can provide for much more efficient learning, but have proven difficult to extend to expressive, high-capacity models such as deep neural networks. In this work, we demonstrate that neural network dynamics models can in fact be combined with model predictive control (MPC) to achieve excellent sample complexity in a model-based reinforcement learning algorithm, producing stable and plausible gaits that accomplish various complex locomotion tasks. We further propose using deep neural network dynamics models to initialize a model-free learner, in order to combine the sample efficiency of model-based approaches with the high task-specific performance of model-free methods. We empirically demonstrate on MuJoCo locomotion tasks that our pure model-based approach trained on just random action data can follow arbitrary trajectories with excellent sample efficiency, and that our hybrid algorithm can accelerate model-free learning on high-speed benchmark tasks, achieving sample efficiency gains of 3-5x on swimmer, cheetah, hopper, and ant agents. Videos can be found at https://sites.google.com/view/mbmf
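The model-based loop described in the abstract (fit a dynamics model on random-action data, then plan with MPC) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not say which planner is used, so a simple random-shooting planner is assumed; the 2-D point environment `true_step`, the reward, and all hyperparameters are hypothetical; and a linear least-squares fit stands in for the neural network dynamics model to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_step(s, a):
    # Hypothetical toy environment (not from the paper): a 2-D point
    # whose position shifts by a scaled action.
    return s + 0.1 * a

# 1) Collect transitions using random actions only.
S = rng.uniform(-1, 1, size=(500, 2))
A = rng.uniform(-1, 1, size=(500, 2))
S_next = true_step(S, A)

# 2) Fit a model of the state change (s' - s) from (s, a).
#    Assumption: linear least squares stands in for the neural network.
X = np.hstack([S, A])
W, *_ = np.linalg.lstsq(X, S_next - S, rcond=None)

def model_step(s, a):
    # Predict next states for a batch of states and actions.
    return s + np.hstack([s, a]) @ W

def reward(s, goal):
    # Hypothetical reward: negative distance to the goal.
    return -np.linalg.norm(s - goal, axis=-1)

def mpc_action(s, goal, horizon=5, n_candidates=200):
    # Random shooting: sample candidate action sequences, roll each one
    # out through the learned model, and return the first action of the
    # sequence with the highest predicted return.
    seqs = rng.uniform(-1, 1, size=(n_candidates, horizon, 2))
    states = np.tile(s, (n_candidates, 1))
    returns = np.zeros(n_candidates)
    for t in range(horizon):
        states = model_step(states, seqs[:, t])
        returns += reward(states, goal)
    return seqs[np.argmax(returns), 0]

def run_episode(start, goal, steps=40):
    s = np.asarray(start, dtype=float)
    for _ in range(steps):
        # Replan at every step and execute only the first planned action.
        s = true_step(s, mpc_action(s, goal))
    return s
```

Calling `run_episode([0.0, 0.0], np.array([0.5, -0.5]))` should move the point toward the goal. Replanning at every step is what makes this MPC rather than open-loop planning: errors in the learned model are corrected as new states are observed.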
Pages: 7579-7586
Page count: 8
Related papers
50 in total
  • [21] Dyna-style Model-based reinforcement learning with Model-Free Policy Optimization
    Dong, Kun
    Luo, Yongle
    Wang, Yuxin
    Liu, Yu
    Qu, Chengeng
    Zhang, Qiang
    Cheng, Erkang
    Sun, Zhiyong
    Song, Bo
    [J]. KNOWLEDGE-BASED SYSTEMS, 2024, 287
  • [22] The modulation of acute stress on model-free and model-based reinforcement learning in gambling disorder
    Wyckmans, Florent
    Banerjee, Nilosmita
    Saeremans, Melanie
    Otto, Ross
    Kornreich, Charles
    Vanderijst, Laetitia
    Gruson, Damien
    Carbone, Vincenzo
    Bechara, Antoine
    Buchanan, Tony
    Noel, Xavier
    [J]. JOURNAL OF BEHAVIORAL ADDICTIONS, 2022, 11 (03) : 831 - 844
  • [23] Adaptive Weight Tuning of EWMA Controller via Model-Free Deep Reinforcement Learning
    Ma, Zhu
    Pan, Tianhong
    [J]. IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING, 2023, 36 (01) : 91 - 99
  • [24] Curious Meta-Controller: Adaptive Alternation between Model-Based and Model-Free Control in Deep Reinforcement Learning
    Hafez, Muhammad Burhan
    Weber, Cornelius
    Kerzel, Matthias
    Wermter, Stefan
    [J]. 2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019
  • [25] States versus Rewards: Dissociable Neural Prediction Error Signals Underlying Model-Based and Model-Free Reinforcement Learning
    Glaescher, Jan
    Daw, Nathaniel
    Dayan, Peter
    O'Doherty, John P.
    [J]. NEURON, 2010, 66 (04) : 585 - 595
  • [26] Model-based decision making and model-free learning
    Drummond, Nicole
    Niv, Yael
    [J]. CURRENT BIOLOGY, 2020, 30 (15) : R860 - R865
  • [27] Model-free voltage control of active distribution system with PVs using surrogate model-based deep reinforcement learning
    Cao, Di
    Zhao, Junbo
    Hu, Weihao
    Ding, Fei
    Yu, Nanpeng
    Huang, Qi
    Chen, Zhe
    [J]. APPLIED ENERGY, 2022, 306
  • [28] Model-Free and Model-Based Active Learning for Regression
    O'Neill, Jack
    Delany, Sarah Jane
    MacNamee, Brian
    [J]. ADVANCES IN COMPUTATIONAL INTELLIGENCE SYSTEMS, 2017, 513 : 375 - 386
  • [29] LEARNING UNDER UNCERTAINTY: NEURAL MARKERS OF MODEL-FREE AND MODEL-BASED LEARNING IN PROBABILISTIC ENVIRONMENTS
    Wurm, Franz
    Ernst, Benjamin
    Steinhauser, Marco
    [J]. PSYCHOPHYSIOLOGY, 2017, 54 : S127 - S127
  • [30] Comparative study of model-based and model-free reinforcement learning control performance in HVAC systems
    Gao, Cheng
    Wang, Dan
    [J]. JOURNAL OF BUILDING ENGINEERING, 2023, 74