Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning

Cited by: 0
Authors
Nagabandi, Anusha [1 ]
Kahn, Gregory [1 ]
Fearing, Ronald S. [1 ]
Levine, Sergey [1 ]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
Funding
U.S. National Science Foundation
DOI
Not available
CLC classification
TP [Automation & Computer Technology]
Discipline code
0812
Abstract
Model-free deep reinforcement learning algorithms have been shown to be capable of learning a wide range of robotic skills, but typically require a very large number of samples to achieve good performance. Model-based algorithms, in principle, can provide for much more efficient learning, but have proven difficult to extend to expressive, high-capacity models such as deep neural networks. In this work, we demonstrate that neural network dynamics models can in fact be combined with model predictive control (MPC) to achieve excellent sample complexity in a model-based reinforcement learning algorithm, producing stable and plausible gaits that accomplish various complex locomotion tasks. We further propose using deep neural network dynamics models to initialize a model-free learner, in order to combine the sample efficiency of model-based approaches with the high task-specific performance of model-free methods. We empirically demonstrate on MuJoCo locomotion tasks that our pure model-based approach trained on just random action data can follow arbitrary trajectories with excellent sample efficiency, and that our hybrid algorithm can accelerate model-free learning on high-speed benchmark tasks, achieving sample efficiency gains of 3-5x on swimmer, cheetah, hopper, and ant agents. Videos can be found at https://sites.google.com/view/mbmf
Pages: 7579-7586
Page count: 8
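The model-based component the abstract describes couples a learned neural-network dynamics model with sampling-based model predictive control: sample many random action sequences, roll each out through the learned model, score them with the task reward, and execute only the first action of the best sequence before replanning. A minimal sketch of that random-shooting MPC loop is below; `toy_dynamics` and `toy_reward` are illustrative stand-ins, not the paper's learned model or reward.

```python
import numpy as np

def mpc_random_shooting(dynamics, reward, state, action_dim,
                        horizon=10, n_candidates=1000, seed=0):
    """Return the first action of the highest-return random action sequence.

    dynamics(states, actions) -> next states, batched one-step prediction
    (stands in for the learned neural-network model); reward(states) ->
    per-candidate scalar reward used for planning.
    """
    rng = np.random.default_rng(seed)
    # Sample candidate action sequences uniformly in [-1, 1].
    actions = rng.uniform(-1.0, 1.0,
                          size=(n_candidates, horizon, action_dim))
    states = np.repeat(state[None, :], n_candidates, axis=0)
    returns = np.zeros(n_candidates)
    for t in range(horizon):
        states = dynamics(states, actions[:, t, :])  # propagate all rollouts
        returns += reward(states)                    # accumulate planning reward
    best = int(np.argmax(returns))
    return actions[best, 0, :]  # execute only the first action, then replan

# Toy stand-ins: linear dynamics, reward for driving the state to the origin.
toy_dynamics = lambda s, a: s + 0.1 * a
toy_reward = lambda s: -np.sum(s ** 2, axis=-1)

state = np.array([1.0, -1.0])
act = mpc_random_shooting(toy_dynamics, toy_reward, state, action_dim=2)
```

In the paper's hybrid scheme, rollouts collected by this MPC controller would then be used to initialize a model-free learner via imitation before standard model-free fine-tuning.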