Hybrid control for combining model-based and model-free reinforcement learning

Cited by: 10
Authors
Pinosky, Allison [1 ]
Abraham, Ian [2 ]
Broad, Alexander [3 ]
Argall, Brenna [1 ,4 ]
Murphey, Todd D. [1 ]
Affiliations
[1] Northwestern Univ, Dept Mech Engn, 633 Clark St, Evanston, IL 60208 USA
[2] Yale Univ, Dept Mech Engn & Mat Sci, New Haven, CT USA
[3] Boston Dynam, Waltham, MA USA
[4] Northwestern Univ, Dept Elect Engn & Comp Sci, Evanston, IL 60208 USA
Source
The International Journal of Robotics Research
Funding
National Science Foundation (USA)
Keywords
Reinforcement learning; learning theory; optimal control; hybrid control; dynamics
DOI
10.1177/02783649221083331
Chinese Library Classification
TP24 [Robotics]
Subject classification codes
080202; 1405
Abstract
We develop an approach to improve the learning capabilities of robotic systems by combining learned predictive models with experience-based state-action policy mappings. Predictive models provide an understanding of the task and the dynamics, while experience-based (model-free) policy mappings encode favorable actions that override planned actions. We refer to our approach of systematically combining model-based and model-free learning methods as hybrid learning. Our approach efficiently learns motor skills and improves the performance of predictive models and experience-based policies. Moreover, our approach enables policies (both model-based and model-free) to be updated using any off-policy reinforcement learning method. We derive a deterministic method of hybrid learning by optimally switching between learning modalities. We adapt our method to a stochastic variation that relaxes some of the key assumptions in the original derivation. Our deterministic and stochastic variations are tested on a variety of robot control benchmark tasks in simulation as well as a hardware manipulation task. We extend our approach for use with imitation learning methods, where experience is provided through demonstrations, and we test the expanded capability with a real-world pick-and-place task. The results show that our method is capable of improving the performance and sample efficiency of learning motor skills in a variety of experimental domains.
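The abstract describes the method only at a high level. As an illustration of the general idea of hybrid learning, and not the authors' exact derivation, the minimal Python sketch below switches between a model-based planner's action and a model-free policy's action at each step, in both a deterministic (thresholded) and a stochastic form loosely mirroring the two variants named above. The functions `model_plan`, `policy`, and `value_gain`, and the parameters `threshold` and `beta`, are hypothetical placeholders standing in for the paper's learned predictive model, experience-based policy, and switching criterion.

```python
import numpy as np

# Hypothetical components (assumed, not from the paper):
#   model_plan(state)            -> action proposed by the model-based planner
#   policy(state)                -> action proposed by the model-free policy
#   value_gain(s, a_mb, a_mf)    -> estimated improvement from overriding the
#                                   planned action with the policy action
#                                   (e.g., a Q-value gap)

def hybrid_action(state, model_plan, policy, value_gain, threshold=0.0):
    """Deterministic hybrid switching (illustrative only): execute the
    model-based plan by default, but let the model-free policy override
    it when the estimated gain exceeds a threshold."""
    a_mb = model_plan(state)
    a_mf = policy(state)
    if value_gain(state, a_mb, a_mf) > threshold:
        return a_mf  # experience-based action overrides the plan
    return a_mb      # otherwise execute the planned action

def stochastic_hybrid_action(state, model_plan, policy, value_gain,
                             beta=1.0, rng=None):
    """Stochastic variant (illustrative only): sample which modality to
    follow with probability increasing in the estimated gain."""
    rng = rng or np.random.default_rng()
    a_mb = model_plan(state)
    a_mf = policy(state)
    # Squash the gain into an override probability in [0, 1].
    p_override = 1.0 / (1.0 + np.exp(-beta * value_gain(state, a_mb, a_mf)))
    return a_mf if rng.random() < p_override else a_mb
```

A caller might, for example, define `value_gain` as a learned critic's advantage of the policy action over the planned action, `lambda s, a, b: q(s, b) - q(s, a)`; that choice is an assumption here, chosen only to make the sketch concrete.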
Pages: 337-355
Number of pages: 19
Related papers
50 records in total (items [41]-[50] shown)
  • [41] Yang, M., Lin, Y., Church, A., Lloyd, J., Zhang, D., Barton, D. A. W., Lepora, N. F. Sim-to-Real Model-Based and Model-Free Deep Reinforcement Learning for Tactile Pushing. IEEE Robotics and Automation Letters, 2023, 8(9): 5480-5487.
  • [42] McDannald, M. A., Lucantonio, F., Burke, K. A., Niv, Y., Schoenbaum, G. Ventral Striatum and Orbitofrontal Cortex Are Both Required for Model-Based, But Not Model-Free, Reinforcement Learning. Journal of Neuroscience, 2011, 31(7): 2700-2705.
  • [43] Hafez, M. B., Weber, C., Kerzel, M., Wermter, S. Curious Meta-Controller: Adaptive Alternation between Model-Based and Model-Free Control in Deep Reinforcement Learning. 2019 International Joint Conference on Neural Networks (IJCNN), 2019.
  • [44] Syafiie, S., Tadeo, F., Martinez, E., Alvarez, T. Model-free control based on reinforcement learning for a wastewater treatment problem. Applied Soft Computing, 2011, 11(1): 73-82.
  • [45] Wang, J. Y., Gao, W., Shan, S. G., Hu, X. P. Facial feature tracking combining model-based and model-free method. 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vol. III, 2003: 205-208.
  • [46] Wang, J. Y., Gao, W., Shan, S. G., Hu, X. P. Facial feature tracking combining model-based and model-free method. 2003 International Conference on Multimedia and Expo (ICME), Vol. III, 2003: 125-128.
  • [47] Cao, D., Zhao, J., Hu, W., Ding, F., Yu, N., Huang, Q., Chen, Z. Model-free voltage control of active distribution system with PVs using surrogate model-based deep reinforcement learning. Applied Energy, 2022, 306.
  • [48] Shah, H., Gopal, M. Model-Free Predictive Control of Nonlinear Processes Based on Reinforcement Learning. IFAC-PapersOnLine, 2016, 49(1): 89-94.
  • [49] Zhang, Q., Luo, R., Zhao, D., Luo, C., Qian, D. Model-Free Reinforcement Learning based Lateral Control for Lane Keeping. 2019 International Joint Conference on Neural Networks (IJCNN), 2019.
  • [50] Doody, M., Van Swieten, M. M. H., Manohar, S. G. Model-based learning retrospectively updates model-free values. Scientific Reports, 12.