Hybrid control for combining model-based and model-free reinforcement learning

Cited by: 10
Authors
Pinosky, Allison [1 ]
Abraham, Ian [2 ]
Broad, Alexander [3 ]
Argall, Brenna [1 ,4 ]
Murphey, Todd D. [1 ]
Affiliations
[1] Northwestern Univ, Dept Mech Engn, 633 Clark St, Evanston, IL 60208 USA
[2] Yale Univ, Dept Mech Engn & Mat Sci, New Haven, CT USA
[3] Boston Dynam, Waltham, MA USA
[4] Northwestern Univ, Dept Elect Engn & Comp Sci, Evanston, IL 60208 USA
Source
The International Journal of Robotics Research
Funding
National Science Foundation (USA)
Keywords
Reinforcement learning; learning theory; optimal control; hybrid control; dynamics
DOI
10.1177/02783649221083331
Chinese Library Classification
TP24 [Robotics]
Subject classification codes
080202; 1405
Abstract
We develop an approach to improve the learning capabilities of robotic systems by combining learned predictive models with experience-based state-action policy mappings. Predictive models provide an understanding of the task and the dynamics, while experience-based (model-free) policy mappings encode favorable actions that override planned actions. We refer to our approach of systematically combining model-based and model-free learning methods as hybrid learning. Our approach efficiently learns motor skills and improves the performance of predictive models and experience-based policies. Moreover, our approach enables policies (both model-based and model-free) to be updated using any off-policy reinforcement learning method. We derive a deterministic method of hybrid learning by optimally switching between learning modalities. We adapt our method to a stochastic variation that relaxes some of the key assumptions in the original derivation. Our deterministic and stochastic variations are tested on a variety of robot control benchmark tasks in simulation as well as a hardware manipulation task. We extend our approach for use with imitation learning methods, where experience is provided through demonstrations, and we test the expanded capability with a real-world pick-and-place task. The results show that our method is capable of improving the performance and sample efficiency of learning motor skills in a variety of experimental domains.
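The abstract describes the method only at a high level. As an illustration of the general idea of hybrid learning, and not the authors' exact derivation, the minimal Python sketch below switches between a model-based planner's action and a model-free policy's action at each step, in both a deterministic (thresholded) and a stochastic form loosely mirroring the two variants named above. The functions `model_plan`, `policy`, and `value_gain`, and the parameters `threshold` and `beta`, are hypothetical placeholders standing in for the paper's learned predictive model, experience-based policy, and switching criterion.

```python
import numpy as np

# Hypothetical components (assumed, not from the paper):
#   model_plan(state)            -> action proposed by the model-based planner
#   policy(state)                -> action proposed by the model-free policy
#   value_gain(s, a_mb, a_mf)    -> estimated improvement from overriding the
#                                   planned action with the policy action
#                                   (e.g., a Q-value gap)

def hybrid_action(state, model_plan, policy, value_gain, threshold=0.0):
    """Deterministic hybrid switching (illustrative only): execute the
    model-based plan by default, but let the model-free policy override
    it when the estimated gain exceeds a threshold."""
    a_mb = model_plan(state)
    a_mf = policy(state)
    if value_gain(state, a_mb, a_mf) > threshold:
        return a_mf  # experience-based action overrides the plan
    return a_mb      # otherwise execute the planned action

def stochastic_hybrid_action(state, model_plan, policy, value_gain,
                             beta=1.0, rng=None):
    """Stochastic variant (illustrative only): sample which modality to
    follow with probability increasing in the estimated gain."""
    rng = rng or np.random.default_rng()
    a_mb = model_plan(state)
    a_mf = policy(state)
    # Squash the gain into an override probability in [0, 1].
    p_override = 1.0 / (1.0 + np.exp(-beta * value_gain(state, a_mb, a_mf)))
    return a_mf if rng.random() < p_override else a_mb
```

A caller might, for example, define `value_gain` as a learned critic's advantage of the policy action over the planned action, `lambda s, a, b: q(s, b) - q(s, a)`; that choice is an assumption here, chosen only to make the sketch concrete.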
Pages: 337-355
Number of pages: 19
Related papers
50 records in total (items [41]-[50] shown)
  • [41] Yang, M., Lin, Y., Church, A., Lloyd, J., Zhang, D., Barton, D. A. W., Lepora, N. F. Sim-to-Real Model-Based and Model-Free Deep Reinforcement Learning for Tactile Pushing. IEEE Robotics and Automation Letters, 2023, 8(9): 5480-5487.
  • [42] McDannald, M. A., Lucantonio, F., Burke, K. A., Niv, Y., Schoenbaum, G. Ventral Striatum and Orbitofrontal Cortex Are Both Required for Model-Based, But Not Model-Free, Reinforcement Learning. Journal of Neuroscience, 2011, 31(7): 2700-2705.
  • [43] Hafez, M. B., Weber, C., Kerzel, M., Wermter, S. Curious Meta-Controller: Adaptive Alternation between Model-Based and Model-Free Control in Deep Reinforcement Learning. 2019 International Joint Conference on Neural Networks (IJCNN), 2019.
  • [44] Syafiie, S., Tadeo, F., Martinez, E., Alvarez, T. Model-free control based on reinforcement learning for a wastewater treatment problem. Applied Soft Computing, 2011, 11(1): 73-82.
  • [45] Wang, J. Y., Gao, W., Shan, S. G., Hu, X. P. Facial feature tracking combining model-based and model-free method. 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vol. III, 2003: 205-208.
  • [46] Wang, J. Y., Gao, W., Shan, S. G., Hu, X. P. Facial feature tracking combining model-based and model-free method. 2003 International Conference on Multimedia and Expo (ICME), Vol. III, 2003: 125-128.
  • [47] Cao, D., Zhao, J., Hu, W., Ding, F., Yu, N., Huang, Q., Chen, Z. Model-free voltage control of active distribution system with PVs using surrogate model-based deep reinforcement learning. Applied Energy, 2022, 306.
  • [48] Shah, H., Gopal, M. Model-Free Predictive Control of Nonlinear Processes Based on Reinforcement Learning. IFAC-PapersOnLine, 2016, 49(1): 89-94.
  • [49] Zhang, Q., Luo, R., Zhao, D., Luo, C., Qian, D. Model-Free Reinforcement Learning based Lateral Control for Lane Keeping. 2019 International Joint Conference on Neural Networks (IJCNN), 2019.
  • [50] Doody, M., Van Swieten, M. M. H., Manohar, S. G. Model-based learning retrospectively updates model-free values. Scientific Reports, 12.