Hybrid control for combining model-based and model-free reinforcement learning

Cited by: 10
Authors
Pinosky, Allison [1]
Abraham, Ian [2]
Broad, Alexander [3]
Argall, Brenna [1, 4]
Murphey, Todd D. [1]
Affiliations
[1] Northwestern Univ, Dept Mech Engn, 633 Clark St, Evanston, IL 60208 USA
[2] Yale Univ, Dept Mech Engn & Mat Sci, New Haven, CT USA
[3] Boston Dynam, Waltham, MA USA
[4] Northwestern Univ, Dept Elect Engn & Comp Sci, Evanston, IL 60208 USA
Source
Funding
National Science Foundation (United States);
Keywords
Reinforcement learning; learning theory; optimal control; hybrid control; DYNAMICS;
DOI
10.1177/02783649221083331
Chinese Library Classification number
TP24 [Robotics];
Discipline classification code
080202 ; 1405 ;
Abstract
We develop an approach to improve the learning capabilities of robotic systems by combining learned predictive models with experience-based state-action policy mappings. Predictive models provide an understanding of the task and the dynamics, while experience-based (model-free) policy mappings encode favorable actions that override planned actions. We refer to our approach of systematically combining model-based and model-free learning methods as hybrid learning. Our approach efficiently learns motor skills and improves the performance of predictive models and experience-based policies. Moreover, our approach enables policies (both model-based and model-free) to be updated using any off-policy reinforcement learning method. We derive a deterministic method of hybrid learning by optimally switching between learning modalities. We adapt our method to a stochastic variation that relaxes some of the key assumptions in the original derivation. Our deterministic and stochastic variations are tested on a variety of robot control benchmark tasks in simulation as well as a hardware manipulation task. We extend our approach for use with imitation learning methods, where experience is provided through demonstrations, and we test the expanded capability with a real-world pick-and-place task. The results show that our method is capable of improving the performance and sample efficiency of learning motor skills in a variety of experimental domains.
Pages: 337-355
Page count: 19
Related papers
50 in total
  • [21] Model-based and model-free learning strategies for wet clutch control
    Dutta, Abhishek
    Zhong, Yu
    Depraetere, Bruno
    Van Vaerenbergh, Kevin
    Ionescu, Clara
    Wyns, Bart
    Pinte, Gregory
    Nowe, Ann
    Swevers, Jan
    De Keyser, Robin
    [J]. MECHATRONICS, 2014, 24 (08) : 1008 - 1020
  • [22] Model-Free Quantum Control with Reinforcement Learning
    Sivak, V. V.
    Eickbusch, A.
    Liu, H.
    Royer, B.
    Tsioutsios, I.
    Devoret, M. H.
    [J]. PHYSICAL REVIEW X, 2022, 12 (01)
  • [23] Model-Free Control for Soft Manipulators based on Reinforcement Learning
    You, Xuanke
    Zhang, Yixiao
    Chen, Xiaotong
    Liu, Xinghua
    Wang, Zhanchi
    Jiang, Hao
    Chen, Xiaoping
    [J]. 2017 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2017 : 2909 - 2915
  • [24] Model-Free Emergency Frequency Control Based on Reinforcement Learning
    Chen, Chunyu
    Cui, Mingjian
    Li, Fangxing
    Yin, Shengfei
    Wang, Xinan
    [J]. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2021, 17 (04) : 2336 - 2346
  • [25] Predictive representations can link model-based reinforcement learning to model-free mechanisms
    Russek, Evan M.
    Momennejad, Ida
    Botvinick, Matthew M.
    Gershman, Samuel J.
    Daw, Nathaniel D.
    [J]. PLOS COMPUTATIONAL BIOLOGY, 2017, 13 (09)
  • [26] Extraversion differentiates between model-based and model-free strategies in a reinforcement learning task
    Skatova, Anya
    Chan, Patricia A.
    Daw, Nathaniel D.
    [J]. FRONTIERS IN HUMAN NEUROSCIENCE, 2013, 7
  • [27] Model-free Control for Stratospheric Airship Based on Reinforcement Learning
    Nie, Chunyu
    Zhu, Ming
    Zheng, Zewei
    Wu, Zhe
    [J]. PROCEEDINGS OF THE 35TH CHINESE CONTROL CONFERENCE 2016, 2016 : 10702 - 10707
  • [28] Dyna-style Model-based reinforcement learning with Model-Free Policy Optimization
    Dong, Kun
    Luo, Yongle
    Wang, Yuxin
    Liu, Yu
    Qu, Chengeng
    Zhang, Qiang
    Cheng, Erkang
    Sun, Zhiyong
    Song, Bo
    [J]. KNOWLEDGE-BASED SYSTEMS, 2024, 287
  • [29] The modulation of acute stress on model-free and model-based reinforcement learning in gambling disorder
    Wyckmans, Florent
    Banerjee, Nilosmita
    Saeremans, Melanie
    Otto, Ross
    Kornreich, Charles
    Vanderijst, Laetitia
    Gruson, Damien
    Carbone, Vincenzo
    Bechara, Antoine
    Buchanan, Tony
    Noel, Xavier
    [J]. JOURNAL OF BEHAVIORAL ADDICTIONS, 2022, 11 (03) : 831 - 844
  • [30] Fault Tolerant Control combining Reinforcement Learning and Model-based Control
    Bhan, Luke
    Quinones-Grueiro, Marcos
    Biswas, Gautam
    [J]. 5TH CONFERENCE ON CONTROL AND FAULT-TOLERANT SYSTEMS (SYSTOL 2021), 2021 : 31 - 36