An Online Training Method for Augmenting MPC with Deep Reinforcement Learning

Cited by: 10
Authors
Bellegarda, Guillaume [1 ]
Byl, Katie [1 ]
Affiliation
[1] Univ Calif Santa Barbara UCSB, Robot Lab, Dept Elect & Comp Engn, Santa Barbara, CA 93106 USA
Keywords
MODEL;
DOI
10.1109/IROS45743.2020.9341021
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Recent breakthroughs in both reinforcement learning and trajectory optimization have made significant advances toward real-world robotic system deployment. Reinforcement learning (RL) can be applied to many problems without any modeling of, or intuition about, the system, at the cost of high sample complexity and the inability to prove metrics about the learned policies. Trajectory optimization (TO), on the other hand, allows stability and robustness analyses on generated motions and trajectories, but is only as good as the often over-simplified derived model, and may have prohibitively expensive computation times for real-time control, for example in contact-rich environments. This paper seeks to combine the benefits of these two areas while mitigating their drawbacks by (1) decreasing RL sample complexity by exploiting existing knowledge of the problem through real-time optimal control, and (2) allowing online policy deployment at any point in the training process by using the TO (MPC) solution as a baseline or worst-case action, while continuously improving the combined learned-optimized policy with deep RL. The method is evaluated on the task of navigating a car model to a series of goal destinations over slippery terrain as fast as possible, where drifting allows the system to change direction quickly while maintaining high speed.
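The structure the abstract describes, an MPC action serving as the safe baseline at every control step, with a learned correction added on top and improved online, can be sketched in miniature. Everything below is illustrative: the proportional `mpc_baseline`, the toy first-order dynamics, and the finite-difference update are hypothetical stand-ins, not the authors' controller, dynamics model, or deep RL algorithm. The key property the sketch preserves is that at zero learned gain the system behaves exactly like the MPC baseline, so the policy is deployable at any point in training.

```python
def mpc_baseline(state, goal):
    """Stand-in for the MPC/TO controller: a simple proportional
    action toward the goal (hypothetical, not the authors' solver)."""
    return 0.5 * (goal - state)


class ResidualPolicy:
    """Toy learned correction added on top of the MPC action."""

    def __init__(self):
        self.gain = 0.0  # single learnable parameter; 0 means "pure MPC"

    def correction(self, state, goal):
        return self.gain * (goal - state)


def step(state, action):
    """Toy first-order dynamics: the plant responds sluggishly, so the
    MPC baseline alone converges slowly and leaves room for improvement."""
    return state + 0.5 * action


def rollout(policy, goal=1.0, horizon=20):
    """Run one episode; the applied action is always MPC + learned residual,
    so the worst case (gain = 0) is exactly the MPC baseline behavior."""
    state, cost = 0.0, 0.0
    for _ in range(horizon):
        action = mpc_baseline(state, goal) + policy.correction(state, goal)
        state = step(state, action)
        cost += (goal - state) ** 2
    return cost


def train(policy, iters=50, eps=1e-3, lr=0.05):
    """Improve the residual online via finite-difference gradient descent
    on episode cost (a crude stand-in for the deep RL update)."""
    for _ in range(iters):
        g0 = policy.gain
        policy.gain = g0 + eps
        c_plus = rollout(policy)
        policy.gain = g0 - eps
        c_minus = rollout(policy)
        policy.gain = g0 - lr * (c_plus - c_minus) / (2 * eps)
    return policy
```

Because the learned term is purely additive, the combined policy can be deployed before, during, or after training: an untrained `ResidualPolicy` simply reproduces the MPC baseline, and each update only tightens tracking on top of it.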
Pages: 5453 - 5459
Page count: 7