An Online Training Method for Augmenting MPC with Deep Reinforcement Learning

Cited by: 10
Authors
Bellegarda, Guillaume [1 ]
Byl, Katie [1 ]
Affiliation
[1] Univ Calif Santa Barbara UCSB, Robot Lab, Dept Elect & Comp Engn, Santa Barbara, CA 93106 USA
Keywords
MODEL;
DOI
10.1109/IROS45743.2020.9341021
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Recent breakthroughs in both reinforcement learning and trajectory optimization have made significant advances toward real-world robotic system deployment. Reinforcement learning (RL) can be applied to many problems without any modeling of, or intuition about, the system, at the cost of high sample complexity and the inability to prove metrics about the learned policies. Trajectory optimization (TO), on the other hand, allows stability and robustness analyses on generated motions and trajectories, but is only as good as the often over-simplified derived model, and may have prohibitively expensive computation times for real-time control, for example in contact-rich environments. This paper seeks to combine the benefits of these two areas while mitigating their drawbacks by (1) decreasing RL sample complexity by exploiting existing knowledge of the problem through real-time optimal control, and (2) allowing online policy deployment at any point in the training process by using the TO (MPC) solution as a baseline or worst-case action, while continuously improving the combined learned-optimized policy with deep RL. The method is evaluated on the task of navigating a car model to a series of goal destinations over slippery terrain as fast as possible, where drifting allows the system to change direction quickly while maintaining high speed.
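The structure the abstract describes, an MPC action serving as the safe baseline at every control step, with a learned correction added on top and improved online, can be sketched in miniature. Everything below is illustrative: the proportional `mpc_baseline`, the toy first-order dynamics, and the finite-difference update are hypothetical stand-ins, not the authors' controller, dynamics model, or deep RL algorithm. The key property the sketch preserves is that at zero learned gain the system behaves exactly like the MPC baseline, so the policy is deployable at any point in training.

```python
def mpc_baseline(state, goal):
    """Stand-in for the MPC/TO controller: a simple proportional
    action toward the goal (hypothetical, not the authors' solver)."""
    return 0.5 * (goal - state)


class ResidualPolicy:
    """Toy learned correction added on top of the MPC action."""

    def __init__(self):
        self.gain = 0.0  # single learnable parameter; 0 means "pure MPC"

    def correction(self, state, goal):
        return self.gain * (goal - state)


def step(state, action):
    """Toy first-order dynamics: the plant responds sluggishly, so the
    MPC baseline alone converges slowly and leaves room for improvement."""
    return state + 0.5 * action


def rollout(policy, goal=1.0, horizon=20):
    """Run one episode; the applied action is always MPC + learned residual,
    so the worst case (gain = 0) is exactly the MPC baseline behavior."""
    state, cost = 0.0, 0.0
    for _ in range(horizon):
        action = mpc_baseline(state, goal) + policy.correction(state, goal)
        state = step(state, action)
        cost += (goal - state) ** 2
    return cost


def train(policy, iters=50, eps=1e-3, lr=0.05):
    """Improve the residual online via finite-difference gradient descent
    on episode cost (a crude stand-in for the deep RL update)."""
    for _ in range(iters):
        g0 = policy.gain
        policy.gain = g0 + eps
        c_plus = rollout(policy)
        policy.gain = g0 - eps
        c_minus = rollout(policy)
        policy.gain = g0 - lr * (c_plus - c_minus) / (2 * eps)
    return policy
```

Because the learned term is purely additive, the combined policy can be deployed before, during, or after training: an untrained `ResidualPolicy` simply reproduces the MPC baseline, and each update only tightens tracking on top of it.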
Pages: 5453 - 5459
Page count: 7