Energy-Based Continuous Inverse Optimal Control

Cited by: 2
Authors
Xu, Yifei [1]
Xie, Jianwen [2]
Zhao, Tianyang [1]
Baker, Chris [3]
Zhao, Yibiao [3]
Wu, Ying Nian [1]
Affiliations
[1] Univ Calif Los Angeles, Dept Stat, Los Angeles, CA 90095 USA
[2] Baidu Res, Cognit Comp Lab, Bellevue, WA 98004 USA
[3] iSee Inc, Cambridge, MA 02139 USA
Keywords
Trajectory; Cost function; Optimal control; Heuristic algorithms; Generators; Autonomous vehicles; Maximum likelihood estimation; Cooperative learning; energy-based models (EBMs); inverse optimal control (IOC); Langevin dynamics; 3D SHAPE SYNTHESIS; MODELS; NETWORKS; FRAME;
DOI
10.1109/TNNLS.2022.3168795
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The problem of continuous inverse optimal control (over a finite time horizon) is to learn the unknown cost function over a sequence of continuous control variables from expert demonstrations. In this article, we study this fundamental problem in the framework of the energy-based model (EBM), where the observed expert trajectories are assumed to be random samples from a probability density function defined as the exponential of the negative cost function up to a normalizing constant. The parameters of the cost function are learned by maximum likelihood via an "analysis by synthesis" scheme, which iterates: 1) synthesis step: sample the synthesized trajectories from the current probability density using Langevin dynamics via backpropagation through time and 2) analysis step: update the model parameters based on the statistical difference between the synthesized trajectories and the observed trajectories. Because an efficient optimization algorithm is usually available for an optimal control problem, we also consider a convenient approximation of the above learning method, in which the sampling in the synthesis step is replaced by optimization. Moreover, to make the sampling or optimization more efficient, we propose to train the EBM simultaneously with a top-down trajectory generator via cooperative learning, where the trajectory generator is used to quickly initialize the synthesis step of the EBM. We demonstrate the proposed methods on autonomous driving tasks and show that they can learn suitable cost functions for optimal control.
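The synthesis and analysis steps described in the abstract can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and is not taken from the article: the cost is a hypothetical linear combination of two hand-crafted features (control effort and control smoothness), there is no vehicle dynamics model, and the gradient with respect to the controls is written analytically instead of being obtained by backpropagation through time. The names `features`, `cost_grad_u`, `langevin_sample`, and `learn_cost` are hypothetical.

```python
import numpy as np

def features(u):
    """Hypothetical features of control sequences u with shape (n, T)."""
    effort = np.sum(u ** 2, axis=1)                      # total control effort
    roughness = np.sum(np.diff(u, axis=1) ** 2, axis=1)  # lack of smoothness
    return np.stack([effort, roughness], axis=1)         # shape (n, 2)

def cost_grad_u(u, theta):
    """Gradient of C_theta(u) = theta . features(u) with respect to u."""
    d = np.diff(u, axis=1)
    g_rough = np.zeros_like(u)
    g_rough[:, :-1] -= 2.0 * d   # each difference d[t] depends on u[t] ...
    g_rough[:, 1:] += 2.0 * d    # ... and on u[t + 1]
    return 2.0 * theta[0] * u + theta[1] * g_rough

def langevin_sample(u0, theta, rng, n_steps=100, step=0.1):
    """Synthesis step: short-run Langevin dynamics on p(u) ~ exp(-C_theta(u))."""
    u = u0.copy()
    for _ in range(n_steps):
        noise = rng.standard_normal(u.shape)
        u = u - 0.5 * step ** 2 * cost_grad_u(u, theta) + step * noise
    return u

def learn_cost(u_obs, n_iters=200, lr=0.01, seed=0):
    """Alternate synthesis and analysis steps to fit theta by maximum likelihood."""
    rng = np.random.default_rng(seed)
    theta = np.array([0.5, 0.5])  # assumed initial guess
    for _ in range(n_iters):
        # Synthesis: sample trajectories from the current model.
        u_init = rng.standard_normal(u_obs.shape)
        u_syn = langevin_sample(u_init, theta, rng)
        # Analysis: the log-likelihood gradient is the difference between
        # synthesized and observed mean features for a linear cost.
        grad = features(u_syn).mean(axis=0) - features(u_obs).mean(axis=0)
        # Keep theta positive so the toy density stays normalizable.
        theta = np.maximum(theta + lr * grad, 1e-3)
    return theta
```

In this toy setting, the update raises the cost weight on any feature that synthesized trajectories exhibit more strongly than the observed ones. The approximation mentioned in the abstract would correspond to dropping the noise term in `langevin_sample`, which turns the sampler into plain gradient descent on the cost.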
Pages: 10563 - 10577
Page count: 15
Related papers
50 in total
  • [1] Optimal distributed control of renewable energy-based microgrid–an energy management approach
    Nagaraja Y.
    Devaraju T.
    Vijaykumar M.
    International Journal of Ambient Energy, 2021, 42 (14) : 1635 - 1642
  • [2] Energy-based swing-back control for continuous brachiation of a multilocomotion robot
    Kajima, Hideki
    Hasegawa, Yasuhisa
    Doi, Masahiro
    Fukuda, Toshio
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2006, 21 (09) : 1025 - 1043
  • [3] Learning the Optimal Energy-based Control Strategy for Port-Hamiltonian Systems
    Zanella, Riccardo
    Macchelli, Alessandro
    Stramigioli, Stefano
    IFAC PAPERSONLINE, 2024, 58 (06) : 208 - 213
  • [4] An Energy-Based Optimal Control Problem for Unmanned Aircraft Systems Flight Planning
    Liu, Zhilong
    Kurzhanskiy, Alex
    Sengupta, Raja
    2017 56TH ANNUAL CONFERENCE OF THE SOCIETY OF INSTRUMENT AND CONTROL ENGINEERS OF JAPAN (SICE), 2017, : 1320 - 1325
  • [5] Optimal Motion Planning and Energy-Based Control of a Single Mast Stacker Crane
    Rams, Hubert
    Schoeberl, Markus
    Schlacher, Kurt
    IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, 2018, 26 (04) : 1449 - 1457
  • [6] Energy-Based Optimal Step Planning for Humanoids
    Huang, Weiwei
    Kim, Junggon
    Atkeson, Christopher G.
    2013 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2013, : 3124 - 3129
  • [7] Optimal Control and Inverse Optimal Control with Continuous Updating for Human Behavior Modeling
    Petrosian, Ovanes
    Inga, Jairo
    Kuchkarov, Ildus
    Flad, Michael
    Hohmann, Soeren
    IFAC PAPERSONLINE, 2020, 53 (02) : 6670 - 6677
  • [8] Energy-based nonlinear control of Pendubot
    Fu, Xuedong
    Pei, Hailong
    Wu, Guozhao
    Jiqiren/Robot, 2000, 22 (06) : 451 - 456
  • [9] Optimal Multivariable MMC Energy-Based Control for DC Voltage Regulation in HVDC Applications
    Prieto-Araujo, Eduardo
    Gross, Dominic
    Doerfler, Florian
    Gomis-Bellmunt, Oriol
    2020 IEEE POWER & ENERGY SOCIETY GENERAL MEETING (PESGM), 2020,
  • [10] Energy-based comparative analysis of optimal active control schemes for clustered tensegrity structures
    Feng, Xiaodong
    Ou, Yaowen
    Miah, Mohammad S.
    STRUCTURAL CONTROL & HEALTH MONITORING, 2018, 25 (10):