A projected primal-dual gradient optimal control method for deep reinforcement learning

Authors
Simon Gottschalk
Michael Burger
Matthias Gerdts
Affiliations
[1] Fraunhofer ITWM
[2] Universität der Bundeswehr
Keywords
Reinforcement learning; Optimal control; Necessary optimality conditions; 49K15; 90C40; 93E35
Abstract
In this contribution, we start from a policy-based Reinforcement Learning ansatz using neural networks. The underlying Markov Decision Process consists of a transition probability representing the dynamical system and a policy realized by a neural network that maps the current state to the parameters of a distribution, from which the next control can be sampled. In this setting, the neural network is replaced by an ODE, based on a recently discussed interpretation of neural networks. The resulting infinite-dimensional optimization problem is transformed into a problem similar to well-known optimal control problems. Afterwards, the necessary optimality conditions are established, and from these a new numerical algorithm is derived. The operating principle is demonstrated with two examples. In the first, a moving point is steered through an obstacle course to a desired end position in a 2D plane. The second example shows the applicability to more complex problems: there, the aim is to control the fingertip of a human arm model with five degrees of freedom and 29 Hill's muscle models to a desired end position.
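The core idea the abstract describes — replacing the policy network by an ODE whose terminal state yields the parameters of a sampling distribution — can be illustrated with a minimal sketch. This is not the paper's code: the right-hand side `rhs`, the forward-Euler integrator, and the split of the terminal state into mean and log-standard-deviation are illustrative assumptions.

```python
import numpy as np

def rhs(z, W, b):
    """Right-hand side of the policy ODE: a simple tanh layer (assumed form)."""
    return np.tanh(W @ z + b)

def ode_policy(s, W, b, T=1.0, steps=20):
    """Integrate z'(t) = rhs(z, W, b) from z(0) = s to z(T) with forward Euler.

    The terminal state z(T) is split into the mean and log-std of a Gaussian,
    mimicking a network that maps the state to distribution parameters.
    """
    z = s.copy()
    h = T / steps
    for _ in range(steps):
        z = z + h * rhs(z, W, b)
    mean, log_std = z[: len(z) // 2], z[len(z) // 2:]
    return mean, np.exp(log_std)

rng = np.random.default_rng(0)
dim = 4                                   # ODE state dimension (illustrative)
W = 0.1 * rng.standard_normal((dim, dim)) # ODE parameters playing the role of weights
b = np.zeros(dim)
s = np.array([0.5, -0.2, 0.0, 0.0])       # current MDP state, padded to dim

mean, std = ode_policy(s, W, b)
control = rng.normal(mean, std)           # sample the next control from the policy
print(mean.shape, std.shape, control.shape)
```

In the paper, the parameters of this ODE would then be optimized via the derived necessary optimality conditions rather than by backpropagation through the discretized flow.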