A reinforcement learning method with closed-loop stability guarantee

Cited by: 5
Authors
Osinenko, Pavel [1]
Beckenbach, Lukas [1]
Goehrt, Thomas [1]
Streif, Stefan [1]
Affiliations
[1] Tech Univ Chemnitz, Automat Control & Syst Dynam Lab, Chemnitz, Germany
Source
IFAC PAPERSONLINE | 2020, Vol. 53, Issue 02
Keywords
Reinforcement learning control; Stability of nonlinear systems; Lyapunov methods; Iteration
DOI
10.1016/j.ifacol.2020.12.2237
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Reinforcement learning (RL) in the context of control systems offers broad possibilities for controller adaptation. Given an infinite-horizon cost function, the so-called critic of RL approximates it with a neural net and passes this information to the controller (called the "actor"). However, the issue of closed-loop stability under an RL method is still not fully addressed. Since the critic delivers merely an approximation to the value function of the corresponding infinite-horizon problem, no guarantee can be given in general as to whether the actor's actions stabilize the system. Different approaches to this issue exist. The current work offers a particular one which, starting with a (not necessarily smooth) control Lyapunov function (CLF), derives an online RL scheme in such a way that a practical semi-global stability property of the closed loop can be established. The approach logically continues the authors' work on parameterized controllers and Lyapunov-like constraints for RL, whereas the CLF now appears merely in one of the constraints of the control scheme. The analysis of the closed-loop behavior is done in a sample-and-hold (SH) manner, thus offering a certain insight into the digital realization. A case study with a non-holonomic integrator demonstrates the capabilities of the derived method to optimize the given cost function compared to a nominal stabilizing controller. Copyright (C) 2020 The Authors.
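The abstract describes a specific structure: a critic approximating the infinite-horizon cost, an actor whose inputs are filtered through a CLF decrease constraint, and a sample-and-hold closed loop evaluated on a non-holonomic integrator. The sketch below illustrates that structure under stated assumptions; it is not the authors' exact construction. The quadratic-feature critic (standing in for the neural net), the grid-sampled action set, the particular CLF, the decrease-rate constant, the TD-style critic update, and the fallback input are all illustrative choices.

```python
# Minimal, hypothetical sketch of a CLF-constrained actor-critic loop in
# sample-and-hold fashion, loosely following the abstract's description.
import numpy as np

def dynamics(x, u):
    """Non-holonomic (Brockett) integrator, as in the paper's case study."""
    return np.array([u[0], u[1], x[0] * u[1] - x[1] * u[0]])

def stage_cost(x, u):
    """Illustrative quadratic running cost (assumption)."""
    return x @ x + 0.1 * (u @ u)

def clf(x):
    """A nonsmooth control Lyapunov function candidate (assumption)."""
    return x[0] ** 2 + x[1] ** 2 + abs(x[2])

def features(x):
    """Critic features: quadratic-plus-|x3| basis standing in for the
    neural net mentioned in the abstract (assumption)."""
    return np.array([x[0] ** 2, x[1] ** 2, x[2] ** 2, x[0] * x[1], abs(x[2])])

def critic_value(w, x):
    """Critic's approximation of the infinite-horizon cost-to-go."""
    return w @ features(x)

def actor(w, x, dt, candidates):
    """Actor: among sampled constant (held) inputs, pick the one minimizing
    one-step cost plus the critic's estimate, subject to a CLF decrease
    constraint on the Euler-predicted next state."""
    best_u, best_val = None, np.inf
    for u in candidates:
        x_next = x + dt * dynamics(x, u)                # sample-and-hold step
        if clf(x_next) >= clf(x) - 0.1 * dt * (x @ x):  # required CLF decrease
            continue                                    # reject this input
        val = dt * stage_cost(x, u) + critic_value(w, x_next)
        if val < best_val:
            best_u, best_val = u, val
    # fall back to a simple damping input if no candidate passes; this is a
    # placeholder for the nominal stabilizing controller the paper mentions
    return best_u if best_u is not None else -x[:2]

def critic_update(w, x, u, x_next, dt, lr=1e-2):
    """One semi-gradient TD(0) step on the critic weights (assumption)."""
    td_err = dt * stage_cost(x, u) + critic_value(w, x_next) - critic_value(w, x)
    return w + lr * td_err * features(x)

# Closed-loop sample-and-hold simulation (illustrative parameters).
dt, w = 0.05, np.zeros(5)
x = np.array([1.0, -1.0, 0.5])
grid = np.linspace(-2.0, 2.0, 9)
candidates = [np.array([a, b]) for a in grid for b in grid]
for _ in range(400):
    u = actor(w, x, dt, candidates)
    x_next = x + dt * dynamics(x, u)
    w = critic_update(w, x, u, x_next, dt)
    x = x_next
print("final state:", x, "CLF value:", clf(x))
```

The constraint inside `actor` plays the role of the Lyapunov-like condition the abstract refers to: any input the actor returns is certified to decrease the CLF over the held sampling interval, regardless of the critic's approximation error, which is what yields a practical (neighborhood-of-the-origin) stability property rather than exact convergence.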
Pages: 8043-8048
Number of pages: 6