Safety-Critical Optimal Control of Discrete-Time Non-Linear Systems via Policy Iteration-Based Q-Learning

Cited by: 0
Authors
Long, Lijun [1 ,2 ]
Liu, Xiaomei [1 ,2 ]
Huang, Xiaomin [1 ,2 ]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Shenyang, Peoples R China
[2] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang, Peoples R China
Keywords
control barrier functions; discrete-time systems; neural networks; Q-learning; safety-critical control;
DOI
10.1002/rnc.7809
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
This paper investigates the problem of safety-critical optimal control for discrete-time non-linear systems. A safety-critical control algorithm is developed based on Q-learning and an iterative adaptive dynamic programming scheme, namely policy iteration. Discrete-time control barrier functions (CBFs) are incorporated into the utility function to guarantee safety, and a novel definition of the safe set and its boundary in terms of multiple discrete-time CBFs is given. By using multiple discrete-time CBFs, the safety-critical optimal control problem with multiple safety objectives is also addressed for discrete-time systems. The safety, convergence, and stability of the developed algorithm are rigorously established, and an effective method for obtaining an initial safety-admissible control law is provided. The algorithm is implemented with an actor-critic structure built from neural networks. Finally, its effectiveness is illustrated by three simulation examples.
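To illustrate how a CBF term can enter the utility function of policy-iteration-based Q-learning, the following Python sketch runs the two policy-iteration steps (evaluation of the Q-function under the current policy, then greedy improvement) on a toy scalar system with tabular lookups. The dynamics f, the barrier function h, the penalty weight mu, the discount factor gamma, the grids, and the drift-cancelling initial policy are all illustrative assumptions and are not taken from the paper, whose algorithm is instead implemented with actor-critic neural networks.

```python
import numpy as np

# State and action grids for the tabular illustration (sizes are arbitrary).
X = np.linspace(-1.2, 1.2, 31)
U = np.linspace(-1.0, 1.0, 21)

def f(x, u):
    # Assumed scalar discrete-time dynamics x_{k+1} = 0.8*x_k + u_k.
    return 0.8 * x + u

def h(x):
    # Assumed discrete-time CBF; the safe set is {x : h(x) >= 0}.
    return 1.0 - x ** 2

def utility(x, u, mu=2.0):
    # Quadratic stage cost augmented with a barrier-type penalty on the
    # successor state, so that transitions toward the unsafe region become expensive.
    x_next = f(x, u)
    barrier = -np.log(np.clip(h(x_next), 1e-6, None))
    return x ** 2 + u ** 2 + mu * barrier

def nearest(grid, value):
    return int(np.argmin(np.abs(grid - value)))

# Assumed initial safety-admissible policy: cancel the drift, u = -0.8*x,
# projected onto the action grid.
policy = np.array([nearest(U, -0.8 * x) for x in X])
Q = np.zeros((X.size, U.size))
gamma = 0.95  # discount factor, used here only to keep the tabular sweeps contractive

for it in range(50):
    # Policy evaluation: sweep the fixed-point equation
    # Q(x, u) = U(x, u) + gamma * Q(f(x, u), pi(f(x, u))) under the current policy.
    for _ in range(100):
        for i, x in enumerate(X):
            for j, u in enumerate(U):
                k = nearest(X, f(x, u))
                Q[i, j] = utility(x, u) + gamma * Q[k, policy[k]]
    # Policy improvement: choose the cost-minimising action at every state.
    new_policy = Q.argmin(axis=1)
    if np.array_equal(new_policy, policy):
        break
    policy = new_policy

print(f"stopped after {it + 1} policy-iteration steps")
```

In the paper's setting, the greedy minimisation would be carried out by an actor network and the Q-function by a critic network; the tabular grids above only stand in for those approximators.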
Pages: 19
Related Papers
50 records in total
  • [1] Generalized Policy Iteration-based Reinforcement Learning Algorithm for Optimal Control of Unknown Discrete-time Systems
    Lin, Mingduo
    Zhao, Bo
    Liu, Derong
    Liu, Xi
    Luo, Fangchao
    PROCEEDINGS OF THE 33RD CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2021), 2021: 3650 - 3655
  • [2] Online Adaptive Optimal Control of Discrete-time Linear Systems via Synchronous Q-learning
    Li, Xinxing
    Wang, Xueyuan
    Zha, Wenzhong
    2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021: 2024 - 2029
  • [3] A novel policy iteration based deterministic Q-learning for discrete-time nonlinear systems
    Wei QingLai
    Liu DeRong
    SCIENCE CHINA-INFORMATION SCIENCES, 2015, 58 (12) : 1 - 15
  • [5] Bias-Policy Iteration-Based Adaptive Dynamic Programming for Optimal Control of Discrete-Time Nonlinear Systems
    Jiang, Huaiyuan
    Li, Xiang
    Zhou, Bin
    Cao, Xibin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2024,
  • [6] Reinforcement Q-learning for optimal tracking control of linear discrete-time systems with unknown dynamics
    Kiumarsi, Bahare
    Lewis, Frank L.
    Modares, Hamidreza
    Karimpour, Ali
    Naghibi-Sistani, Mohammad-Bagher
    AUTOMATICA, 2014, 50 (04) : 1167 - 1175
  • [7] Optimal control for discrete-time affine non-linear systems using general value iteration
    Li, H.
    Liu, D.
    IET CONTROL THEORY AND APPLICATIONS, 2012, 6 (18): 2725 - 2736
  • [8] Discrete-Time Optimal Control Scheme Based on Q-Learning Algorithm
    Wei, Qinglai
    Liu, Derong
    Song, Ruizhuo
    2016 SEVENTH INTERNATIONAL CONFERENCE ON INTELLIGENT CONTROL AND INFORMATION PROCESSING (ICICIP), 2016: 125 - 130
  • [9] Stochastic linear quadratic optimal tracking control for discrete-time systems with delays based on Q-learning algorithm
    Tan, Xufeng
    Li, Yuan
    Liu, Yang
    AIMS MATHEMATICS, 2023, 8 (05): 10249 - 10265
  • [10] Optimal discrete-time control for non-linear cascade systems
    Haddad, WM
    Chellaboina, VS
    Fausz, JL
    Abdallah, C
    JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 1998, 335B (05): 827 - 839