Multiple Actor-Critic Structures for Continuous-Time Optimal Control Using Input-Output Data

Cited by: 124
Authors
Song, Ruizhuo [1]
Lewis, Frank [2, 3]
Wei, Qinglai [4]
Zhang, Hua-Guang [5]
Jiang, Zhong-Ping [6]
Levine, Dan [7]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
[2] Univ Texas Arlington, UTA Res Inst, Ft Worth, TX USA
[3] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110004, Peoples R China
[4] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
[5] Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110004, Peoples R China
[6] NYU, Polytech Sch Engn, Dept Elect & Comp Engn, Brooklyn, NY 11201 USA
[7] Univ Texas Arlington, Dept Psychol, Arlington, TX 76019 USA
Funding
National Natural Science Foundation of China; US National Science Foundation; Beijing Natural Science Foundation;
Keywords
Actor-critic; approximate dynamic programming (ADP); category; optimal control; shunting inhibitory artificial neural network (SIANN); MULTIOBJECTIVE OPTIMAL-CONTROL; DYNAMIC-PROGRAMMING ALGORITHM; UNKNOWN NONLINEAR-SYSTEMS; OPTIMAL TRACKING CONTROL; OPTIMAL-CONTROL SCHEME; ZERO-SUM GAMES; ADAPTIVE-CONTROL; FEEDBACK-CONTROL; EMOTIONAL INFLUENCES; LEARNING ALGORITHM;
DOI
10.1109/TNNLS.2015.2399020
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In industrial process control, there may be multiple performance objectives, depending on salient features of the input-output data. To address this situation, this paper proposes multiple actor-critic structures that obtain the optimal control from input-output data for unknown nonlinear systems. A shunting inhibitory artificial neural network (SIANN) classifies the input-output data into one of several categories, and a different performance measure function may be defined for each category. An approximate dynamic programming algorithm, comprising a model module, a critic network, and an action network, is used to establish the optimal control in each category. A recurrent neural network (RNN) model reconstructs the unknown system dynamics from input-output data, and neural networks (NNs) are used to implement the critic and action networks. It is proven that the model error and the closed-loop unknown system are uniformly ultimately bounded. Simulation results demonstrate the performance of the proposed optimal control scheme for the unknown nonlinear system.
Pages: 851-865
Page count: 15
Related papers
50 in total
  • [1] Optimal Control of Affine Nonlinear Continuous-time Systems Using Online Actor-Critic Algorithm
    Chen Xue-song
    Yang Ming-sheng
    Liu Fu-chun
    2013 32ND CHINESE CONTROL CONFERENCE (CCC), 2013, : 2891 - 2894
  • [2] Relaxed Actor-Critic With Convergence Guarantees for Continuous-Time Optimal Control of Nonlinear Systems
    Duan, Jingliang
    Li, Jie
    Ge, Qiang
    Li, Shengbo Eben
    Bujarbaruah, Monimoy
    Ma, Fei
    Zhang, Dezhao
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2023, 8 (05): : 3299 - 3311
  • [3] Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem
    Vamvoudakis, Kyriakos G.
    Lewis, Frank L.
    AUTOMATICA, 2010, 46 (05) : 878 - 888
  • [5] Optimal Consensus Control for Continuous-time Multi-agent Systems via Actor-Critic Neural Networks
    Jia, Xiao
    Wolter, Katinka
    2022 8TH INTERNATIONAL CONFERENCE ON AUTOMATION, ROBOTICS AND APPLICATIONS (ICARA 2022), 2022, : 191 - 195
  • [6] Online actor-critic algorithm to solve the approximate optimal adaptive control of continuous-time system with state delay
    Wu, Yange
    Wei, Jumei
    Zhu, Xunlin
    2022 34TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2022, : 2995 - 3000
  • [7] Continuous-Time Input-Output Linear Dynamic System Identification using Sampled Data
    Figwer, Jaroslaw
    2015 20TH INTERNATIONAL CONFERENCE ON METHODS AND MODELS IN AUTOMATION AND ROBOTICS (MMAR), 2015, : 712 - 717
  • [8] Continuous-time input-output decoupling for sampled-data systems
    Grasselli, OM
    Menini, L
    KYBERNETIKA, 1999, 35 (06) : 721 - 735
  • [9] Robust Actor-Critic Learning for Continuous-Time Nonlinear Systems With Unmodeled Dynamics
    Yang, Yongliang
    Gao, Weinan
    Modares, Hamidreza
    Xu, Cheng-Zhong
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2022, 30 (06) : 2101 - 2112
  • [10] Input-output identifiability of continuous-time linear systems
    Harrison, KJ
    Partington, JR
    Ward, JA
    JOURNAL OF COMPLEXITY, 2002, 18 (01) : 210 - 223