Optimal Synchronization Control of Multiagent Systems With Input Saturation via Off-Policy Reinforcement Learning

Cited by: 102
Authors
Qin, Jiahu [1 ]
Li, Man [1 ]
Shi, Yang [2 ]
Ma, Qichao [1 ]
Zheng, Wei Xing [3 ]
Affiliations
[1] Univ Sci & Technol China, Dept Automat, Hefei 230027, Anhui, Peoples R China
[2] Univ Victoria, Dept Mech Engn, Victoria, BC V8W 2Y2, Canada
[3] Western Sydney Univ, Sch Comp Engn & Math, Sydney, NSW 2751, Australia
Funding
Australian Research Council; National Natural Science Foundation of China;
Keywords
Input saturation; multiagent systems; neural networks (NNs); off-policy reinforcement learning (RL); optimal synchronization control; LINEAR-SYSTEMS; NONLINEAR-SYSTEMS; NETWORKS; GAMES;
DOI
10.1109/TNNLS.2018.2832025
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we investigate the optimal synchronization problem for a group of generic linear systems with input saturation. To seek the optimal controller, Hamilton-Jacobi-Bellman (HJB) equations involving nonquadratic input energy terms in coupled forms are established. The solutions to these coupled HJB equations are further proven to be optimal, and the induced controllers constitute an interactive Nash equilibrium. Due to the difficulty of analytically solving HJB equations, especially in coupled forms, and the possible lack of model information of the systems, we apply a data-based off-policy reinforcement learning algorithm to learn the optimal control policies. A byproduct of this off-policy algorithm is that it is shown to be insensitive to the probing noise that is exerted on the system to maintain the persistence of excitation condition. In order to implement this off-policy algorithm, we employ actor and critic neural networks to approximate the controllers and the cost functions. Furthermore, the estimated control policies obtained by the presented implementation are proven to converge to the optimal ones under certain conditions. Finally, an illustrative example is provided to verify the effectiveness of the proposed algorithm.
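The core idea of data-based off-policy learning described above can be illustrated with a deliberately simplified sketch: off-policy Q-learning-style policy iteration for a single scalar linear system with quadratic cost, where data are collected once under a fixed behavior policy plus probing noise, and the target policy is evaluated and improved from that same batch. This is NOT the paper's algorithm (which handles multiagent synchronization, input saturation via nonquadratic cost terms, coupled HJB equations, and NN approximation); all names and constants here are illustrative assumptions.

```python
import numpy as np

# Toy setting: x_{k+1} = a*x_k + b*u_k, stage cost q*x^2 + r*u^2.
rng = np.random.default_rng(0)
a, b, q, r = 0.9, 1.0, 1.0, 1.0

# Collect off-policy data under a fixed behavior policy plus probing noise
# (the noise keeps the data persistently exciting).
K_behavior = 0.2
X, U, Xn = [], [], []
x = 1.0
for _ in range(200):
    u = -K_behavior * x + rng.normal(scale=0.5)   # probing noise
    xn = a * x + b * u
    X.append(x); U.append(u); Xn.append(xn)
    x = xn if abs(xn) < 5 else rng.normal()        # occasional reset

def phi(x, u):
    # Quadratic basis so that Q(x, u) = w . [x^2, 2*x*u, u^2]
    return np.array([x * x, 2 * x * u, u * u])

K = 0.0   # target-policy gain, improved at each iteration
for _ in range(20):
    # Policy evaluation: least-squares fit of the Bellman equation
    # Q(x,u) = c(x,u) + Q(x', u') with the *target* action u' = -K*x'
    # at the next state -- this is what makes the scheme off-policy.
    A_ls, b_ls = [], []
    for x_, u_, xn_ in zip(X, U, Xn):
        un_ = -K * xn_
        A_ls.append(phi(x_, u_) - phi(xn_, un_))
        b_ls.append(q * x_ * x_ + r * u_ * u_)
    w, *_ = np.linalg.lstsq(np.array(A_ls), np.array(b_ls), rcond=None)
    # Policy improvement: argmin_u Q(x, u) gives u = -(H_xu / H_uu) * x.
    K = w[1] / w[2]

print(round(K, 4))  # learned gain, close to the LQR-optimal gain
```

Because the behavior-policy data already satisfy the Bellman identity for any quadratic Q, the same batch can evaluate every improved target policy; the probing noise enters only through the recorded actions, so it does not bias the fit, mirroring the noise-insensitivity property noted in the abstract.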
Pages: 85-96
Page count: 12
Related Papers
50 records in total
  • [41] Sequential Search with Off-Policy Reinforcement Learning
    Miao, Dadong
    Wang, Yanan
    Tang, Guoyu
    Liu, Lin
    Xu, Sulong
    Long, Bo
    Xiao, Yun
    Wu, Lingfei
    Jiang, Yunjiang
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 4006 - 4015
  • [42] Learning Routines for Effective Off-Policy Reinforcement Learning
    Cetin, Edoardo
    Celiktutan, Oya
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [43] Off-policy Reinforcement Learning for Robust Control of Discrete-time Uncertain Linear Systems
    Yang, Yongliang
    Guo, Zhishan
    Wunsch, Donald
    Yin, Yixin
    PROCEEDINGS OF THE 36TH CHINESE CONTROL CONFERENCE (CCC 2017), 2017, : 2507 - 2512
  • [44] Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling
    Xie, Tengyang
    Ma, Yifei
    Wang, Yu-Xiang
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [45] Off-Policy Reinforcement Learning: Optimal Operational Control for Two-Time-Scale Industrial Processes
    Li, Jinna
    Kiumarsi, Bahare
    Chai, Tianyou
    Lewis, Frank L.
    Fan, Jialu
    IEEE TRANSACTIONS ON CYBERNETICS, 2017, 47 (12) : 4547 - 4558
  • [46] Synchronous optimal control method for nonlinear systems with saturating actuators and unknown dynamics using off-policy integral reinforcement learning
    Zhang, Zenglian
    Song, Ruizhuo
    Cao, Min
    NEUROCOMPUTING, 2019, 356 : 162 - 169
  • [47] Off-Policy Risk-Sensitive Reinforcement Learning-Based Constrained Robust Optimal Control
    Li, Cong
    Liu, Qingchen
    Zhou, Zhehua
    Buss, Martin
    Liu, Fangzhou
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53 (04): : 2478 - 2491
  • [48] Application of Off-policy Integral Reinforcement Learning for H∞ Input Constrained Control of Permanent Magnet Synchronous Machine
    Yu, Yang
    Wang, Shuai
    Du, Yudong
    Su, Rong
    Viswanathan, V.
    Ramakrishna, S.
    Gajanayake, C. J.
    Gupta, Amit K.
    THIRTY-FOURTH ANNUAL IEEE APPLIED POWER ELECTRONICS CONFERENCE AND EXPOSITION (APEC 2019), 2019, : 2570 - 2576
  • [49] Optimal Output-Feedback Control of Unknown Continuous-Time Linear Systems Using Off-policy Reinforcement Learning
    Modares, Hamidreza
    Lewis, Frank L.
    Jiang, Zhong-Ping
    IEEE TRANSACTIONS ON CYBERNETICS, 2016, 46 (11) : 2401 - 2410
  • [50] Off-policy and on-policy reinforcement learning with the Tsetlin machine
    Gorji, Saeed Rahimi
    Granmo, Ole-Christoffer
    APPLIED INTELLIGENCE, 2023, 53 : 8596 - 8613