Optimal Synchronization Control of Multiagent Systems With Input Saturation via Off-Policy Reinforcement Learning

Cited by: 102
Authors
Qin, Jiahu [1 ]
Li, Man [1 ]
Shi, Yang [2 ]
Ma, Qichao [1 ]
Zheng, Wei Xing [3 ]
Affiliations
[1] Univ Sci & Technol China, Dept Automat, Hefei 230027, Anhui, Peoples R China
[2] Univ Victoria, Dept Mech Engn, Victoria, BC V8W 2Y2, Canada
[3] Western Sydney Univ, Sch Comp Engn & Math, Sydney, NSW 2751, Australia
Funding
Australian Research Council; National Natural Science Foundation of China;
Keywords
Input saturation; multiagent systems; neural networks (NNs); off-policy reinforcement learning (RL); optimal synchronization control; LINEAR-SYSTEMS; NONLINEAR-SYSTEMS; NETWORKS; GAMES;
DOI
10.1109/TNNLS.2018.2832025
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we investigate the optimal synchronization problem for a group of generic linear systems subject to input saturation. To seek the optimal controllers, Hamilton-Jacobi-Bellman (HJB) equations involving nonquadratic input-energy terms are established in coupled form. The solutions to these coupled HJB equations are proven to be optimal, and the induced controllers constitute an interactive Nash equilibrium. Because HJB equations, especially coupled ones, are difficult to solve analytically, and because model information of the systems may be unavailable, we apply a data-based off-policy reinforcement learning algorithm to learn the optimal control policies. As a byproduct, this off-policy algorithm is shown to be insensitive to the probing noise that is injected into the system to maintain the persistence of excitation condition. To implement the off-policy algorithm, we employ actor and critic neural networks to approximate the controllers and the cost functions. Furthermore, the estimated control policies obtained by this implementation are proven to converge to the optimal ones under certain conditions. Finally, an illustrative example is provided to verify the effectiveness of the proposed algorithm.
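In the input-saturation literature, the nonquadratic input-energy term mentioned in the abstract is commonly taken as W(u) = 2∫₀ᵘ λR tanh⁻¹(v/λ) dv, where λ is the saturation bound and R a positive weight; it admits the closed form W(u) = 2Rλu tanh⁻¹(u/λ) + Rλ² ln(1 − u²/λ²). The scalar sketch below is illustrative only (not the paper's exact formulation) and cross-checks the closed form against direct quadrature; the function name and default values are hypothetical.

```python
import math

def sat_input_cost(u, lam=1.0, R=1.0, closed_form=True, n=20000):
    """Nonquadratic input-energy term W(u) = 2 * integral_0^u lam*R*atanh(v/lam) dv
    for a scalar input-saturated system (illustrative sketch; lam is the
    saturation bound, R > 0 a weight, |u| < lam required)."""
    if closed_form:
        # Closed form: 2*R*lam*u*atanh(u/lam) + R*lam^2*ln(1 - u^2/lam^2)
        return 2 * R * lam * u * math.atanh(u / lam) \
            + R * lam ** 2 * math.log(1 - (u / lam) ** 2)
    # Trapezoidal quadrature of the integrand 2*lam*R*atanh(v/lam)
    h = u / n
    total = 0.0
    for k in range(n):
        v0, v1 = k * h, (k + 1) * h
        total += 0.5 * h * (2 * lam * R * math.atanh(v0 / lam)
                            + 2 * lam * R * math.atanh(v1 / lam))
    return total
```

Because tanh⁻¹(v/λ) → ∞ as |v| → λ, this cost penalizes inputs near the saturation bound unboundedly, which is what lets the resulting optimal controller respect the constraint automatically.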
Pages: 85-96
Page count: 12
Related papers
50 records total
  • [31] A perspective on off-policy evaluation in reinforcement learning
    Li, Lihong
    FRONTIERS OF COMPUTER SCIENCE, 2019, 13 (05) : 911 - 912
  • [32] Representations for Stable Off-Policy Reinforcement Learning
    Ghosh, Dibya
    Bellemare, Marc G.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [34] On the Reuse Bias in Off-Policy Reinforcement Learning
    Ying, Chengyang
    Hao, Zhongkai
    Zhou, Xinning
    Su, Hang
    Yan, Dong
    Zhu, Jun
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 4513 - 4521
  • [35] Adaptive Optimal Control for Stochastic Multiplayer Differential Games Using On-Policy and Off-Policy Reinforcement Learning
    Liu, Mushuang
    Wan, Yan
    Lewis, Frank L.
    Lopez, Victor G.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (12) : 5522 - 5533
  • [36] Off-Policy Shaping Ensembles in Reinforcement Learning
    Harutyunyan, Anna
    Brys, Tim
    Vrancx, Peter
    Nowe, Ann
    21ST EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2014), 2014, 263 : 1021 - 1022
  • [37] Marginalized Operators for Off-policy Reinforcement Learning
    Tang, Yunhao
    Rowland, Mark
    Munos, Remi
    Valko, Michal
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151 : 655 - 679
  • [38] Off-Policy Differentiable Logic Reinforcement Learning
    Zhang, Li
    Li, Xin
    Wang, Mingzhong
    Tian, Andong
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021: RESEARCH TRACK, PT II, 2021, 12976 : 617 - 632
  • [39] Reliable Off-Policy Evaluation for Reinforcement Learning
    Wang, Jie
    Gao, Rui
    Zha, Hongyuan
    OPERATIONS RESEARCH, 2024, 72 (02) : 699 - 716
  • [40] Representations for Stable Off-Policy Reinforcement Learning
    Ghosh, Dibya
    Bellemare, Marc G.
    25TH AMERICAS CONFERENCE ON INFORMATION SYSTEMS (AMCIS 2019), 2019,