Multi-player H∞ Differential Game using On-Policy and Off-Policy Reinforcement Learning

Authors
An, Peiliang [1 ]
Liu, Mushuang [1 ]
Wan, Yan [1 ]
Lewis, Frank L. [2 ]
Affiliations
[1] Univ Texas Arlington, Dept Elect Engn, Arlington, TX 76019 USA
[2] Univ Texas Arlington, UTA Res Inst, Ft Worth, TX USA
Funding
U.S. National Science Foundation;
Keywords
TRACKING CONTROL; TIME-SYSTEMS; ALGORITHMS;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper studies a multi-player H∞ differential game for systems with general linear dynamics. In this game, multiple players design their control inputs to minimize their cost functions in the presence of worst-case disturbances. We first derive the optimal control and disturbance policies using the solutions to Hamilton-Jacobi-Isaacs (HJI) equations. We then prove that the derived optimal policies stabilize the system and constitute a Nash equilibrium solution. Two integral reinforcement learning (IRL)-based algorithms, policy iteration IRL and off-policy IRL, are developed to solve the differential game online. We show that the off-policy IRL can solve the multi-player H∞ differential game online without using any system dynamics information. Simulation studies are conducted to validate the theoretical analysis and demonstrate the effectiveness of the developed learning algorithms.
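The abstract does not give the equations, so the following is only a sketch of how a policy iteration IRL step is commonly written for a linear-quadratic game of this kind. The dynamics, the cost weights $Q_i$, $R_{ij}$, the attenuation level $\gamma$, the quadratic value form $V_i(x) = x^\top P_i x$, and the interval length $T$ are all assumptions made for illustration; the paper's own formulation may differ.

```latex
% Assumed setup (not taken from the paper):
%   \dot{x} = A x + \sum_{j=1}^{N} B_j u_j + D d
%   r_i(x,u,d) = x^\top Q_i x + \sum_{j=1}^{N} u_j^\top R_{ij} u_j - \gamma^2 d^\top d
\begin{align}
  % Policy evaluation: IRL Bellman equation over [t, t+T] under fixed policies
  V_i\bigl(x(t)\bigr)
    &= \int_{t}^{t+T} r_i\bigl(x(\tau),u(\tau),d(\tau)\bigr)\,d\tau
       + V_i\bigl(x(t+T)\bigr), \\
  % With the assumed quadratic value function V_i(x) = x^\top P_i x
  x(t)^\top P_i\, x(t) - x(t+T)^\top P_i\, x(t+T)
    &= \int_{t}^{t+T} r_i\bigl(x(\tau),u(\tau),d(\tau)\bigr)\,d\tau, \\
  % Policy improvement for player i (minimizing control)
  u_i &= -R_{ii}^{-1} B_i^\top P_i\, x, \\
  % Worst-case disturbance from player i's perspective (maximizing)
  d_i &= \gamma^{-2} D^\top P_i\, x .
\end{align}
```

In this sketch the evaluation step needs only measured trajectories, while the improvement step still uses $B_i$ and $D$; removing that remaining dependence is the role the abstract assigns to the off-policy variant.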
Pages: 1137-1142
Number of pages: 6