Multi-player H∞ Differential Game using On-Policy and Off-Policy Reinforcement Learning

Authors
An, Peiliang [1]
Liu, Mushuang [1]
Wan, Yan [1]
Lewis, Frank L. [2]
Affiliations
[1] Univ Texas Arlington, Dept Elect Engn, Arlington, TX 76019 USA
[2] Univ Texas Arlington, UTA Res Inst, Ft Worth, TX USA
Funding
National Science Foundation (USA)
Keywords
TRACKING CONTROL; TIME-SYSTEMS; ALGORITHMS;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper studies a multi-player H-infinity differential game for systems with general linear dynamics. In this game, multiple players design their control inputs to minimize their cost functions in the presence of worst-case disturbances. We first derive the optimal control and disturbance policies from the solutions to the Hamilton-Jacobi-Isaacs (HJI) equations. We then prove that the derived optimal policies stabilize the system and constitute a Nash equilibrium. Two integral reinforcement learning (IRL)-based algorithms, policy iteration IRL and off-policy IRL, are developed to solve the differential game online. We show that off-policy IRL can solve the multi-player H-infinity differential game online without using any system dynamics information. Simulation studies validate the theoretical analysis and demonstrate the effectiveness of the developed learning algorithms.
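As a rough illustration of the policy-iteration idea mentioned in the abstract, the sketch below solves a much simpler problem than the paper's: a scalar, single-controller, model-based zero-sum game with dynamics dx/dt = a*x + b*u + d*w and cost integrand q*x^2 + r*u^2 - gamma^2*w^2. It is not the authors' multi-player IRL algorithm, and it does use the model, unlike the off-policy method. The function names, the zero initial gains, and all numerical values are assumptions chosen for illustration.

```python
import math


def solve_scalar_hinf_game(a, b, d, q, r, gamma, iters=20):
    """Policy iteration for a scalar linear-quadratic zero-sum game.

    Illustrative assumption: policies are linear, u = -k*x (control)
    and w = l*x (disturbance), and a < 0 so zero gains are stabilizing.
    """
    k, l = 0.0, 0.0
    p = 0.0
    for _ in range(iters):
        a_cl = a - b * k + d * l  # closed-loop pole under current policies
        if a_cl >= 0.0:
            raise ValueError("current policies do not stabilize the system")
        # Policy evaluation: scalar Lyapunov equation
        #   2*a_cl*p + q + r*k^2 - gamma^2*l^2 = 0
        p = (q + r * k**2 - gamma**2 * l**2) / (-2.0 * a_cl)
        # Policy improvement for both the controller and the disturbance
        k = b * p / r
        l = d * p / gamma**2
    return p, k, l


def gare_residual(p, a, b, d, q, r, gamma):
    """Residual of the scalar game algebraic Riccati equation."""
    return 2.0 * a * p + q - p**2 * (b**2 / r - d**2 / gamma**2)


if __name__ == "__main__":
    params = dict(a=-1.0, b=1.0, d=1.0, q=1.0, r=1.0, gamma=2.0)
    p, k, l = solve_scalar_hinf_game(**params)
    print("p =", p, "control gain =", k, "disturbance gain =", l)
    print("Riccati residual =", gare_residual(p, **params))
```

At a fixed point of this loop, substituting k = b*p/r and l = d*p/gamma^2 into the Lyapunov equation recovers the game Riccati equation, which is why the residual serves as a convergence check. Updating both gains simultaneously is a simplification that behaves well for these values but is not guaranteed to converge in general.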
Pages: 1137-1142
Number of pages: 6