Multi-player H∞ Differential Game using On-Policy and Off-Policy Reinforcement Learning

Cited by: 0
Authors
An, Peiliang [1]
Liu, Mushuang [1]
Wan, Yan [1]
Lewis, Frank L. [2]
Affiliations
[1] Univ Texas Arlington, Dept Elect Engn, Arlington, TX 76019 USA
[2] Univ Texas Arlington, UTA Res Inst, Ft Worth, TX USA
Funding
U.S. National Science Foundation;
Keywords
TRACKING CONTROL; TIME-SYSTEMS; ALGORITHMS;
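DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract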
This paper studies a multi-player H-infinity differential game for systems with general linear dynamics. In this game, multiple players design their control inputs to minimize their cost functions in the presence of worst-case disturbances. We first derive the optimal control and disturbance policies from the solutions of the Hamilton-Jacobi-Isaacs (HJI) equations. We then prove that the derived optimal policies stabilize the system and constitute a Nash equilibrium solution. Two integral reinforcement learning (IRL)-based algorithms, policy iteration IRL and off-policy IRL, are developed to solve the differential game online. We show that off-policy IRL can solve the multi-player H-infinity differential game online without using any system dynamics information. Simulation studies are conducted to validate the theoretical analysis and demonstrate the effectiveness of the developed learning algorithms.
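The policy iteration idea mentioned in the abstract can be illustrated on a much smaller problem. The sketch below is not the paper's multi-player algorithm: it assumes a single-player, scalar zero-sum H-infinity problem with dynamics dx/dt = a*x + b*u + d*w and running cost q*x^2 + r*u^2 - gamma^2*w^2, and every name in it (`rollout`, `irl_policy_iteration`, the numeric values) is hypothetical. Policy evaluation uses only measured data through the integral Bellman equation p*x(t)^2 = (integral of the running cost over [t, t+T]) + p*x(t+T)^2, so the learner never reads the drift term `a`; in this sketch the improvement step is assumed to know `b` and `d`.

```python
import math


def rollout(plant, k, l, x0, T, q, r, gamma, steps=2000):
    """Simulated plant (hidden from the learner).

    Applies u = -k*x and w = l*x for T seconds and returns the final state
    together with the integral of the running cost (trapezoidal rule).
    """
    a, b, d = plant
    a_cl = a - b * k + d * l  # closed-loop pole under the current gains
    dt = T / steps
    xs = [x0 * math.exp(a_cl * i * dt) for i in range(steps + 1)]
    stage = [(q + r * k * k - gamma ** 2 * l * l) * x * x for x in xs]
    integral = dt * (sum(stage) - 0.5 * (stage[0] + stage[-1]))
    return xs[-1], integral


def irl_policy_iteration(plant, b, d, q, r, gamma, x0=1.0, T=0.5, iters=20):
    """On-policy integral RL for the scalar zero-sum problem (illustrative)."""
    k, l = 0.0, 0.0  # zero initial gains: assumes the open-loop plant is stable
    p = 0.0
    for _ in range(iters):
        # Policy evaluation from data: p*x0^2 = integral + p*x_T^2
        x_T, integral = rollout(plant, k, l, x0, T, q, r, gamma)
        p = integral / (x0 ** 2 - x_T ** 2)
        # Policy improvement for the controller and the worst-case disturbance
        k = b * p / r
        l = d * p / gamma ** 2
    return p, k, l


if __name__ == "__main__":
    a, b, d = -1.0, 1.0, 1.0
    q, r, gamma = 1.0, 1.0, 2.0
    p, k, l = irl_policy_iteration((a, b, d), b, d, q, r, gamma)
    # Closed-form root of the scalar game Riccati equation 2*a*p + q - s*p^2 = 0
    s = b * b / r - d * d / gamma ** 2
    p_star = (a + math.sqrt(a * a + s * q)) / s
    print(p, p_star)
```

For these assumed values the learned `p` should agree with the stabilizing root of the scalar game Riccati equation, which gives a simple sanity check on the evaluation and improvement steps.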
Pages: 1137-1142
Number of pages: 6