Multi-player H∞ Differential Game using On-Policy and Off-Policy Reinforcement Learning

Cited by: 0

Authors:

An, Peiliang [1]

Liu, Mushuang [1]

Wan, Yan [1]

Lewis, Frank L. [2]

Affiliations:

[1] Univ Texas Arlington, Dept Elect Engn, Arlington, TX 76019 USA

[2] Univ Texas Arlington, UTA Res Inst, Ft Worth, TX USA

Source:

2020 IEEE 16TH INTERNATIONAL CONFERENCE ON CONTROL & AUTOMATION (ICCA) | 2020

Funding:

National Science Foundation (USA);

Keywords:

TRACKING CONTROL; TIME-SYSTEMS; ALGORITHMS;

DOI:

Not available

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

This paper studies a multi-player H-infinity differential game for systems with general linear dynamics. In this game, multiple players design their control inputs to minimize their cost functions in the presence of worst-case disturbances. We first derive the optimal control and disturbance policies from the solutions to the Hamilton-Jacobi-Isaacs (HJI) equations. We then prove that the derived optimal policies stabilize the system and constitute a Nash equilibrium solution. Two integral reinforcement learning (IRL)-based algorithms, policy iteration IRL and off-policy IRL, are developed to solve the differential game online. We show that off-policy IRL can solve the multi-player H-infinity differential game online without using any system dynamics information. Simulation studies are conducted to validate the theoretical analysis and demonstrate the effectiveness of the developed learning algorithms.


Pages: 1137-1142

Page count: 6


