Multi-player H∞ Differential Game using On-Policy and Off-Policy Reinforcement Learning

Cited by: 0

Authors:

An, Peiliang [1]

Liu, Mushuang [1]

Wan, Yan [1]

Lewis, Frank L. [2]

Affiliations:

[1] Univ Texas Arlington, Dept Elect Engn, Arlington, TX 76019 USA

[2] Univ Texas Arlington, UTA Res Inst, Ft Worth, TX USA

Source:

2020 IEEE 16TH INTERNATIONAL CONFERENCE ON CONTROL & AUTOMATION (ICCA) | 2020

Funding:

National Science Foundation (USA);

Keywords:

TRACKING CONTROL; TIME-SYSTEMS; ALGORITHMS;

DOI:

Not available

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812;

Abstract:

This paper studies a multi-player H-infinity differential game for systems with general linear dynamics. In this game, multiple players design their control inputs to minimize their cost functions in the presence of worst-case disturbances. We first derive the optimal control and disturbance policies from the solutions to the Hamilton-Jacobi-Isaacs (HJI) equations. We then prove that the derived optimal policies stabilize the system and constitute a Nash equilibrium solution. Two integral reinforcement learning (IRL)-based algorithms, policy iteration IRL and off-policy IRL, are developed to solve the differential game online. We show that the off-policy IRL algorithm can solve the multi-player H-infinity differential game online without using any information about the system dynamics. Simulation studies are conducted to validate the theoretical analysis and demonstrate the effectiveness of the developed learning algorithms.
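For orientation, the block below sketches the classic single-controller special case of a linear H-infinity game, in which the HJI equation reduces to a game algebraic Riccati equation. It is a textbook formulation and not the paper's multi-player equations; the symbols A, B, D, Q, R, P, and γ are assumptions introduced here for illustration and do not appear in this record.

```latex
% Assumed single-controller linear dynamics with disturbance w:
%   \dot{x} = A x + B u + D w
% Assumed cost with attenuation level \gamma > 0:
\[
J(x_0) = \int_0^\infty \left( x^\top Q x + u^\top R u - \gamma^2 w^\top w \right) dt
\]
% With a quadratic value function V(x) = x^\top P x, stationarity of the
% Hamiltonian gives the saddle-point policies
\[
u^* = -R^{-1} B^\top P x, \qquad w^* = \frac{1}{\gamma^2} D^\top P x ,
\]
% and substituting them back into the HJI equation yields the game
% algebraic Riccati equation for P:
\[
A^\top P + P A + Q - P B R^{-1} B^\top P + \frac{1}{\gamma^2} P D D^\top P = 0 .
\]
```

In the multi-player setting described in the abstract, each player has its own cost function, so a single equation of this kind would be replaced by a set of coupled equations, one per player.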


Pages: 1137-1142

Page count: 6


