Multi-player H∞ Differential Game using On-Policy and Off-Policy Reinforcement Learning

Cited by: 0

Authors:

An, Peiliang [1]

Liu, Mushuang [1]

Wan, Yan [1]

Lewis, Frank L. [2]

Affiliations:

[1] Univ Texas Arlington, Dept Elect Engn, Arlington, TX 76019 USA

[2] Univ Texas Arlington, UTA Res Inst, Ft Worth, TX USA

Source:

2020 IEEE 16TH INTERNATIONAL CONFERENCE ON CONTROL & AUTOMATION (ICCA) | 2020

Funding:

National Science Foundation (USA);

Keywords:

TRACKING CONTROL; TIME-SYSTEMS; ALGORITHMS;

DOI:

Not available

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

This paper studies a multi-player H-infinity differential game for systems with general linear dynamics. In this game, multiple players design their control inputs to minimize their cost functions in the presence of worst-case disturbances. We first derive the optimal control and disturbance policies using the solutions to the Hamilton-Jacobi-Isaacs (HJI) equations. We then prove that the derived optimal policies stabilize the system and constitute a Nash equilibrium solution. Two integral reinforcement learning (IRL)-based algorithms, policy iteration IRL and off-policy IRL, are developed to solve the differential game online. We show that the off-policy IRL can solve the multi-player H-infinity differential game online without using any system dynamics information. Simulation studies are conducted to validate the theoretical analysis and demonstrate the effectiveness of the developed learning algorithms.
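As a rough illustration of the policy-iteration idea mentioned in the abstract, the sketch below handles a much simpler case than the paper: a scalar system with one controller and one disturbance, with the model assumed known. It is not the paper's IRL algorithm, which is data-driven and multi-player. The dynamics dx/dt = a*x + b*u + d*w, the cost weights q, r, gamma, the function name solve_scalar_hinf_game, and the simultaneous update of both gains are all assumptions made for this example. For this scalar case the HJI equation with V(x) = p*x^2 reduces to 2*a*p + q - (b^2/r - d^2/gamma^2)*p^2 = 0.

```python
def solve_scalar_hinf_game(a, b, d, q, r, gamma, iters=20):
    """Model-based policy iteration for the scalar game
    dx/dt = a*x + b*u + d*w with cost integral of
    (q*x^2 + r*u^2 - gamma^2*w^2) dt. Illustrative sketch only."""
    # Initial policies u = -k*x and w = l*x; zero gains need a < 0.
    k, l = 0.0, 0.0
    p = 0.0
    for _ in range(iters):
        a_cl = a - b * k + d * l  # closed-loop pole under current policies
        if a_cl >= 0:
            raise ValueError("closed loop is not stable; evaluation undefined")
        # Policy evaluation: solve 2*a_cl*p + q + r*k^2 - gamma^2*l^2 = 0 for p.
        p = (q + r * k**2 - gamma**2 * l**2) / (-2.0 * a_cl)
        # Policy improvement for the controller and the disturbance.
        k = b * p / r
        l = d * p / gamma**2
    return p, k, l


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for the demonstration.
    p, k, l = solve_scalar_hinf_game(a=-1.0, b=1.0, d=1.0, q=1.0, r=1.0, gamma=2.0)
    residual = 2 * (-1.0) * p + 1.0 - (1.0 - 0.25) * p**2
    print(p, k, l, residual)
```

Updating both gains at once amounts to a Newton step on the scalar equation, which behaves well for the stable example above but is not guaranteed to converge in general. The paper's algorithms replace the model-based evaluation step with measured trajectory data.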

Pages: 1137-1142

Page count: 6
