Inverse Reinforcement Learning for Adversarial Apprentice Games

Cited by: 26

Authors:

Lian, Bosen [1]

Xue, Wenqian [2]

Lewis, Frank L. [1]

Chai, Tianyou [2]

Affiliations:

[1] Univ Texas Arlington, Res Inst, Ft Worth, TX 76118 USA

[2] Northeastern Univ, State Key Lab Synthet Automat Proc Ind & Int Join, Shenyang 110819, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2023 / Vol. 34 / Issue 08

Keywords:

Games; Cost function; Optimal control; Heuristic algorithms; Costs; Artificial neural networks; System dynamics; Adversarial games; apprentice games; inverse optimal control; inverse reinforcement learning (RL); neural networks (NNs); optimal control;

DOI:

10.1109/TNNLS.2021.3114612

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This article proposes new inverse reinforcement learning (RL) algorithms to solve Adversarial Apprentice Games, which we define for nonlinear learner and expert systems. The games are solved by having a learner extract the unknown cost function of an expert from the expert's demonstrated behaviors. We first develop a model-based inverse RL algorithm that consists of two learning stages: an optimal control learning stage and a second learning stage based on inverse optimal control. This algorithm also clarifies the relationship between inverse RL and inverse optimal control. Then, we propose a new model-free integral inverse RL algorithm to reconstruct the unknown expert cost function. The model-free algorithm needs only the online demonstration of the expert's and learner's trajectory data, without knowledge of the system dynamics of either the learner or the expert. Both algorithms are further implemented using neural networks (NNs). In Adversarial Apprentice Games, the learner and the expert are allowed to suffer from different adversarial attacks during the learning process. A two-player zero-sum game is formulated for each of these two agents and is solved as a subproblem for the learner in inverse RL. Furthermore, it is shown that the cost functions that the learner learns to mimic the expert's behavior are stabilizing and not unique. Finally, simulations and comparisons show the effectiveness and superiority of the proposed algorithms.

Pages: 4596-4609

Page count: 14

50 items in total

[1] Optimal Robust Formation of Multi-Agent Systems as Adversarial Graphical Apprentice Games With Inverse Reinforcement Learning
Golmisheh, Fatemeh Mahdavi
Shamaghdari, Saeed
IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2024, : 1 - 14
[2] Inverse reinforcement learning for multi-player noncooperative apprentice games
Lian, Bosen
Xue, Wenqian
Lewis, Frank L.
Chai, Tianyou
AUTOMATICA, 2022, 145
[3] Efficient Reinforcement Learning in Adversarial Games
Skoulakis, Ioannis E.
Lagoudakis, Michail G.
2012 IEEE 24TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2012), VOL 1, 2012, : 704 - 711
[4] Multiagent Adversarial Inverse Reinforcement Learning
Wei, Ermo
Wicke, Drew
Luke, Sean
AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 2265 - 2266
[5] Hierarchical Adversarial Inverse Reinforcement Learning
Chen, Jiayu
Lan, Tian
Aggarwal, Vaneet
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (12) : 17549 - 17558
[6] Inverse Reinforcement Learning for Multi-player Apprentice Games in Continuous-Time Nonlinear Systems
Lian, Bosen
Xue, Wenqian
Lewis, Frank L.
Chai, Tianyou
Davoudi, Ali
2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2021, : 803 - 808
[7] Interactive Inverse Reinforcement Learning for Cooperative Games
Buning, Thomas Kleine
George, Anne-Marie
Dimitrakakis, Christos
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
[8] Multiagent Graphical Games With Inverse Reinforcement Learning
Donge, Vrushabh S.
Lian, Bosen
Lewis, Frank L.
Davoudi, Ali
IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2023, 10 (02) : 841 - 852
[9] Learning Aircraft Pilot Skills by Adversarial Inverse Reinforcement Learning
Suzuki, Kaito
Uemura, Tsuneharu
Tsuchiya, Takeshi
Beppu, Hirofumi
Hazui, Yusuke
Ono, Hitoi
2023 ASIA-PACIFIC INTERNATIONAL SYMPOSIUM ON AEROSPACE TECHNOLOGY, VOL I, APISAT 2023, 2024, 1050 : 1431 - 1441
[10] Multi-Agent Adversarial Inverse Reinforcement Learning
Yu, Lantao
Song, Jiaming
Ermon, Stefano
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
