Inverse Reinforcement Learning for Adversarial Apprentice Games

Cited by: 26
Authors
Lian, Bosen [1 ]
Xue, Wenqian [2 ]
Lewis, Frank L. [1 ]
Chai, Tianyou [2 ]
Affiliations
[1] Univ Texas Arlington, Res Inst, Ft Worth, TX 76118 USA
[2] Northeastern Univ, State Key Lab Synthet Automat Proc Ind & Int Join, Shenyang 110819, Peoples R China
Keywords
Games; Cost function; Optimal control; Heuristic algorithms; Costs; Artificial neural networks; System dynamics; Adversarial games; apprentice games; inverse optimal control; inverse reinforcement learning (RL); neural networks (NNs); optimal control
DOI
10.1109/TNNLS.2021.3114612
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This article proposes new inverse reinforcement learning (RL) algorithms to solve our defined Adversarial Apprentice Games for nonlinear learner and expert systems. The games are solved by extracting the unknown cost function of an expert by a learner using demonstrated expert's behaviors. We first develop a model-based inverse RL algorithm that consists of two learning stages: an optimal control learning and a second learning based on inverse optimal control. This algorithm also clarifies the relationships between inverse RL and inverse optimal control. Then, we propose a new model-free integral inverse RL algorithm to reconstruct the unknown expert cost function. The model-free algorithm only needs online demonstration of the expert and learner's trajectory data without knowing system dynamics of either the learner or the expert. These two algorithms are further implemented using neural networks (NNs). In Adversarial Apprentice Games, the learner and the expert are allowed to suffer from different adversarial attacks in the learning process. A two-player zero-sum game is formulated for each of these two agents and is solved as a subproblem for the learner in inverse RL. Furthermore, it is shown that the cost functions that the learner learns to mimic the expert's behavior are stabilizing and not unique. Finally, simulations and comparisons show the effectiveness and the superiority of the proposed algorithms.
Pages: 4596 - 4609
Page count: 14
Related papers
50 items in total
  • [1] Optimal Robust Formation of Multi-Agent Systems as Adversarial Graphical Apprentice Games With Inverse Reinforcement Learning
    Golmisheh, Fatemeh Mahdavi
    Shamaghdari, Saeed
    IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2024, : 1 - 14
  • [2] Inverse reinforcement learning for multi-player noncooperative apprentice games
    Lian, Bosen
    Xue, Wenqian
    Lewis, Frank L.
    Chai, Tianyou
    AUTOMATICA, 2022, 145
  • [3] Efficient Reinforcement Learning in Adversarial Games
    Skoulakis, Ioannis E.
    Lagoudakis, Michail G.
    2012 IEEE 24TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2012), VOL 1, 2012, : 704 - 711
  • [4] Multiagent Adversarial Inverse Reinforcement Learning
    Wei, Ermo
    Wicke, Drew
    Luke, Sean
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 2265 - 2266
  • [5] Hierarchical Adversarial Inverse Reinforcement Learning
    Chen, Jiayu
    Lan, Tian
    Aggarwal, Vaneet
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (12) : 17549 - 17558
  • [6] Inverse Reinforcement Learning for Multi-player Apprentice Games in Continuous-Time Nonlinear Systems
    Lian, Bosen
    Xue, Wenqian
    Lewis, Frank L.
    Chai, Tianyou
    Davoudi, Ali
    2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2021, : 803 - 808
  • [7] Interactive Inverse Reinforcement Learning for Cooperative Games
    Buning, Thomas Kleine
    George, Anne-Marie
    Dimitrakakis, Christos
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [8] Multiagent Graphical Games With Inverse Reinforcement Learning
    Donge, Vrushabh S.
    Lian, Bosen
    Lewis, Frank L.
    Davoudi, Ali
    IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2023, 10 (02) : 841 - 852
  • [9] Learning Aircraft Pilot Skills by Adversarial Inverse Reinforcement Learning
    Suzuki, Kaito
    Uemura, Tsuneharu
    Tsuchiya, Takeshi
    Beppu, Hirofumi
    Hazui, Yusuke
    Ono, Hitoi
    2023 ASIA-PACIFIC INTERNATIONAL SYMPOSIUM ON AEROSPACE TECHNOLOGY, VOL I, APISAT 2023, 2024, 1050 : 1431 - 1441
  • [10] Multi-Agent Adversarial Inverse Reinforcement Learning
    Yu, Lantao
    Song, Jiaming
    Ermon, Stefano
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97