Maximum-Entropy Multi-Agent Dynamic Games: Forward and Inverse Solutions

Cited by: 16

Authors:

Mehr, Negar [1]

Wang, Mingyu [2]

Bhatt, Maulik [3]

Schwager, Mac [4]

Affiliations:

[1] Univ Illinois, Aerosp Engn Dept, Urbana, IL 61801 USA

[2] Stanford Univ, Dept Mech Engn, Stanford, CA 94305 USA

[3] Univ Illinois, Dept Aerosp Engn, Urbana, IL 61801 USA

[4] Stanford Univ, Dept Aeronaut & Astronaut, Stanford, CA 94305 USA

Source:

IEEE TRANSACTIONS ON ROBOTICS | 2023 / Vol. 39 / Issue 03

Funding:

U.S. National Science Foundation;

Keywords:

Games; Costs; Cost function; Behavioral sciences; Noise measurement; Entropy; Nash equilibrium; Game-theoretic interactions; inverse reinforcement learning (IRL); learning from demonstration; multi-agent systems; IDENTIFICATION;

DOI:

10.1109/TRO.2022.3232300

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification code:

080202 ; 1405 ;

Abstract:

In this article, we study the problem of multiple stochastic agents interacting in a dynamic game scenario with continuous state and action spaces. We define a new notion of stochastic Nash equilibrium for boundedly rational agents, which we call the entropic cost equilibrium (ECE). We show that ECE is a natural extension to multiple agents of maximum entropy optimality for a single agent. We solve both the "forward" and "inverse" problems for the multi-agent ECE game. For the forward problem, we provide a Riccati algorithm to compute closed-form ECE feedback policies for the agents, which are exact in the linear-quadratic-Gaussian case. We give an iterative variant to find locally ECE feedback policies for the nonlinear case. For the inverse problem, we present an algorithm to infer the cost functions of the multiple interacting agents given noisy, boundedly rational input and state trajectory examples from agents acting in an ECE. The effectiveness of our algorithms is demonstrated in a simulated multi-agent collision avoidance scenario, and with data from the INTERACTION traffic dataset. In both cases, we show that, by taking into account the agents' game-theoretic interactions using our algorithm, a more accurate model of agents' costs can be learned, compared with standard inverse optimal control methods.

Pages: 1801 - 1815

Number of pages: 15

Related records: 50 in total

[41] Exposing Transmitters in Mobile Multi-Agent Games
Bessos, Mai Ben-Adar
Birnbach, Simon
Herzberg, Amir
Martinovic, Ivan
CPS-SPC'16: PROCEEDINGS OF THE 2ND ACM WORKSHOP ON CYBER-PHYSICAL SYSTEMS SECURITY & PRIVACY, 2016, : 125 - 136
[42] Multi-agent algorithms for solving graphical games
Vickrey, D
Koller, D
EIGHTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-02)/FOURTEENTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE (IAAI-02), PROCEEDINGS, 2002, : 345 - 351
[43] Diverse Generation for Multi-agent Sports Games
Yeh, Raymond A.
Schwing, Alexander G.
Huang, Jonathan
Murphy, Kevin
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 4605 - 4614
[44] MAXIMUM-ENTROPY MODELING FOR IDENTIFICATION AND DETECTION OF CERTAIN CLASSES OF DYNAMIC EVENTS
MACINA, NA
RCA REVIEW, 1981, 42 (01): : 85 - 110
[45] Multi-Agent Adversarial Inverse Reinforcement Learning
Yu, Lantao
Song, Jiaming
Ermon, Stefano
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
[46] EVOLUTIONARY MULTI-AGENT COMPUTING IN INVERSE PROBLEMS
Wrobel, Krzysztof
Torba, Pawel
Paszynski, Maciej
Byrski, Aleksander
COMPUTER SCIENCE-AGH, 2013, 14 (03): : 367 - 383
[47] The Energy and the Entropy of Hybrid Multi-Agent Systems
Iantovics, Barna
Nichita, Florin F.
PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON VIRTUAL LEARNING, ICVL 2011, 2011, : 391 - 394
[48] APPLICATION OF THE MAXIMUM-ENTROPY METHOD TO THE INVERSE-POLE-FIGURE DETERMINATION OF CUBIC MATERIALS
WANG, F
XU, JZ
LIANG, Z
JOURNAL OF APPLIED CRYSTALLOGRAPHY, 1991, 24 : 126 - 128
[49] Optimal Robust Formation of Multi-Agent Systems as Adversarial Graphical Apprentice Games With Inverse Reinforcement Learning
Golmisheh, Fatemeh Mahdavi
Shamaghdari, Saeed
IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2024, : 1 - 14
[50] Entropy regularized actor-critic based multi-agent deep reinforcement learning for stochastic games
Hao, Dong
Zhang, Dongcheng
Shi, Qi
Li, Kai
Information Sciences, 2022, 617 : 17 - 40
