VMAPD: Generate Diverse Solutions for Multi-Agent Games with Recurrent Trajectory Discriminators

Cited by: 1
Authors
Huang, Shiyu [1]
Yu, Chao [1]
Wang, Bin [2]
Li, Dong [2]
Wang, Yu [1]
Chen, Ting [1]
Zhu, Jun [1]
Affiliations
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Huawei Noahs Ark Lab, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
deep reinforcement learning; multi-agent reinforcement learning; diversity; probabilistic graphical models;
DOI
10.1109/CoG51982.2022.9893722
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification codes
081203 ; 0835 ;
Abstract
Recent algorithms designed for multi-agent tasks focus on finding a single optimal solution for all agents. However, many tasks (e.g., matrix games and transportation dispatching) admit more than one optimal solution, and previous algorithms converge to only one of them. In many practical applications, it is important to develop reasonable agents with diverse behaviors. In this paper, we propose "variational multi-agent policy diversification" (VMAPD), an on-policy framework for discovering diverse policies for the coordination patterns of multiple agents. By taking advantage of latent variables and exploiting the connection between variational inference and multi-agent reinforcement learning, we derive a tractable evidence lower bound (ELBO) on the trajectories of all agents. Our algorithm uses policy iteration to maximize the derived lower bound and can be implemented simply by adding a pseudo reward during centralized learning; the trained agents do not need access to the pseudo reward during decentralized execution. We demonstrate the effectiveness of our algorithm on several popular multi-agent testbeds. Experimental results show that VMAPD finds more solutions than the baselines at similar sample complexity.
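
The abstract does not give the form of the pseudo reward, so the following is only a minimal sketch of how a discriminator-based diversity bonus could be added to the environment reward during centralized training. It assumes a uniform prior over a discrete latent code z and a bonus of the form log q(z | trajectory) - log p(z), which is a common choice in variational diversity methods and not necessarily the one used in the paper. The function names, the weight `alpha`, and the example probabilities are all hypothetical.

```python
import math


def pseudo_reward(discriminator_probs, z, num_latents):
    """Assumed diversity bonus: log q(z | trajectory) - log p(z), with uniform p(z)."""
    q_z = max(discriminator_probs[z], 1e-8)  # clamp to avoid log(0)
    return math.log(q_z) - math.log(1.0 / num_latents)


def shaped_reward(env_reward, discriminator_probs, z, num_latents, alpha=0.1):
    """Training-time reward: environment reward plus a weighted diversity bonus.

    alpha is a hypothetical trade-off weight, not a value from the paper.
    """
    return env_reward + alpha * pseudo_reward(discriminator_probs, z, num_latents)


# Hypothetical discriminator output over 4 latent codes for one joint trajectory.
probs = [0.7, 0.1, 0.1, 0.1]

# The bonus is positive when the discriminator identifies the active latent code
# better than chance, and negative when it does worse than chance.
bonus_identified = pseudo_reward(probs, z=0, num_latents=4)
bonus_confused = pseudo_reward(probs, z=1, num_latents=4)

# With a chance-level discriminator the bonus vanishes and the reward is unchanged.
uniform = [0.25, 0.25, 0.25, 0.25]
unchanged = shaped_reward(1.0, uniform, z=2, num_latents=4)
```

In this sketch the bonus only enters the reward used for learning; at execution time each agent would act from its policy alone, consistent with the abstract's statement that the pseudo reward is not needed during decentralized execution.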
Pages: 9-16
Number of pages: 8