Discovering Latent Variables for the Tasks With Confounders in Multi-Agent Reinforcement Learning

Cited by: 0
Authors
Jiang, Kun [1 ]
Liu, Wenzhang [2 ]
Wang, Yuanda [1 ]
Dong, Lu [3 ]
Sun, Changyin [1 ,4 ]
Affiliations
[1] Southeast Univ, Sch Automat, Nanjing 210096, Peoples R China
[2] Anhui Univ, Sch Artificial Intelligence, Hefei 230601, Peoples R China
[3] Southeast Univ, Sch Cyber Sci & Engn, Nanjing 211189, Peoples R China
[4] Minist Educ, Engn Res Ctr Autonomous Unmanned Syst Technol, Hefei 230601, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Training; Current measurement; Reinforcement learning; Approximation algorithms; Probabilistic logic; Prediction algorithms; Inference algorithms; Latent variable model; maximum entropy; multi-agent reinforcement learning (MARL); multi-agent system;
DOI
10.1109/JAS.2024.124281
CLC Number
TP [automation technology; computer technology];
Discipline Code
0812 ;
Abstract
Efficient exploration in complex coordination tasks has long been a challenging problem in multi-agent reinforcement learning (MARL). It is significantly more difficult for tasks with latent variables that agents cannot directly observe. Moreover, most existing latent variable discovery methods lack a clear representation of the latent variables and an effective evaluation of their influence on each agent. In this paper, we propose a new MARL algorithm based on the soft actor-critic method for complex continuous control tasks with confounders, called the multi-agent soft actor-critic with latent variable (MASAC-LV) algorithm, which uses variational inference to infer a compact latent variable representation space from a large amount of offline experience. In addition, we derive a counterfactual policy whose input contains no latent variables and quantify the difference between the actual policy and the counterfactual policy via a distance function. This quantified difference serves as an intrinsic motivation that grants additional rewards according to how strongly the latent variable affects each agent. The proposed algorithm is evaluated on two collaboration tasks with confounders, and the experimental results demonstrate the effectiveness of MASAC-LV over other baseline algorithms.
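The intrinsic-motivation idea described in the abstract (rewarding an agent in proportion to how far its latent-conditioned policy deviates from a latent-free counterfactual policy) can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the toy linear-Gaussian policy, the use of KL divergence as the distance function, and the scaling coefficient `beta` are all assumptions made for the example.

```python
import numpy as np

def gaussian_kl(mu_p, sigma_p, mu_q, sigma_q):
    """KL divergence KL(p || q) between two diagonal Gaussians."""
    return np.sum(
        np.log(sigma_q / sigma_p)
        + (sigma_p**2 + (mu_p - mu_q)**2) / (2.0 * sigma_q**2)
        - 0.5
    )

def intrinsic_reward(policy, obs, latent, beta=0.1):
    """Extra reward proportional to how much the inferred latent
    variable shifts the agent's action distribution."""
    mu_a, sig_a = policy(obs, latent)                 # actual policy
    mu_c, sig_c = policy(obs, np.zeros_like(latent))  # counterfactual: latent removed
    return beta * gaussian_kl(mu_a, sig_a, mu_c, sig_c)

# Toy linear-Gaussian "policy" standing in for the actor network.
def toy_policy(obs, latent):
    mu = 0.5 * obs + latent        # the latent variable shifts the action mean
    sigma = np.ones_like(obs)      # fixed unit standard deviation
    return mu, sigma

obs = np.array([1.0, -0.5])
latent = np.array([0.3, 0.3])
r_int = intrinsic_reward(toy_policy, obs, latent)  # ≈ 0.009
```

When the latent variable has no effect on the policy, the counterfactual and actual distributions coincide and the intrinsic reward vanishes, so only agents genuinely influenced by the confounder receive the exploration bonus.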
Pages: 1591-1604 (14 pages)
Related Papers
50 items in total
  • [21] Busoniu, Lucian; Babuska, Robert; De Schutter, Bart. Multi-agent reinforcement learning: A survey. 2006 9th International Conference on Control, Automation, Robotics and Vision, Vols 1-5, 2006: 1133-+.
  • [22] Dickens, Luke; Broda, Krysia; Russo, Alessandra. The Dynamics of Multi-Agent Reinforcement Learning. ECAI 2010 - 19th European Conference on Artificial Intelligence, 2010, 215: 367-372.
  • [23] Ghavamzadeh, Mohammad; Mahadevan, Sridhar; Makar, Rajbala. Hierarchical multi-agent reinforcement learning. Autonomous Agents and Multi-Agent Systems, 2006, 13(2): 197-229.
  • [24] Sun, R.; Peterson, T. Partitioning in multi-agent reinforcement learning. From Animals to Animats 6, 2000: 325-332.
  • [25] Dimeas, A. L.; Hatziargyriou, N. D. Multi-Agent Reinforcement Learning for Microgrids. IEEE Power and Energy Society General Meeting 2010, 2010.
  • [26] Sygkounas, Alkis; Tsipianitis, Dimitris; Nikolakopoulos, George; Bechlioulis, Charalampos P. Multi-agent Exploration with Reinforcement Learning. 2022 30th Mediterranean Conference on Control and Automation (MED), 2022: 630-635.
  • [27] Cao, Jingyu; Dong, Lu; Yuan, Xin; Wang, Yuanda; Sun, Changyin. Hierarchical multi-agent reinforcement learning for cooperative tasks with sparse rewards in continuous domain. Neural Computing & Applications, 2024, 36(1): 273-287.
  • [29] Malysheva, Aleksandra; Kudenko, Daniel; Shpilman, Aleksei. MAGNet: Multi-agent Graph Network for Deep Multi-agent Reinforcement Learning. 2019 XVI International Symposium "Problems of Redundancy in Information and Control Systems" (REDUNDANCY), 2019: 171-176.
  • [30] Liu, Yuntao; Li, Yuan; Xu, Xinhai; Dou, Yong; Liu, Donghong. Heterogeneous Skill Learning for Multi-agent Tasks. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022.