Discovering Latent Variables for the Tasks With Confounders in Multi-Agent Reinforcement Learning

Authors
Jiang, Kun [1]
Liu, Wenzhang [2]
Wang, Yuanda [1]
Dong, Lu [3]
Sun, Changyin [1, 4]
Affiliations
[1] Southeast Univ, Sch Automat, Nanjing 210096, Peoples R China
[2] Anhui Univ, Sch Artificial Intelligence, Hefei 230601, Peoples R China
[3] Southeast Univ, Sch Cyber Sci & Engn, Nanjing 211189, Peoples R China
[4] Minist Educ, Engn Res Ctr Autonomous Unmanned Syst Technol, Hefei 230601, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Training; Current measurement; Reinforcement learning; Approximation algorithms; Probabilistic logic; Prediction algorithms; Inference algorithms; Latent variable model; maximum entropy; multi-agent reinforcement learning (MARL); multi-agent system;
DOI
10.1109/JAS.2024.124281
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Efficient exploration in complex coordination tasks is considered a challenging problem in multi-agent reinforcement learning (MARL). It is significantly more difficult in tasks with latent variables that agents cannot directly observe. However, most existing latent variable discovery methods lack a clear representation of the latent variables and an effective evaluation of their influence on each agent. In this paper, we propose a new MARL algorithm, based on the soft actor-critic method, for complex continuous control tasks with confounders. The algorithm, called multi-agent soft actor-critic with latent variable (MASAC-LV), uses variational inference to infer a compact latent variable representation space from a large amount of offline experience. In addition, we derive a counterfactual policy whose input contains no latent variables and quantify the difference between the actual policy and the counterfactual policy via a distance function. This quantified difference serves as an intrinsic motivation that gives additional rewards based on how much the latent variable affects each agent. The proposed algorithm is evaluated on two collaboration tasks with confounders, and the experimental results demonstrate the effectiveness of MASAC-LV compared with baseline algorithms.
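The intrinsic-motivation idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which distance function MASAC-LV uses, so the KL divergence between diagonal Gaussian policies below is an assumption, as are the names `gaussian_kl`, `shaped_reward`, and the weight `beta`. It is not the authors' implementation.

```python
import math
from typing import Sequence, Tuple

# A diagonal Gaussian policy output: (means, standard deviations) per action dimension.
Gaussian = Tuple[Sequence[float], Sequence[float]]


def gaussian_kl(p: Gaussian, q: Gaussian) -> float:
    """KL(p || q) between two diagonal Gaussians (assumed distance function)."""
    mu_p, std_p = p
    mu_q, std_q = q
    kl = 0.0
    for mp, sp, mq, sq in zip(mu_p, std_p, mu_q, std_q):
        kl += math.log(sq / sp) + (sp ** 2 + (mp - mq) ** 2) / (2.0 * sq ** 2) - 0.5
    return kl


def shaped_reward(env_reward: float,
                  actual: Gaussian,
                  counterfactual: Gaussian,
                  beta: float = 0.1) -> float:
    """Add an intrinsic bonus that grows with the policy difference.

    `actual` is the policy conditioned on the inferred latent variable;
    `counterfactual` is the policy whose input has no latent variable.
    `beta` is a hypothetical weighting coefficient.
    """
    return env_reward + beta * gaussian_kl(actual, counterfactual)


# Hypothetical one-dimensional action for a single agent.
actual_policy = ([1.0], [1.0])
counterfactual_policy = ([0.0], [1.0])
bonus_reward = shaped_reward(1.0, actual_policy, counterfactual_policy)
```

When the two policies coincide the bonus is zero, so under this assumed form an agent receives extra reward only when the latent variable changes its behavior.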
Pages: 1591-1604
Page count: 14