Scalable Multi-Agent Reinforcement Learning with General Utilities

被引:0
|
作者
Ying, Donghao [1 ]
Ding, Yuhao [1 ]
Koppel, Alec [2 ]
Lavaei, Javad [1 ]
机构
[1] Univ Calif Berkeley, Dept Ind Engn & Operat Res, Berkeley, CA 94720 USA
[2] JP Morgan AI Res, Brooklyn, NY USA
关键词
COMPLEXITY;
D O I
10.23919/ACC55779.2023.10156072
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
We study the scalable multi-agent reinforcement learning (MARL) with general utilities, defined as nonlinear functions of the team's long-term state-action occupancy measure. The objective is to find a localized policy that maximizes the average of the team's local utility functions without the full observability of each agent in the team. By exploiting the spatial correlation decay property of the network structure, we propose a scalable distributed policy gradient algorithm with shadow reward and localized policy that consists of three steps: (1) shadow reward estimation, (2) truncated shadow Q-function estimation, and (3) truncated policy gradient estimation and policy update. Our algorithm converges, with high probability, to epsilon-stationarity with O(epsilon(-2)) samples up to some approximation error that decreases exponentially in the communication radius. This is the first result in the literature on multi-agent RL with general utilities that does not require the full observability.
引用
收藏
页码:3977 / 3982
页数:6
相关论文
共 50 条
  • [41] Multi-agent Reinforcement Learning for Service Composition
    Lei, Yu
    Yu, Philip S.
    PROCEEDINGS 2016 IEEE INTERNATIONAL CONFERENCE ON SERVICES COMPUTING (SCC 2016), 2016, : 790 - 793
  • [42] Coding for Distributed Multi-Agent Reinforcement Learning
    Wang, Baoqian
    Xie, Junfei
    Atanasov, Nikolay
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 10625 - 10631
  • [43] Multi-agent reinforcement learning with adaptive mimetism
    Yamaguchi, T
    Miura, M
    Yachida, M
    ETFA '96 - 1996 IEEE CONFERENCE ON EMERGING TECHNOLOGIES AND FACTORY AUTOMATION, PROCEEDINGS, VOLS 1 AND 2, 1996, : 288 - 294
  • [44] Multi-agent Reinforcement Learning in Network Management
    Bagnasco, Ricardo
    Serrat, Joan
    SCALABILITY OF NETWORKS AND SERVICES, PROCEEDINGS, 2009, 5637 : 199 - 202
  • [45] HALFTONING WITH MULTI-AGENT DEEP REINFORCEMENT LEARNING
    Jiang, Haitian
    Xiong, Dongliang
    Jiang, Xiaowen
    Yin, Aiguo
    Ding, Li
    Huang, Kai
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 641 - 645
  • [46] Multi-Agent Reinforcement Learning with Reward Delays
    Zhang, Yuyang
    Zhang, Runyu
    Gu, Yuantao
    Li, Na
    LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211
  • [47] Deep reinforcement learning for multi-agent interaction
    Ahmed, Ibrahim H.
    Brewitt, Cillian
    Carlucho, Ignacio
    Christianos, Filippos
    Dunion, Mhairi
    Fosong, Elliot
    Garcin, Samuel
    Guo, Shangmin
    Gyevnar, Balint
    McInroe, Trevor
    Papoudakis, Georgios
    Rahman, Arrasy
    Schafer, Lukas
    Tamborski, Massimiliano
    Vecchio, Giuseppe
    Wang, Cheng
    Albrecht, Stefano, V
    AI COMMUNICATIONS, 2022, 35 (04) : 357 - 368
  • [48] Multi-agent reinforcement learning for intrusion detection
    Servin, Arturo
    Kudenko, Daniel
    ADAPTIVE AGENTS AND MULTI-AGENT SYSTEMS, 2008, 4865 : 211 - 223
  • [49] Quantum Multi-Agent Meta Reinforcement Learning
    Yun, Won Joon
    Park, Jihong
    Kim, Joongheon
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 9, 2023, : 11087 - 11095
  • [50] Multi-agent reinforcement learning: weighting and partitioning
    Sun, R
    Peterson, T
    NEURAL NETWORKS, 1999, 12 (4-5) : 727 - 753