Regret Bounds for Decentralized Learning in Cooperative Multi-Agent Dynamical Systems

Cited by: 0
Authors
Asghari, Seyed Mohammad [1]
Ouyang, Yi [2]
Nayyar, Ashutosh [1]
Affiliations
[1] Univ Southern Calif, Los Angeles, CA 90089 USA
[2] Preferred Networks Amer Inc, Tokyo, Japan
Keywords
MULTIARMED BANDIT
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Regret analysis is challenging in Multi-Agent Reinforcement Learning (MARL), primarily because of the dynamical environments and the decentralized information among agents. We address this challenge in the context of decentralized learning in multi-agent linear-quadratic (LQ) dynamical systems. We begin with a simple setup consisting of two agents and two dynamically decoupled stochastic linear systems, each controlled by one agent. The systems are coupled through a quadratic cost function. When both systems' dynamics are unknown and there is no communication between the agents, we show that no learning policy can achieve regret that is sublinear in $T$, where $T$ is the time horizon. When only one system's dynamics are unknown and there is one-directional communication from the agent controlling the unknown system to the other agent, we propose a MARL algorithm based on the construction of an auxiliary single-agent LQ problem. This auxiliary problem serves as an implicit coordination mechanism between the two learning agents, which allows them to achieve a regret within $O(\sqrt{T})$ of the regret of the auxiliary single-agent problem. Consequently, using existing results for single-agent LQ regret, our algorithm provides a $\tilde{O}(\sqrt{T})$ regret bound, where $\tilde{O}(\cdot)$ hides constants and logarithmic factors. Our numerical experiments indicate that this bound is matched in practice. From the two-agent problem, we extend our results to multi-agent LQ systems with certain communication patterns that appear in vehicle platoon control.
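As a reading aid, the two-agent setup described in the abstract can be sketched roughly as follows. All notation here (states $x^i_t$, inputs $u^i_t$, noise $w^i_t$, matrices $A_i$, $B_i$, $Q$, $R$, and the benchmark cost $J^*$) is assumed for illustration and does not come from this record; the paper's own definitions may differ.

```latex
% Two dynamically decoupled stochastic linear systems, one per agent (assumed notation)
x^{i}_{t+1} = A_i x^{i}_t + B_i u^{i}_t + w^{i}_t, \qquad i \in \{1, 2\}

% Coupling enters only through a joint quadratic cost on the stacked state and input
c_t = x_t^{\top} Q\, x_t + u_t^{\top} R\, u_t,
\qquad x_t = \begin{bmatrix} x^{1}_t \\ x^{2}_t \end{bmatrix},
\quad u_t = \begin{bmatrix} u^{1}_t \\ u^{2}_t \end{bmatrix}

% Regret over horizon T, measured against an assumed optimal average cost J^*
% attainable when the dynamics are known
\mathcal{R}(T) = \sum_{t=0}^{T-1} c_t \;-\; T J^{*}

% Form of the bound stated in the abstract for the one-directional communication case
\mathcal{R}(T) = \tilde{O}\!\left(\sqrt{T}\right)
```

In this sketch the dynamics are decoupled because each $x^i_{t+1}$ depends only on agent $i$'s own state and input, while $Q$ and $R$ need not be block diagonal, which is what ties the two agents' decisions together.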
Pages: 121-130
Page count: 10