Kernel Methods for Cooperative Multi-Agent Contextual Bandits

Authors
Dubey, Abhimanyu [1, 2]
Pentland, Alex [1, 2]
Affiliations
[1] MIT, Media Lab, Cambridge, MA 02139 USA
[2] MIT, Inst Data Syst & Soc, Cambridge, MA 02139 USA
Keywords
MULTIARMED BANDIT
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cooperative multi-agent decision making involves a group of agents cooperatively solving learning problems while communicating over a network with delays. In this paper, we consider the kernelised contextual bandit problem, where the reward obtained by an agent is an arbitrary linear function of the contexts' images in the related reproducing kernel Hilbert space (RKHS), and a group of agents must cooperate to collectively solve their unique decision problems. For this problem, we propose COOP-KERNELUCB, an algorithm that provides near-optimal bounds on the per-agent regret, and is both computationally and communicatively efficient. For special cases of the cooperative problem, we also provide variants of COOP-KERNELUCB that provide optimal per-agent regret. In addition, our algorithm generalizes several existing results in the multi-agent bandit setting. Finally, on a series of both synthetic and real-world multi-agent network benchmarks, we demonstrate that our algorithm significantly outperforms existing benchmarks.
Pages: 11