Kernel Methods for Cooperative Multi-Agent Contextual Bandits

Cited by: 0
Authors
Dubey, Abhimanyu [1, 2]
Pentland, Alex [1, 2]
Affiliations
[1] MIT, Media Lab, Cambridge, MA 02139 USA
[2] MIT, Inst Data Syst & Soc, Cambridge, MA 02139 USA
Keywords
MULTIARMED BANDIT;
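DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract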
Cooperative multi-agent decision making involves a group of agents cooperatively solving learning problems while communicating over a network with delays. In this paper, we consider the kernelised contextual bandit problem, where the reward obtained by an agent is an arbitrary linear function of the contexts' images in the related reproducing kernel Hilbert space (RKHS), and a group of agents must cooperate to collectively solve their unique decision problems. For this problem, we propose COOP-KERNELUCB, an algorithm that provides near-optimal bounds on the per-agent regret and is both computationally and communicatively efficient. For special cases of the cooperative problem, we also provide variants of COOP-KERNELUCB that provide optimal per-agent regret. In addition, our algorithm generalizes several existing results in the multi-agent bandit setting. Finally, on a series of both synthetic and real-world multi-agent network benchmarks, we demonstrate that our algorithm significantly outperforms existing benchmarks.
Pages: 11