Kernel Methods for Cooperative Multi-Agent Contextual Bandits

Cited by: 0

Authors:

Dubey, Abhimanyu [1,2]

Pentland, Alex [1,2]

Affiliations:

[1] MIT, Media Lab, Cambridge, MA 02139 USA

[2] MIT, Inst Data Syst & Soc, Cambridge, MA 02139 USA

Source:

25TH AMERICAS CONFERENCE ON INFORMATION SYSTEMS (AMCIS 2019) | 2019

Keywords:

MULTIARMED BANDIT;

DOI:

Not available

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Cooperative multi-agent decision making involves a group of agents cooperatively solving learning problems while communicating over a network with delays. In this paper, we consider the kernelised contextual bandit problem, where the reward obtained by an agent is an arbitrary linear function of the contexts' images in the related reproducing kernel Hilbert space (RKHS), and a group of agents must cooperate to collectively solve their unique decision problems. For this problem, we propose COOP-KERNELUCB, an algorithm that provides near-optimal bounds on the per-agent regret, and is both computationally and communicatively efficient. For special cases of the cooperative problem, we also provide variants of COOP-KERNELUCB that provide optimal per-agent regret. In addition, our algorithm generalizes several existing results in the multi-agent bandit setting. Finally, on a series of both synthetic and real-world multi-agent network benchmarks, we demonstrate that our algorithm significantly outperforms existing benchmarks.
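
To make the reward model in the abstract concrete, the following is a minimal single-agent sketch of a kernel upper-confidence-bound rule: a kernel ridge regression estimate of the reward plus an exploration bonus. It is not COOP-KERNELUCB and contains no cooperation or communication. The RBF kernel, the function names, and the parameters `lam`, `beta`, and `gamma` are all assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)


def kernel_ucb_scores(past_x, past_y, candidates, lam=1.0, beta=1.0, gamma=1.0):
    """Illustrative upper-confidence score for each candidate context.

    past_x: (n, d) contexts already played, past_y: (n,) observed rewards,
    candidates: (m, d) contexts available this round.
    lam (ridge regulariser) and beta (exploration weight) are assumed values.
    """
    if len(past_x) == 0:
        # No observations yet: zero mean, width from k(x, x) = 1 for the RBF kernel.
        return beta * np.sqrt(np.ones(len(candidates)) / lam)

    gram = rbf_kernel(past_x, past_x, gamma) + lam * np.eye(len(past_x))
    cross = rbf_kernel(past_x, candidates, gamma)        # shape (n, m)
    solved = np.linalg.solve(gram, cross)                # (K + lam I)^{-1} k_x
    mean = solved.T @ past_y                             # kernel ridge prediction
    var = (1.0 - np.sum(cross * solved, axis=0)) / lam   # uses k(x, x) = 1
    return mean + beta * np.sqrt(np.clip(var, 0.0, None))


if __name__ == "__main__":
    x_seen = np.array([[0.0]])
    y_seen = np.array([1.0])
    arms = np.array([[0.0], [10.0]])
    scores = kernel_ucb_scores(x_seen, y_seen, arms)
    print("chosen arm:", int(np.argmax(scores)))
```

An agent would play the candidate with the largest score, observe its reward, and append the pair to `past_x` and `past_y`. In the cooperative setting the abstract describes, agents would additionally exchange observations over a network with delays, which this sketch leaves out.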

Pages: 11
