Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits

Cited by: 0

Authors:

Chawla, Ronshee [1]

Vial, Daniel [1, 2]

Shakkottai, Sanjay [1]

Srikant, R. [2]

Affiliations:

[1] Univ Texas Austin, Chandra Family Dept Elect & Comp Engn, Austin, TX 78712 USA

[2] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL USA

Source:

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 202 | 2023 / Volume 202

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The study of collaborative multi-agent bandits has attracted significant attention recently. In light of this, we initiate the study of a new collaborative setting, consisting of N agents such that each agent is learning one of M stochastic multi-armed bandits to minimize their group cumulative regret. We develop decentralized algorithms which facilitate collaboration between the agents under two scenarios. We characterize the performance of these algorithms by deriving the per agent cumulative regret and group regret upper bounds. We also prove lower bounds for the group regret in this setting, which demonstrates the near-optimal behavior of the proposed algorithms.


Pages: 29
