Optimal Algorithms for Multiplayer Multi-Armed Bandits

Cited by: 0

Authors:

Wang, Po-An [1]

Proutiere, Alexandre [1]

Ariu, Kaito [1]

Jedra, Yassir [1]

Russo, Alessio [1]

Affiliations:

[1] Royal Inst Technol, KTH, Stockholm, Sweden

Source:

INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108 | 2020 / Vol. 108

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The paper addresses various Multiplayer Multi-Armed Bandit (MMAB) problems, where M decision-makers, or players, collaborate to maximize their cumulative reward. We first investigate the MMAB problem where players selecting the same arm experience a collision (and are aware of it) and do not collect any reward. For this problem, we present DPE1 (Decentralized Parsimonious Exploration), a decentralized algorithm that achieves the same asymptotic regret as that obtained by an optimal centralized algorithm. DPE1 is simpler than the state-of-the-art algorithm SIC-MMAB (Boursier and Perchet, 2019), and yet offers better performance guarantees. We then study the MMAB problem without collision, where players may select the same arm. Players sit on vertices of a graph, and in each round, they are able to send a message to their neighbours in the graph. We present DPE2, a simple and asymptotically optimal algorithm that outperforms the state-of-the-art algorithm DD-UCB (Martinez-Rubio et al., 2019). Moreover, under DPE2, the expected number of bits transmitted by the players in the graph is finite.
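The collision setting described in the abstract can be illustrated with a minimal sketch of a single round. This is not DPE1 or any algorithm from the paper; it only models the feedback rule stated above (colliding players know they collided and collect no reward). The function name `play_round`, the Bernoulli reward assumption, and the parameter names are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def play_round(choices, means, rng):
    """Simulate one round of the collision setting (illustrative only).

    choices[m] is the arm index picked by player m.
    means[k] is the mean of arm k; Bernoulli rewards are an assumption here.
    Returns (rewards, collided): one entry per player.
    """
    counts = Counter(choices)  # how many players picked each arm
    rewards, collided = [], []
    for arm in choices:
        hit = counts[arm] > 1  # collision: at least two players on this arm
        collided.append(hit)
        if hit:
            rewards.append(0)  # colliding players collect no reward
        else:
            rewards.append(1 if rng.random() < means[arm] else 0)
    return rewards, collided


# Players 0 and 1 both pick arm 0 and collide; player 2 is alone on arm 2.
rewards, collided = play_round([0, 0, 2], [1.0, 0.0, 1.0], random.Random(0))
```

With arm means of exactly 0 or 1, as in the usage lines, the outcome does not depend on the random seed, which makes the collision rule easy to check by hand.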

Pages: 9

50 items in total

[11] Quantum greedy algorithms for multi-armed bandits
Ohno, Hiroshi
QUANTUM INFORMATION PROCESSING, 2023, 22 (02)
[12] Multi-armed Bandits: Competing with Optimal Sequences
Anava, Oren
Karnin, Zohar
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
[13] SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits
Boursier, Etienne
Perchet, Vianney
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
[14] Fair Algorithms for Multi-Agent Multi-Armed Bandits
Hossain, Safwan
Micha, Evi
Shah, Nisarg
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
[15] Efficient algorithms for multi-armed bandits with additional feedbacks: Modeling and algorithms
Xie, Hong
Gu, Haoran
Qi, Zhi
INFORMATION SCIENCES, 2023, 633 : 453 - 468
[16] Regional Multi-Armed Bandits
Wang, Zhiyang
Zhou, Ruida
Shen, Cong
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84
[17] On Kernelized Multi-armed Bandits
Chowdhury, Sayak Ray
Gopalan, Aditya
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
[18] Multi-armed Bandits with Compensation
Wang, Siwei
Huang, Longbo
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
[19] Federated Multi-Armed Bandits
Shi, Chengshuai
Shen, Cong
THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 9603 - 9611
[20] Ballooning multi-armed bandits
Ghalme, Ganesh
Dhamal, Swapnil
Jain, Shweta
Gujar, Sujit
Narahari, Y.
ARTIFICIAL INTELLIGENCE, 2021, 296
