Optimal Algorithms for Multiplayer Multi-Armed Bandits

Authors
Wang, Po-An [1]
Proutiere, Alexandre [1]
Ariu, Kaito [1]
Jedra, Yassir [1]
Russo, Alessio [1]
Affiliations
[1] Royal Inst Technol, KTH, Stockholm, Sweden
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The paper addresses various Multiplayer Multi-Armed Bandit (MMAB) problems, where M decision-makers, or players, collaborate to maximize their cumulative reward. We first investigate the MMAB problem where players selecting the same arm experience a collision (and are aware of it) and do not collect any reward. For this problem, we present DPE1 (Decentralized Parsimonious Exploration), a decentralized algorithm that achieves the same asymptotic regret as that obtained by an optimal centralized algorithm. DPE1 is simpler than the state-of-the-art algorithm SIC-MMAB (Boursier and Perchet, 2019), and yet offers better performance guarantees. We then study the MMAB problem without collision, where players may select the same arm. Players sit on vertices of a graph, and in each round, they are able to send a message to their neighbours in the graph. We present DPE2, a simple and asymptotically optimal algorithm that outperforms the state-of-the-art algorithm DD-UCB (Martinez-Rubio et al., 2019). Moreover, under DPE2, the expected number of bits transmitted by the players in the graph is finite.
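The collision setting described in the abstract can be illustrated with a minimal sketch of a single round. This is not DPE1 or any algorithm from the paper; it only models the reward rule. The function name `play_round` and the choice of Bernoulli rewards are assumptions made for illustration.

```python
import random
from collections import Counter


def play_round(choices, means, rng):
    """Return per-player rewards for one round of the collision model.

    choices[i] is the arm index picked by player i.
    means[k] is the mean reward of arm k (Bernoulli rewards are an
    assumption of this sketch, not something the abstract states).
    """
    counts = Counter(choices)
    rewards = []
    for arm in choices:
        if counts[arm] > 1:
            # Collision: players sharing an arm collect no reward.
            rewards.append(0)
        else:
            # Sole occupant of the arm: draw a Bernoulli reward.
            rewards.append(1 if rng.random() < means[arm] else 0)
    return rewards


rng = random.Random(0)
# Players 0 and 1 collide on arm 0; player 2 is alone on arm 2.
print(play_round([0, 0, 2], [1.0, 0.0, 1.0], rng))  # [0, 0, 1]
```

With arm means of exactly 1.0 and 0.0 the draw is deterministic, so the example shows the collision rule in isolation: the two players on arm 0 earn nothing even though that arm always pays.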
Pages: 9
Related papers
50 in total
  • [11] Quantum greedy algorithms for multi-armed bandits
    Ohno, Hiroshi
    QUANTUM INFORMATION PROCESSING, 2023, 22 (02)
  • [12] Multi-armed Bandits: Competing with Optimal Sequences
    Anava, Oren
    Karnin, Zohar
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [13] SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits
    Boursier, Etienne
    Perchet, Vianney
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [14] Fair Algorithms for Multi-Agent Multi-Armed Bandits
    Hossain, Safwan
    Micha, Evi
    Shah, Nisarg
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [15] Efficient algorithms for multi-armed bandits with additional feedbacks: Modeling and algorithms
    Xie, Hong
    Gu, Haoran
    Qi, Zhi
    INFORMATION SCIENCES, 2023, 633 : 453 - 468
  • [16] Regional Multi-Armed Bandits
    Wang, Zhiyang
    Zhou, Ruida
    Shen, Cong
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84
  • [17] On Kernelized Multi-armed Bandits
    Chowdhury, Sayak Ray
    Gopalan, Aditya
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [18] Multi-armed Bandits with Compensation
    Wang, Siwei
    Huang, Longbo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [19] Federated Multi-Armed Bandits
    Shi, Chengshuai
    Shen, Cong
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 9603 - 9611
  • [20] Ballooning multi-armed bandits
    Ghalme, Ganesh
    Dhamal, Swapnil
    Jain, Shweta
    Gujar, Sujit
    Narahari, Y.
    ARTIFICIAL INTELLIGENCE, 2021, 296