Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits

Authors
Chawla, Ronshee [1]
Vial, Daniel [1, 2]
Shakkottai, Sanjay [1]
Srikant, R. [2]
Affiliations
[1] Univ Texas Austin, Chandra Family Dept Elect & Comp Engn, Austin, TX 78712 USA
[2] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Collaborative multi-agent bandits have attracted significant attention recently. We initiate the study of a new collaborative setting consisting of N agents, in which each agent learns one of M stochastic multi-armed bandits and the agents aim to minimize their group cumulative regret. We develop decentralized algorithms that facilitate collaboration between the agents under two scenarios. We characterize the performance of these algorithms by deriving upper bounds on the per-agent cumulative regret and on the group regret. We also prove lower bounds on the group regret in this setting, which demonstrate the near-optimal behavior of the proposed algorithms.
Pages: 29