Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits

Citations: 0
Authors
Chawla, Ronshee [1]
Vial, Daniel [1, 2]
Shakkottai, Sanjay [1]
Srikant, R. [2]
Affiliations
[1] Univ Texas Austin, Chandra Family Dept Elect & Comp Engn, Austin, TX 78712 USA
[2] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL USA
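Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract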
The study of collaborative multi-agent bandits has attracted significant attention recently. In light of this, we initiate the study of a new collaborative setting, consisting of N agents such that each agent is learning one of M stochastic multi-armed bandits to minimize their group cumulative regret. We develop decentralized algorithms which facilitate collaboration between the agents under two scenarios. We characterize the performance of these algorithms by deriving the per-agent cumulative regret and group regret upper bounds. We also prove lower bounds for the group regret in this setting, which demonstrate the near-optimal behavior of the proposed algorithms.
Pages: 29
Related papers
50 items in total
  • [41] Communication-Efficient Collaborative Regret Minimization in Multi-Armed Bandits
    Karpov, Nikolai
    Zhang, Qin
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 12, 2024, : 13076 - 13084
  • [42] Collaborative Multi-agent Stochastic Linear Bandits
    Moradipari, Ahmadreza
    Ghavamzadeh, Mohammad
    Alizadeh, Mahnoosh
    2022 AMERICAN CONTROL CONFERENCE, ACC, 2022, : 2761 - 2766
  • [43] Multi-Fidelity Multi-Armed Bandits Revisited
    Wang, Xuchuang
    Wu, Qingyun
    Chen, Wei
    Lui, John C. S.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [44] Multi-agent Heterogeneous Stochastic Linear Bandits
    Ghosh, Avishek
    Sankararaman, Abishek
    Ramchandran, Kannan
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT IV, 2023, 13716 : 300 - 316
  • [45] A Dynamic Observation Strategy for Multi-agent Multi-armed Bandit Problem
    Madhushani, Udari
    Leonard, Naomi Ehrich
    2020 EUROPEAN CONTROL CONFERENCE (ECC 2020), 2020, : 1677 - 1682
  • [46] LEVY BANDITS: MULTI-ARMED BANDITS DRIVEN BY LEVY PROCESSES
    Kaspi, Haya
    Mandelbaum, Avi
    ANNALS OF APPLIED PROBABILITY, 1995, 5 (02): : 541 - 565
  • [47] Successive Reduction of Arms in Multi-Armed Bandits
    Gupta, Neha
    Granmo, Ole-Christoffer
    Agrawala, Ashok
    RESEARCH AND DEVELOPMENT IN INTELLIGENT SYSTEMS XXVIII: INCORPORATING APPLICATIONS AND INNOVATIONS IN INTELLIGENT SYSTEMS XIX, 2011, : 181 - +
  • [48] Quantum greedy algorithms for multi-armed bandits
    Hiroshi Ohno
    Quantum Information Processing, 22
  • [49] Online Multi-Armed Bandits with Adaptive Inference
    Dimakopoulou, Maria
    Ren, Zhimei
    Zhou, Zhengyuan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [50] Multi-Armed Bandits for Adaptive Constraint Propagation
    Balafrej, Amine
    Bessiere, Christian
    Paparrizou, Anastasia
    PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), 2015, : 290 - 296