Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits

Authors
Chawla, Ronshee [1]
Vial, Daniel [1,2]
Shakkottai, Sanjay [1]
Srikant, R. [2]
Affiliations
[1] Univ Texas Austin, Chandra Family Dept Elect & Comp Engn, Austin, TX 78712 USA
[2] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Collaborative multi-agent bandits have attracted significant attention recently. We initiate the study of a new collaborative setting consisting of N agents, each learning one of M stochastic multi-armed bandits, with the goal of minimizing their group cumulative regret. We develop decentralized algorithms that facilitate collaboration between the agents under two scenarios. We characterize the performance of these algorithms by deriving upper bounds on the per-agent cumulative regret and the group regret. We also prove lower bounds on the group regret in this setting, which demonstrate the near-optimal behavior of the proposed algorithms.
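The setting in the abstract can be illustrated with a minimal sketch. It is not the paper's method: each agent here runs standard UCB1 on its own bandit with no collaboration at all, which only serves as a baseline for what "per-agent" and "group" cumulative regret measure. Bernoulli rewards, the arm means, the agent-to-bandit assignment, and the names ucb1_pseudo_regret and group_regret are all assumptions made for illustration.

```python
import math
import random


def ucb1_pseudo_regret(means, horizon, rng):
    """Run UCB1 on one Bernoulli bandit; return its cumulative pseudo-regret."""
    k = len(means)
    counts = [0] * k    # number of pulls per arm
    sums = [0.0] * k    # total observed reward per arm
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once before using the index
        else:
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]  # gap of the arm that was pulled
    return regret


def group_regret(bandits, assignment, horizon, seed=0):
    """Per-agent regrets and their sum when agents learn independently."""
    rng = random.Random(seed)
    per_agent = [
        ucb1_pseudo_regret(bandits[m], horizon, rng) for m in assignment
    ]
    return per_agent, sum(per_agent)


# Hypothetical instance: M = 2 bandits, N = 5 agents.
bandits = [[0.9, 0.5, 0.4], [0.2, 0.3, 0.8]]
assignment = [0, 0, 1, 1, 1]  # assignment[i] = index of the bandit agent i learns
per_agent, total = group_regret(bandits, assignment, horizon=2000)
print(per_agent, total)
```

In this baseline, agents assigned to the same bandit each pay the full exploration cost separately, so group regret grows roughly linearly in N. Collaboration of the kind the abstract describes aims to reduce that duplicated exploration.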
Pages: 29