An α-No-Regret Algorithm For Graphical Bilinear Bandits

Cited by: 0
Authors
Rizk, Geovani [1 ]
Colin, Igor [2 ]
Thomas, Albert [2 ]
Laraki, Rida [1 ]
Chevaleyre, Yann [1 ]
Affiliations
[1] Univ Paris 09, PSL, CNRS, LAMSADE, Paris, France
[2] Huawei Noah's Ark Lab, Paris, France
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose the first regret-based approach to the Graphical Bilinear Bandits problem, where n agents in a graph play a stochastic bilinear bandit game with each of their neighbors. This setting reveals a combinatorial NP-hard problem that prevents the use of any existing regret-based algorithm in the (bi-)linear bandit literature. In this paper, we fill this gap and present the first regret-based algorithm for graphical bilinear bandits using the principle of optimism in the face of uncertainty. Theoretical analysis of this new method yields an upper bound of $\tilde{O}(\sqrt{T})$ on the α-regret and evidences the impact of the graph structure on the rate of convergence. Finally, we show through various experiments the validity of our approach.
Pages: 11
Related papers
50 in total
  • [1] No-Regret Linear Bandits beyond Realizability
    Liu, Chong
    Yin, Ming
    Wang, Yu-Xiang
    [J]. UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 1294 - 1303
  • [2] No-Regret Algorithms for Heavy-Tailed Linear Bandits
    Medina, Andres Munoz
    Yang, Scott
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [3] No-regret learning for repeated concave games with lossy bandits
    Liu, Wenting
    Lei, Jinlong
    Yi, Peng
    [J]. 2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2021, : 936 - 941
  • [4] Breaking the Moments Condition Barrier: No-Regret Algorithm for Bandits with Super Heavy-Tailed Payoffs
    Zhong, Han
    Huang, Jiayi
    Yang, Lin F.
    Wang, Liwei
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [5] Best Arm Identification in Graphical Bilinear Bandits
    Rizk, Geovani
    Thomas, Albert
    Colin, Igor
    Laraki, Rida
    Chevaleyre, Yann
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [6] No-regret learning for repeated non-cooperative games with lossy bandits
    Liu, Wenting
    Lei, Jinlong
    Yi, Peng
    Hong, Yiguang
    [J]. AUTOMATICA, 2024, 160
  • [7] Memory-Constrained No-Regret Learning in Adversarial Multi-Armed Bandits
    Xu, Xiao
    Zhao, Qing
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2021, 69 : 2371 - 2382
  • [8] No-regret boosting
    Gambin, Anna
    Szczurek, Ewa
    [J]. ADAPTIVE AND NATURAL COMPUTING ALGORITHMS, PT 1, 2007, 4431 : 422 - +
  • [9] Improved Regret Bounds of Bilinear Bandits using Action Space Analysis
    Jang, Kyoungseok
    Jun, Kwang-Sung
    Yun, Se-Young
    Kang, Wanmo
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [10] Differentially Private Algorithm for Graphical Bandits
    Lu S.-Y.
    Wang G.-H.
    Qiu Z.-H.
    Zhang L.-J.
    [J]. Ruan Jian Xue Bao/Journal of Software, 2022, 33 (09):