No-regret learning for repeated concave games with lossy bandits

Cited by: 2
Authors
Liu, Wenting [1]
Lei, Jinlong [1,2]
Yi, Peng [1,2]
Affiliations
[1] Tongji Univ, Dept Control Sci & Engn, Shanghai 201804, Peoples R China
[2] Tongji Univ, Inst Adv Study, Shanghai 201804, Peoples R China
Keywords
ALGORITHMS
DOI
10.1109/CDC45484.2021.9683166
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper considers no-regret learning for repeated continuous-kernel games with lossy bandit information. At each round, each player plays an action obtained by perturbing its intended action and observes the utility value at the resulting action profile. However, due to various uncertainties or high inquiry costs, this bandit feedback may be lost at random. We therefore study an asynchronous learning strategy in which each player adaptively adjusts its next action to minimize its long-term regret relative to the best fixed action in hindsight. The paper proposes a novel no-regret learning algorithm, called Reweighted Online Gradient Descent with bandit (ROGD-b). We first give the regret analysis for continuous concave games with differentiable and Lipschitz utilities. Furthermore, we show that the action profile converges to a Nash equilibrium with probability 1 when the game is strictly monotone. Numerical experiments illustrate the performance of the algorithm.
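The abstract does not state the ROGD-b update itself, so the following is only a minimal sketch of the general idea it describes: a one-point bandit gradient estimate that is reweighted by the probability of receiving feedback. Everything specific here is an assumption made for illustration, not taken from the paper: the two-player toy game (utility u_i = -x_i^2 + x_i(1 - 0.5 x_j) on [0, 1], whose Nash equilibrium is x_1 = x_2 = 0.4), the known per-player feedback probabilities `p`, the step size 1/t, the perturbation radius t^(-1/3), and all function names.

```python
import random


def project(x, lo, hi):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, x))


def reweighted_estimate(payoff, z, delta, received, p):
    """One-point gradient estimate, reweighted by the feedback probability p.

    Returns 0 when the feedback is lost. Dividing by p when it arrives keeps
    the estimate equal, in expectation over the loss event, to payoff*z/delta.
    """
    if not received:
        return 0.0
    return payoff * z / (delta * p)


def utility(i, actions):
    """Hypothetical toy utility, concave in the player's own action."""
    own, other = actions[i], actions[1 - i]
    return -own * own + own * (1.0 - 0.5 * other)


def run(rounds=20000, p=(0.7, 0.9), seed=0):
    """Simulate both players; p[i] is player i's chance of receiving feedback."""
    rng = random.Random(seed)
    x = [0.5, 0.5]  # intended actions
    for t in range(1, rounds + 1):
        step = 1.0 / t                       # assumed step-size schedule
        delta = min(0.25, t ** (-1.0 / 3))   # assumed perturbation radius
        z = [rng.choice((-1, 1)) for _ in x]
        # Each player plays a perturbed version of its intended action.
        played = [x[i] + delta * z[i] for i in range(2)]
        for i in range(2):
            received = rng.random() < p[i]   # feedback is lost at random
            g = reweighted_estimate(utility(i, played), z[i], delta, received, p[i])
            # Gradient ascent on the utility, kept inside the shrunken set so
            # that the next perturbed action stays in [0, 1].
            x[i] = project(x[i] + step * g, delta, 1.0 - delta)
    return x


if __name__ == "__main__":
    print(run())
```

Running the script prints the final intended actions, which can be compared against the equilibrium value 0.4 of this toy game. One-point estimates are noisy, so the result for any single seed may still sit some distance from it.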
Pages: 936-941
Page count: 6
Related papers
50 in total
  • [1] No-regret learning for repeated non-cooperative games with lossy bandits
    Liu, Wenting
    Lei, Jinlong
    Yi, Peng
    Hong, Yiguang
    [J]. AUTOMATICA, 2024, 160
  • [2] No-Regret Learning in Bayesian Games
    Hartline, Jason
    Syrgkanis, Vasilis
    Tardos, Eva
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [3] Limits and limitations of no-regret learning in games
    Monnot, Barnabe
    Piliouras, Georgios
    [J]. KNOWLEDGE ENGINEERING REVIEW, 2017, 32
  • [4] No-Regret Learning in Dynamic Stackelberg Games
    Lauffer, Niklas
    Ghasemi, Mahsa
    Hashemi, Abolfazl
    Savas, Yagiz
    Topcu, Ufuk
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2024, 69 (03) : 1418 - 1431
  • [5] Doubly Optimal No-Regret Learning in Monotone Games
    Cai, Yang
    Zheng, Weiqiang
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 202, 2023, 202
  • [6] An α-No-Regret Algorithm For Graphical Bilinear Bandits
    Rizk, Geovani
    Colin, Igor
    Thomas, Albert
    Laraki, Rida
    Chevaleyre, Yann
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [7] No-Regret Learning in Unknown Games with Correlated Payoffs
    Sessa, Pier Giuseppe
    Bogunovic, Ilija
    Kamgarpour, Maryam
    Krause, Andreas
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [8] No-Regret Linear Bandits beyond Realizability
    Liu, Chong
    Yin, Ming
    Wang, Yu-Xiang
    [J]. UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 1294 - 1303
  • [9] Adaptive, Doubly Optimal No-Regret Learning in Strongly Monotone and Exp-Concave Games with Gradient Feedback
    Jordan, Michael
    Lin, Tianyi
    Zhou, Zhengyuan
    [J]. OPERATIONS RESEARCH, 2024
  • [10] Near-Optimal No-Regret Learning in General Games
    Daskalakis, Constantinos
    Fishelson, Maxwell
    Golowich, Noah
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34