Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information

被引:0
|
作者
Auer, Peter [1 ]
Chen, Yifang [2 ]
Gajane, Pratik [1 ]
Lee, Chung-Wei [2 ]
Luo, Haipeng [2 ]
Ortner, Ronald [1 ]
Wei, Chen-Yu [2 ]
机构
[1] Montan Univ Leoben, Leoben, Austria
[2] Univ Southern Calif, Los Angeles, CA 90007 USA
来源
基金
奥地利科学基金会;
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
This joint extended abstract introduces and compares the results of (Auer et al., 2019) and (Chen et al., 2019), both of which resolve the problem of achieving optimal dynamic regret for nonstationary bandits without prior information on the non-stationarity. Specifically, Auer et al. (2019) resolve the problem for the traditional multi-armed bandits setting, while Chen et al. (2019) give a solution for the more general contextual bandits setting. Both works extend the key idea of (Auer et al., 2018) developed for a simpler two-armed setting.
引用
收藏
页数:5
相关论文
共 50 条
  • [1] Optimal and Efficient Dynamic Regret Algorithms for Non-Stationary Dueling Bandits
    Saha, Aadirupa
    Gupta, Shubham
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022, : 19027 - 19049
  • [2] A New Look at Dynamic Regret for Non-Stationary Stochastic Bandits
    Abbasi-Yadkori, Yasin
    Gyorgy, Andraes
    Lazic, Nevena
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [3] Some algorithms for correlated bandits with non-stationary rewards : Regret bounds and applications
    Mayekar, Prathamesh
    Hemachandra, Nandyala
    [J]. PROCEEDINGS OF THE THIRD ACM IKDD CONFERENCE ON DATA SCIENCES (CODS), 2016,
  • [4] Non-Stationary Bandits under Recharging Payoffs: Improved Planning with Sublinear Regret
    Papadigenopoulos, Orestis
    Caramanis, Constantine
    Shakkottai, Sanjay
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [5] Non-stationary Bandits with Knapsacks
    Liu, Shang
    Jiang, Jiashuo
    Li, Xiaocheng
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [6] Unifying Clustered and Non-stationary Bandits
    Li, Chuanhao
    Wu, Qingyun
    Wang, Hongning
    [J]. 24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [7] Dynamic Regret Minimization for Control of Non-stationary Linear Dynamical Systems
    Luo, Yuwei
    Gupta, Varun
    Kolar, Mladen
    [J]. PROCEEDINGS OF THE ACM ON MEASUREMENT AND ANALYSIS OF COMPUTING SYSTEMS, 2022, 6 (01)
  • [8] Cascading Non-Stationary Bandits: Online Learning to Rank in the Non-Stationary Cascade Model
    Li, Chang
    de Rijke, Maarten
    [J]. PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 2859 - 2865
  • [9] Competing Bandits in Non-Stationary Matching Markets
    Ghosh, Avishek
    Sankararaman, Abishek
    Ramchandran, Kannan
    Javidi, Tara
    Mazumdar, Arya
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 2024, 70 (04) : 2831 - 2850
  • [10] Weighted Linear Bandits for Non-Stationary Environments
    Russac, Yoan
    Vernade, Claire
    Cappe, Olivier
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32