Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information

Cited by: 0

Authors:

Auer, Peter ^{[1]}

Chen, Yifang ^{[2]}

Gajane, Pratik ^{[1]}

Lee, Chung-Wei ^{[2]}

Luo, Haipeng ^{[2]}

Ortner, Ronald ^{[1]}

Wei, Chen-Yu ^{[2]}

Affiliations:

[1] Montan Univ Leoben, Leoben, Austria

[2] Univ Southern Calif, Los Angeles, CA 90007 USA

Source:

CONFERENCE ON LEARNING THEORY, VOL 99 | 2019 / Vol. 99

Funding:

Austrian Science Fund;

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This joint extended abstract introduces and compares the results of (Auer et al., 2019) and (Chen et al., 2019), both of which resolve the problem of achieving optimal dynamic regret for non-stationary bandits without prior information on the non-stationarity. Specifically, Auer et al. (2019) resolve the problem for the traditional multi-armed bandits setting, while Chen et al. (2019) give a solution for the more general contextual bandits setting. Both works extend the key idea of (Auer et al., 2018), developed for a simpler two-armed setting.

Pages: 5

50 items in total

[21] Time-Decaying Bandits for Non-stationary Systems
Komiyama, Junpei
Qin, Tao
WEB AND INTERNET ECONOMICS, 2014, 8877 : 460 - 466
[22] Beam Alignment for mmWave Using Non-Stationary Bandits
Gupta, Ruchir
Lakshmanan, K.
Sah, Abhay Kumar
IEEE COMMUNICATIONS LETTERS, 2020, 24 (11) : 2619 - 2622
[23] Non-Stationary Representation Learning in Sequential Linear Bandits
Qin, Yuzhen
Menara, Tommaso
Oymak, Samet
Ching, Shinung
Pasqualetti, Fabio
IEEE OPEN JOURNAL OF CONTROL SYSTEMS, 2022, 1 : 41 - 56
[24] Randomized Exploration for Non-Stationary Stochastic Linear Bandits
Kim, Baekjin
Tewari, Ambuj
CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI 2020), 2020, 124 : 71 - 80
[25] Non-stationary Dueling Bandits for Online Learning to Rank
Lu, Shiyin
Miao, Yuan
Yang, Ping
Hu, Yao
Zhang, Lijun
WEB AND BIG DATA, PT II, APWEB-WAIM 2022, 2023, 13422 : 166 - 174
[26] Reward Attack on Stochastic Bandits with Non-stationary Rewards
Yang, Chenye
Liu, Guanlin
Lai, Lifeng
FIFTY-SEVENTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, IEEECONF, 2023, : 1387 - 1393
[27] Non-stationary Projection-Free Online Learning with Dynamic and Adaptive Regret Guarantees
Wang, Yibo
Yang, Wenhao
Jiang, Wei
Lu, Shiyin
Wang, Bing
Tang, Haihong
Wan, Yuanyu
Zhang, Lijun
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 14, 2024, : 15671 - 15679
[28] Non-stationary Risk-Sensitive Reinforcement Learning: Near-Optimal Dynamic Regret, Adaptive Detection, and Separation Design
Ding, Yuhao
Jin, Ming
Lavaei, Javad
THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 6, 2023, : 7405 - 7413
[29] Non-Stationary Bandits with Auto-Regressive Temporal Dependency
Chen, Qinyi
Golrezaei, Negin
Bouneffouf, Djallel
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
[30] Stochastic Bandits With Non-Stationary Rewards: Reward Attack and Defense
Yang, Chenye
Liu, Guanlin
Lai, Lifeng
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2024, 72 : 5007 - 5020