Near-optimal Per-Action Regret Bounds for Sleeping Bandits

Cited by: 0

Authors:

Quan Nguyen [1]

Nishant A. Mehta [1]

Affiliations:

[1] Univ Victoria, Dept Comp Sci, Victoria, BC, Canada

Source:

INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238 | 2024 / Vol. 238

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC)

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We derive near-optimal per-action regret bounds for sleeping bandits, in which both the sets of available arms and their losses in every round are chosen by an adversary. In a setting with $K$ total arms and at most $A$ available arms in each round over $T$ rounds, the best known upper bound is $O(K\sqrt{TA \ln K})$, obtained indirectly via minimizing internal sleeping regrets. Compared to the minimax $\Omega(\sqrt{TA})$ lower bound, this upper bound contains an extra multiplicative factor of $K \ln K$. We address this gap by directly minimizing the per-action regret using generalized versions of EXP3, EXP3-IX and FTRL with Tsallis entropy, thereby obtaining near-optimal bounds of order $O(\sqrt{TA \ln K})$ and $O(\sqrt{T\sqrt{AK}})$. We extend our results to the setting of bandits with advice from sleeping experts, generalizing EXP4 along the way. This leads to new proofs for a number of existing adaptive and tracking regret bounds for standard non-sleeping bandits. Extending our results to the bandit version of experts that report their confidences leads to new bounds for the confidence regret that depend primarily on the sum of experts' confidences. We prove a lower bound showing that, for any minimax optimal algorithm, there exists an action whose regret is sublinear in $T$ but linear in the number of its active rounds.


Pages: 36
