Combinatorial Semi-Bandits with Knapsacks

Cited by: 0

Authors:

Sankararaman, Karthik Abinav ^{[1]}

Slivkins, Aleksandrs ^{[2]}

Affiliations:

[1] Univ Maryland, College Pk, MD 20742 USA

[2] Microsoft Res NYC, New York, NY USA

Source:

INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84 | 2018 / Vol. 84

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We unify two prominent lines of work on multi-armed bandits: bandits with knapsacks and combinatorial semi-bandits. The former concerns limited "resources" consumed by the algorithm, e.g., limited supply in dynamic pricing. The latter allows a huge number of actions but assumes combinatorial structure and additional feedback to make the problem tractable. We define a common generalization, support it with several motivating examples, and design an algorithm for it. Our regret bounds are comparable with those for BwK and combinatorial semi-bandits.

Citations

Pages: 11

50 entries in total

[1] Thompson Sampling for Combinatorial Semi-Bandits
Wang, Siwei
Chen, Wei
[J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
[2] (Locally) Differentially Private Combinatorial Semi-Bandits
Chen, Xiaoyu
Zheng, Kai
Zhou, Zixin
Yang, Yunchang
Chen, Wei
Wang, Liwei
[J]. 25TH AMERICAS CONFERENCE ON INFORMATION SYSTEMS (AMCIS 2019), 2019
[3] (Locally) Differentially Private Combinatorial Semi-Bandits
Chen, Xiaoyu
Zheng, Kai
Zhou, Zixin
Yang, Yunchang
Chen, Wei
Wang, Liwei
[J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
[4] Hybrid Regret Bounds for Combinatorial Semi-Bandits and Adversarial Linear Bandits
Ito, Shinji
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
[5] Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits
Perrault, Pierre
Boursier, Etienne
Perchet, Vianney
Valko, Michal
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
[6] Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits
Kveton, Branislav
Wen, Zheng
Ashkan, Azin
Szepesvari, Csaba
[J]. ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 38, 2015, 38 : 535 - 543
[7] Matching with semi-bandits
Kasy, Maximilian
Teytelboym, Alexander
[J]. ECONOMETRICS JOURNAL, 2023, 26 (01): 45 - 66
[8] Efficient Learning in Large-Scale Combinatorial Semi-Bandits
Wen, Zheng
Kveton, Branislav
Ashkan, Azin
[J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37, 2015, 37 : 1113 - 1122
[9] Asymptotically Optimal Strategies For Combinatorial Semi-Bandits in Polynomial Time
Cuvelier, Thibaut
Combes, Richard
Gourdin, Eric
[J]. ALGORITHMIC LEARNING THEORY, VOL 132, 2021, 132
[10] A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
van der Hoeven, Dirk
Zierahn, Lukas
Lancewicki, Tal
Rosenberg, Aviv
Cesa-Bianchi, Nicolo
[J]. THIRTY SIXTH ANNUAL CONFERENCE ON LEARNING THEORY, VOL 195, 2023, 195
