Combinatorial bandits

Cited by: 208

Authors:

Cesa-Bianchi, Nicolo ^{[1]}

Lugosi, Gabor ^{[2,3]}

Affiliations:

[1] Univ Milan, I-20122 Milan, Italy

[2] ICREA, Barcelona, Spain

[3] Pompeu Fabra Univ, Barcelona, Spain

Source:

JOURNAL OF COMPUTER AND SYSTEM SCIENCES | 2012 / Vol. 78 / No. 5

Keywords:

Online prediction; Adversarial bandit problems; Online linear optimization; ALGORITHMS;

DOI:

10.1016/j.jcss.2012.01.001

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

We study sequential prediction problems in which, at each time instance, the forecaster chooses a vector from a given finite set $S \subseteq \mathbb{R}^d$. At the same time, the opponent chooses a "loss" vector in $\mathbb{R}^d$ and the forecaster suffers a loss that is the inner product of the two vectors. The goal of the forecaster is to achieve that, in the long run, the accumulated loss is not much larger than that of the best possible element in $S$. We consider the "bandit" setting in which the forecaster only has access to the losses of the chosen vectors (i.e., the entire loss vectors are not observed). We introduce a variant of a strategy by Dani, Hayes and Kakade achieving a regret bound that, for a variety of concrete choices of $S$, is of order $\sqrt{nd \ln |S|}$, where $n$ is the time horizon. This is not improvable in general and is better than previously known bounds. The examples we consider are all such that $S \subseteq \{0,1\}^d$, and we show how the combinatorial structure of these classes can be exploited to improve the regret bounds. We also point out computationally efficient implementations for various interesting choices of $S$. © 2012 Elsevier Inc. All rights reserved.
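The protocol described in the abstract can be illustrated with a minimal sketch. It is not the authors' exact strategy or tuning: it assumes an exponentially weighted forecaster over the elements of S, mixed with uniform exploration, and a loss-vector estimate built from the pseudo-inverse of the correlation matrix of the sampled action. The function name `comband_sketch` and the parameters `eta` (learning rate) and `gamma` (exploration rate) are hypothetical choices made for illustration.

```python
import numpy as np

def comband_sketch(S, loss_vectors, eta=0.1, gamma=0.1, seed=0):
    """Play a bandit game over a finite action set S (rows of an N x d array).

    loss_vectors is an n x d array chosen by the opponent; the forecaster
    only ever sees the scalar inner product for the action it played.
    Returns (cumulative loss, regret against the best fixed action).
    """
    rng = np.random.default_rng(seed)
    S = np.asarray(S, dtype=float)
    loss_vectors = np.asarray(loss_vectors, dtype=float)
    N, d = S.shape
    log_w = np.zeros(N)  # log-weights, kept in log space for stability
    total_loss = 0.0

    for ell in loss_vectors:
        # Exponential weights mixed with uniform exploration (an assumption).
        q = np.exp(log_w - log_w.max())
        q /= q.sum()
        p = (1.0 - gamma) * q + gamma / N

        # Sample an action; only its scalar loss is revealed (bandit feedback).
        k = rng.choice(N, p=p)
        v = S[k]
        observed = float(v @ ell)
        total_loss += observed

        # Estimate the full loss vector from the single observation,
        # using the correlation matrix P = E[V V^T] under p.
        P = S.T @ (p[:, None] * S)
        ell_hat = observed * (np.linalg.pinv(P) @ v)

        # Penalize every action by its estimated loss.
        log_w -= eta * (S @ ell_hat)

    best_fixed = (S @ loss_vectors.sum(axis=0)).min()
    return total_loss, total_loss - best_fixed
```

With `S = np.eye(3)` the sketch reduces to a three-armed bandit; richer subsets of {0,1}^d (paths, matchings, spanning trees) fit the same interface, although enumerating S explicitly as done here is only feasible for small sets.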

Pages: 1404-1422

Page count: 19

50 items in total

[1] Combinatorial Cascading Bandits
Kveton, Branislav
Wen, Zheng
Ashkan, Azin
Szepesvari, Csaba
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
[2] Combinatorial Bandits Revisited
Combes, Richard
Talebi, M. Sadegh
Proutiere, Alexandre
Lelarge, Marc
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
[3] Combinatorial Causal Bandits
Feng, Shi
Chen, Wei
[J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 6, 2023, : 7550 - 7558
[4] Contextual Combinatorial Cascading Bandits
Li, Shuai
Wang, Baoxiang
Zhang, Shengyu
Chen, Wei
[J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
[5] Combinatorial Bandits with Relative Feedback
Saha, Aadirupa
Gopalan, Aditya
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
[6] Combinatorial Pure Exploration for Dueling Bandits
Chen, Wei
Du, Yihan
Huang, Longbo
Zhao, Haoyu
[J]. 25TH AMERICAS CONFERENCE ON INFORMATION SYSTEMS (AMCIS 2019), 2019,
[7] Combinatorial Bandits under Strategic Manipulations
Dong, Jing
Li, Ke
Li, Shuai
Wang, Baoxiang
[J]. WSDM'22: PROCEEDINGS OF THE FIFTEENTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, 2022, : 219 - 229
[8] Combinatorial Semi-Bandits with Knapsacks
Sankararaman, Karthik Abinav
Slivkins, Aleksandrs
[J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84
[9] Combinatorial Pure Exploration for Dueling Bandits
Chen, Wei
Du, Yihan
Huang, Longbo
Zhao, Haoyu
[J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
[10] Combinatorial Sleeping Bandits With Fairness Constraints
Li, Fengjiao
Liu, Jia
Ji, Bo
[J]. IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2020, 7 (03): : 1799 - 1813
