Combinatorial bandits

Cited by: 208
Authors
Cesa-Bianchi, Nicolo [1]
Lugosi, Gabor [2, 3]
Affiliations
[1] Univ Milan, I-20122 Milan, Italy
[2] ICREA, Barcelona, Spain
[3] Pompeu Fabra Univ, Barcelona, Spain
Keywords
Online prediction; Adversarial bandit problems; Online linear optimization; Algorithms
DOI
10.1016/j.jcss.2012.01.001
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We study sequential prediction problems in which, at each time instance, the forecaster chooses a vector from a given finite set $S \subseteq \mathbb{R}^d$. At the same time, the opponent chooses a "loss" vector in $\mathbb{R}^d$ and the forecaster suffers a loss that is the inner product of the two vectors. The goal of the forecaster is to achieve that, in the long run, the accumulated loss is not much larger than that of the best possible element in $S$. We consider the "bandit" setting in which the forecaster only has access to the losses of the chosen vectors (i.e., the entire loss vectors are not observed). We introduce a variant of a strategy by Dani, Hayes and Kakade achieving a regret bound that, for a variety of concrete choices of $S$, is of order $\sqrt{nd \ln |S|}$, where $n$ is the time horizon. This is not improvable in general and is better than previously known bounds. The examples we consider are all such that $S \subseteq \{0,1\}^d$, and we show how the combinatorial structure of these classes can be exploited to improve the regret bounds. We also point out computationally efficient implementations for various interesting choices of $S$. © 2012 Elsevier Inc. All rights reserved.
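The protocol described in the abstract can be illustrated with a minimal Python sketch. It is not the strategy analyzed in the paper: the forecaster below simply picks an element of S uniformly at random, and the loss vectors are drawn at random rather than chosen by an adversary. The names `m_subsets`, `play`, and `uniform_forecaster`, and the choice of S as all binary vectors with exactly m ones, are assumptions made for illustration only.

```python
import itertools
import random


def m_subsets(d, m):
    """All binary vectors in {0,1}^d with exactly m ones (a hypothetical choice of S)."""
    return [tuple(1 if i in combo else 0 for i in range(d))
            for combo in itertools.combinations(range(d), m)]


def inner(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def play(S, loss_vectors, choose, rng):
    """Run the bandit protocol; return (forecaster's total loss, regret).

    Regret is measured against the best fixed element of S in hindsight.
    """
    total = 0.0
    history = []  # the forecaster only ever sees (chosen vector, scalar loss)
    for loss in loss_vectors:
        v = choose(S, history, rng)
        observed = inner(v, loss)  # bandit feedback: one number, not the vector
        history.append((v, observed))
        total += observed
    best_fixed = min(sum(inner(v, loss) for loss in loss_vectors) for v in S)
    return total, total - best_fixed


def uniform_forecaster(S, history, rng):
    """Placeholder strategy (assumption): ignore feedback, pick uniformly from S."""
    return rng.choice(S)


rng = random.Random(0)
d, m, n = 5, 2, 200
S = m_subsets(d, m)
loss_vectors = [[rng.random() for _ in range(d)] for _ in range(n)]
total, regret = play(S, loss_vectors, uniform_forecaster, rng)
print(f"|S| = {len(S)}, total loss = {total:.2f}, regret = {regret:.2f}")
```

A strategy with a guarantee of the kind stated in the abstract would replace `uniform_forecaster` and use `history` to bias its choices; the sketch only fixes the interface and the definition of regret.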
Pages: 1404-1422
Page count: 19
Related papers
50 in total
  • [1] Combinatorial Cascading Bandits
    Kveton, Branislav
    Wen, Zheng
    Ashkan, Azin
    Szepesvari, Csaba
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [2] Combinatorial Bandits Revisited
    Combes, Richard
    Talebi, M. Sadegh
    Proutiere, Alexandre
    Lelarge, Marc
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [3] Combinatorial Causal Bandits
    Feng, Shi
    Chen, Wei
    [J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 6, 2023, : 7550 - 7558
  • [4] Contextual Combinatorial Cascading Bandits
    Li, Shuai
    Wang, Baoxiang
    Zhang, Shengyu
    Chen, Wei
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [5] Combinatorial Bandits with Relative Feedback
    Saha, Aadirupa
    Gopalan, Aditya
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [6] Combinatorial Pure Exploration for Dueling Bandits
    Chen, Wei
    Du, Yihan
    Huang, Longbo
    Zhao, Haoyu
    [J]. 25TH AMERICAS CONFERENCE ON INFORMATION SYSTEMS (AMCIS 2019), 2019,
  • [7] Combinatorial Bandits under Strategic Manipulations
    Dong, Jing
    Li, Ke
    Li, Shuai
    Wang, Baoxiang
    [J]. WSDM'22: PROCEEDINGS OF THE FIFTEENTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, 2022, : 219 - 229
  • [8] Combinatorial Semi-Bandits with Knapsacks
    Sankararaman, Karthik Abinav
    Slivkins, Aleksandrs
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84
  • [9] Combinatorial Pure Exploration for Dueling Bandits
    Chen, Wei
    Du, Yihan
    Huang, Longbo
    Zhao, Haoyu
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [10] Combinatorial Sleeping Bandits With Fairness Constraints
    Li, Fengjiao
    Liu, Jia
    Ji, Bo
    [J]. IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2020, 7 (03): : 1799 - 1813