Combinatorial Bandits under Strategic Manipulations

Cited by: 1
Authors
Dong, Jing [1]
Li, Ke [1]
Li, Shuai [2]
Wang, Baoxiang [1]
Affiliations
[1] Chinese University of Hong Kong, Shenzhen, People's Republic of China
[2] Shanghai Jiao Tong University, Shanghai, People's Republic of China
Funding
National Natural Science Foundation of China;
Keywords
multi-armed bandits; strategic manipulations; crowdsourcing; online information maximization; recommendation systems;
DOI
10.1145/3488560.3498413
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Strategic behavior against sequential learning methods, such as "click framing" in real recommendation systems, has been widely observed. Motivated by such behavior, we study the problem of combinatorial multi-armed bandits (CMAB) under strategic manipulations of rewards, where each arm can modify the emitted reward signals for its own interest. This characterization of the adversarial behavior is a relaxation of previously well-studied settings such as adversarial attacks and adversarial corruption. We propose a strategic variant of the combinatorial UCB algorithm, which has a regret of at most $O(m \log T + m B_{\max})$ under strategic manipulations, where $T$ is the time horizon, $m$ is the number of arms, and $B_{\max}$ is the maximum budget of an arm. We also provide lower bounds on the budget that arms need in order to inflict a given amount of regret on the bandit algorithm. Extensive experiments on online worker selection for crowdsourcing systems, online influence maximization, and online recommendations, with both synthetic and real datasets, corroborate our theoretical findings on robustness and regret bounds in a variety of regimes of manipulation budgets.
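To make the setting in the abstract concrete, here is a minimal Python sketch of a combinatorial UCB loop in which arms can inflate their observed rewards up to a budget. It is not the authors' implementation. The budget term `b_max / pulls` in the index, the top-k oracle, the Bernoulli rewards, the greedy "spend the budget as early as possible" manipulation, and all function names are assumptions made for illustration; the abstract states only that a strategic variant of combinatorial UCB is used.

```python
import math
import random


def ucb_index(mean, pulls, t, b_max):
    """Empirical mean + exploration bonus + budget term.

    The b_max / pulls term is an assumption of this sketch: it widens the
    confidence interval by the largest average shift that an arm with
    budget b_max could have caused over `pulls` observations.
    """
    if pulls == 0:
        return float("inf")  # make sure every arm is tried at least once
    bonus = math.sqrt(3.0 * math.log(t) / (2.0 * pulls))
    return mean + bonus + b_max / pulls


def top_k_oracle(indices, k):
    """Hypothetical offline oracle: the k arms with the largest index."""
    order = sorted(range(len(indices)), key=lambda i: indices[i], reverse=True)
    return sorted(order[:k])


def run(true_means, budgets, k, horizon, b_max, seed=0):
    """Simulate the loop and return how often each arm was selected."""
    rng = random.Random(seed)
    m = len(true_means)
    pulls = [0] * m
    means = [0.0] * m
    remaining = list(budgets)  # manipulation budget left for each arm
    for t in range(1, horizon + 1):
        indices = [ucb_index(means[i], pulls[i], t, b_max) for i in range(m)]
        for i in top_k_oracle(indices, k):
            reward = 1.0 if rng.random() < true_means[i] else 0.0
            # Assumed strategy: inflate the signal while budget remains.
            boost = min(remaining[i], 1.0)
            remaining[i] -= boost
            pulls[i] += 1
            # Incremental update of the (manipulated) empirical mean.
            means[i] += (reward + boost - means[i]) / pulls[i]
    return pulls


if __name__ == "__main__":
    # Arms 2 and 3 are weak but each holds a manipulation budget of 20.
    print(run([0.9, 0.8, 0.2, 0.1], [0, 0, 20, 20], k=2, horizon=2000, b_max=20))
```

The design choice to illustrate is that the learner never tries to detect manipulation; it only enlarges its optimism by a term that shrinks as an arm is observed more often, so a finite budget can distort estimates for a limited number of rounds.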
Pages: 219-229
Number of pages: 11