On the Optimality of Perturbations in Stochastic and Adversarial Multi-armed Bandit Problems

Cited by: 0
Authors
Kim, Baekjin [1]
Tewari, Ambuj [1]
Affiliations
[1] Univ Michigan, Dept Stat, Ann Arbor, MI 48109 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
We investigate the optimality of perturbation-based algorithms in the stochastic and adversarial multi-armed bandit problems. For the stochastic case, we provide a unified regret analysis for both sub-Weibull and bounded perturbations when rewards are sub-Gaussian. Our bounds are instance-optimal for sub-Weibull perturbations with parameter 2 that also have a matching lower tail bound, and for all bounded-support perturbations with sufficient probability mass at the extremes of the support. For the adversarial setting, we prove rigorous barriers against two natural solution approaches using tools from discrete choice theory and extreme value theory. Our results suggest that the optimal perturbation, if it exists, will be of Fréchet type.
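As a rough illustration of what a perturbation-based algorithm for the stochastic case can look like, the sketch below plays the arm that maximizes the empirical mean plus a random perturbation scaled by the inverse square root of the pull count. The index form, the function names, and the unit-variance Gaussian rewards are assumptions made for this example; the abstract does not specify them, and this is not presented as the authors' algorithm. A Gaussian perturbation stands in for the sub-Weibull case with parameter 2, and a Rademacher perturbation for a bounded perturbation with mass at the extremes of its support.

```python
import numpy as np


def gaussian_perturbation(rng, k):
    # Standard normal noise: an assumed example of a sub-Weibull(2) perturbation.
    return rng.normal(size=k)


def rademacher_perturbation(rng, k):
    # Uniform on {-1, +1}: an assumed example of a bounded perturbation
    # whose probability mass sits at the extremes of its support.
    return rng.choice([-1.0, 1.0], size=k)


def perturbed_leader_bandit(means, horizon, perturbation, rng):
    """Simulate a stochastic bandit with a perturbed-index rule (hypothetical sketch).

    After pulling every arm once, each round plays
    argmax_i  empirical_mean_i + Z_i / sqrt(pulls_i),
    where Z_i is a fresh perturbation drawn each round.
    Returns the pull counts and the cumulative pseudo-regret.
    """
    k = len(means)
    counts = np.zeros(k, dtype=int)
    sums = np.zeros(k)
    best = max(means)
    regret = 0.0
    for t in range(horizon):
        if t < k:
            arm = t  # initialization: pull each arm once
        else:
            z = perturbation(rng, k)
            index = sums / counts + z / np.sqrt(counts)
            arm = int(np.argmax(index))
        reward = rng.normal(means[arm], 1.0)  # assumed sub-Gaussian reward model
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
    return counts, regret


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for name, pert in [("gaussian", gaussian_perturbation),
                       ("rademacher", rademacher_perturbation)]:
        counts, regret = perturbed_leader_bandit([0.2, 0.5, 0.9], 2000, pert, rng)
        print(name, counts, round(regret, 1))
```

Swapping the `perturbation` argument is the only change needed to compare perturbation families under this assumed index, which mirrors the abstract's framing of a single analysis covering both sub-Weibull and bounded perturbations.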
Pages: 10
Related papers
50 in total
  • [21] Gaussian multi-armed bandit problems with multiple objectives
    Reverdy, Paul
    2016 AMERICAN CONTROL CONFERENCE (ACC), 2016, : 5263 - 5269
  • [22] Finite budget analysis of multi-armed bandit problems
    Xia, Yingce
    Qin, Tao
    Ding, Wenkui
    Li, Haifang
    Zhang, Xudong
    Yu, Nenghai
    Liu, Tie-Yan
    NEUROCOMPUTING, 2017, 258 : 13 - 29
  • [23] Achieving Complete Learning in Multi-Armed Bandit Problems
    Vakili, Sattar
    Zhao, Qing
    2013 ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, 2013, : 1778 - 1782
  • [25] The multi-armed bandit, with constraints
    Eric V. Denardo
    Eugene A. Feinberg
    Uriel G. Rothblum
    Annals of Operations Research, 2013, 208 : 37 - 62
  • [26] The multi-armed bandit, with constraints
    Denardo, Eric V.
    Feinberg, Eugene A.
    Rothblum, Uriel G.
    ANNALS OF OPERATIONS RESEARCH, 2013, 208 (01) : 37 - 62
  • [27] The Assistive Multi-Armed Bandit
    Chan, Lawrence
    Hadfield-Menell, Dylan
    Srinivasa, Siddhartha
    Dragan, Anca
    HRI '19: 2019 14TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, 2019, : 354 - 363
  • [28] Multi-armed bandit games
    Gursoy, Kemal
    ANNALS OF OPERATIONS RESEARCH, 2024,
  • [29] The non-stationary stochastic multi-armed bandit problem
    Allesiardo R.
    Féraud R.
    Maillard O.-A.
    Allesiardo, Robin (robin.allesiardo@gmail.com), 1600, Springer Science and Business Media Deutschland GmbH (03): : 267 - 283
  • [30] Stealthy Adversarial Attacks on Stochastic Multi-Armed Bandits
    Wang, Zhiwei
    Wang, Huazheng
    Wang, Hongning
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 14, 2024, : 15770 - 15777