A Non-asymptotic Approach to Best-Arm Identification for Gaussian Bandits

Authors
Barrier, Antoine [1, 2]
Garivier, Aurelien [1 ]
Kocak, Tomas [3 ]
Affiliations
[1] Univ Lyon, ENS Lyon, UMPA UMR 5669, Lyon, France
[2] Univ Paris Saclay, CNRS, Lab Math, F-91405 Orsay, France
[3] Univ Potsdam, Inst Math, Potsdam, Germany
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a new strategy for fixed-confidence best-arm identification with Gaussian arms that have bounded means and unit variance. This strategy, called EXPLORATION-BIASED SAMPLING, is not only asymptotically optimal: it is, to the best of our knowledge, the first strategy with non-asymptotic bounds that asymptotically matches the sample complexity. Its main advantage over other algorithms such as TRACK-AND-STOP, however, is improved exploration behavior: EXPLORATION-BIASED SAMPLING is biased towards exploration in a subtle but natural way that makes it more stable and interpretable. These improvements are made possible by a new analysis of the sample complexity optimization problem, which yields a faster numerical resolution scheme and several quantitative regularity results that we believe to be of high independent interest.
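The "sample complexity optimization problem" mentioned in the abstract can be illustrated with a minimal sketch. It is not the EXPLORATION-BIASED SAMPLING strategy, nor the authors' resolution scheme. It assumes the standard fixed-confidence formulation for unit-variance Gaussian arms, in which the inverse characteristic time is the maximum over sampling proportions w of min over suboptimal arms a of w_best * w_a / (w_best + w_a) * (mu_best - mu_a)^2 / 2. The function name `characteristic_time` and the ternary search are illustrative choices, relying on the fact that all suboptimal-arm terms are equal at the optimum.

```python
def characteristic_time(means, iters=100):
    """Sketch: characteristic time T* and optimal proportions for
    unit-variance Gaussian arms with a unique best arm (assumed setting)."""
    best = max(range(len(means)), key=lambda a: means[a])
    others = [a for a in range(len(means)) if a != best]
    # c_a = (mu_best - mu_a)^2 / 2 is the Gaussian KL cost of confusing arm a.
    costs = {a: (means[best] - means[a]) ** 2 / 2.0 for a in others}
    if not others or min(costs.values()) <= 0.0:
        raise ValueError("need at least two arms and a unique best arm")

    # With every suboptimal-arm term equal to a common level, the problem
    # reduces to minimizing this convex function of y on (0, min c_a).
    def inverse_level(y):
        return 1.0 / y + sum(1.0 / (c - y) for c in costs.values())

    # Ternary search for the minimizer of the convex one-dimensional function.
    lo, hi = 0.0, min(costs.values())
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if inverse_level(m1) < inverse_level(m2):
            hi = m2
        else:
            lo = m1
    y = 0.5 * (lo + hi)

    # Recover the proportions: w_a / w_best = y / (c_a - y), summing to 1.
    ratios = {a: y / (costs[a] - y) for a in others}
    w_best = 1.0 / (1.0 + sum(ratios.values()))
    weights = [w_best if a == best else w_best * ratios[a]
               for a in range(len(means))]
    return inverse_level(y), weights
```

For two arms with means 1 and 0, `characteristic_time([1.0, 0.0])` should return a time close to 8 with proportions close to (0.5, 0.5), in line with the closed form 8 / gap^2 for this case. Under the same assumptions, the expected number of samples needed at confidence level 1 - delta scales as T* log(1/delta) when delta is small.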
Pages: 32