A Non-asymptotic Approach to Best-Arm Identification for Gaussian Bandits

Cited by: 0
Authors
Barrier, Antoine [1, 2]
Garivier, Aurelien [1 ]
Kocak, Tomas [3 ]
Affiliations
[1] Univ Lyon, ENS Lyon, UMPA UMR 5669, Lyon, France
[2] Univ Paris Saclay, CNRS, Lab Math, F-91405 Orsay, France
[3] Univ Potsdam, Inst Math, Potsdam, Germany
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a new strategy for best-arm identification with fixed confidence for Gaussian variables with bounded means and unit variance. This strategy, called EXPLORATION-BIASED SAMPLING, is not only asymptotically optimal: it is, to the best of our knowledge, the first strategy with non-asymptotic bounds that asymptotically matches the sample complexity. Its main advantage over other algorithms such as TRACK-AND-STOP, however, is its improved exploration behavior: EXPLORATION-BIASED SAMPLING is biased towards exploration in a subtle but natural way that makes it more stable and interpretable. These improvements are made possible by a new analysis of the sample complexity optimization problem, which yields a faster numerical resolution scheme and several quantitative regularity results that we believe to be of high independent interest.
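As an illustration of the sample complexity optimization problem mentioned in the abstract, the sketch below evaluates the objective commonly used in the fixed-confidence literature for unit-variance Gaussian arms: for sampling proportions w and means mu, the minimum over suboptimal arms a of w_best * w_a / (w_best + w_a) * (mu_best - mu_a)^2 / 2. Maximizing this value over w and taking the inverse gives the characteristic time. This is not the paper's algorithm or its resolution scheme; the function names and the two-arm grid search are assumptions made for illustration only.

```python
def gaussian_objective(weights, means):
    """Objective for unit-variance Gaussian arms (illustrative helper).

    Returns the minimum over suboptimal arms a of
    w_best * w_a / (w_best + w_a) * (mu_best - mu_a)^2 / 2.
    """
    best = max(range(len(means)), key=lambda a: means[a])
    values = []
    for a in range(len(means)):
        if a == best:
            continue
        w_best, w_a = weights[best], weights[a]
        if w_best + w_a == 0:
            # No samples on either arm: they cannot be told apart.
            values.append(0.0)
            continue
        gap = means[best] - means[a]
        values.append(w_best * w_a / (w_best + w_a) * gap ** 2 / 2)
    return min(values)


def two_arm_characteristic_time(means, grid_size=1001):
    """Brute-force grid search over w = (x, 1 - x) for exactly two arms.

    Assumes the two means are distinct. Returns (time, weights), where
    time is the inverse of the largest objective value found on the grid.
    """
    if len(means) != 2 or means[0] == means[1]:
        raise ValueError("expected two arms with distinct means")
    best_value, best_weights = 0.0, None
    for i in range(grid_size):
        x = i / (grid_size - 1)
        value = gaussian_objective([x, 1 - x], means)
        if value > best_value:
            best_value, best_weights = value, (x, 1 - x)
    return 1 / best_value, best_weights


if __name__ == "__main__":
    time, weights = two_arm_characteristic_time([1.0, 0.0])
    print(time, weights)
```

With two arms the maximizer is the uniform allocation, so the grid search should recover a characteristic time of 8 / (mu_1 - mu_2)^2. With more arms the maximization has no such closed form, which is where a dedicated numerical scheme like the one the abstract refers to becomes relevant.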
Pages: 32
Related papers
50 in total
  • [1] Best-Arm Identification in Linear Bandits
    Soare, Marta
    Lazaric, Alessandro
    Munos, Remi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, 27
  • [2] Optimal Best-arm Identification in Linear Bandits
    Jedra, Yassir
    Proutiere, Alexandre
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [3] Best-Arm Identification in Correlated Multi-Armed Bandits
    Gupta S.
    Joshi G.
    Yagan O.
    IEEE Journal on Selected Areas in Information Theory, 2021, 2 (02) : 549 - 563
  • [4] A/B Testing and Best-arm Identification for Linear Bandits with Robustness to Non-stationarity
    Xiong, Zhihan
    Camilleri, Romain
    Fazel, Maryam
    Jain, Lalit
    Jamieson, Kevin
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [5] On Best-Arm Identification with a Fixed Budget in Non-Parametric Multi-Armed Bandits
    Barrier, Antoine
    Garivier, Aurelien
    Stoltz, Gilles
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 201, 2023, 201 : 136 - 181
  • [6] On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits
    Shahrampour, Shahin
    Noshad, Mohammad
    Tarokh, Vahid
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2017, 65 (16) : 4281 - 4292
  • [7] Best-arm Identification Algorithms for Multi-Armed Bandits in the Fixed Confidence Setting
    Jamieson, Kevin
    Nowak, Robert
    2014 48TH ANNUAL CONFERENCE ON INFORMATION SCIENCES AND SYSTEMS (CISS), 2014,
  • [8] ε-Best-Arm Identification in Pay-Per-Reward Multi-Armed Bandits
    Sabato, Sivan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [9] Quantile Multi-Armed Bandits: Optimal Best-Arm Identification and a Differentially Private Scheme
    Nikolakakis, Konstantinos E.
    Kalogerias, Dionysios S.
    Sheffet, Or
    Sarwate, Anand D.
    Nikolakakis, Konstantinos E. (k.nikolakakis@rutgers.edu), 1600, Institute of Electrical and Electronics Engineers Inc. (02): : 534 - 548
  • [10] Dealing with Unknown Variances in Best-Arm Identification
    Jourdan, Marc
    Degenne, Remy
    Kaufmann, Emilie
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 201, 2023, 201 : 776 - 849