A Non-asymptotic Approach to Best-Arm Identification for Gaussian Bandits

Cited by: 0
Authors
Barrier, Antoine [1, 2]
Garivier, Aurelien [1 ]
Kocak, Tomas [3 ]
Affiliations
[1] Univ Lyon, ENS Lyon, UMPA UMR 5669, Lyon, France
[2] Univ Paris Saclay, CNRS, Lab Math, F-91405 Orsay, France
[3] Univ Potsdam, Inst Math, Potsdam, Germany
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a new strategy for best-arm identification with fixed confidence of Gaussian variables with bounded means and unit variance. This strategy, called EXPLORATION-BIASED SAMPLING, is not only asymptotically optimal: it is, to the best of our knowledge, the first strategy with non-asymptotic bounds that asymptotically matches the sample complexity. Its main advantage over other algorithms like TRACK-AND-STOP, however, is an improved behavior regarding exploration: EXPLORATION-BIASED SAMPLING is biased towards exploration in a subtle but natural way that makes it more stable and interpretable. These improvements are made possible by a new analysis of the sample complexity optimization problem, which yields a faster numerical resolution scheme and several quantitative regularity results that we believe to be of high independent interest.
Pages: 32