A Non-asymptotic Approach to Best-Arm Identification for Gaussian Bandits

Cited by: 0
Authors
Barrier, Antoine [1 ,2 ]
Garivier, Aurelien [1 ]
Kocak, Tomas [3 ]
Affiliations
[1] Univ Lyon, ENS Lyon, UMPA UMR 5669, Lyon, France
[2] Univ Paris Saclay, CNRS, Lab Math, F-91405 Orsay, France
[3] Univ Potsdam, Inst Math, Potsdam, Germany
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a new strategy for best-arm identification with fixed confidence of Gaussian variables with bounded means and unit variance. This strategy, called EXPLORATION-BIASED SAMPLING, is not only asymptotically optimal: it is, to the best of our knowledge, the first strategy with non-asymptotic bounds that asymptotically matches the sample complexity. But the main advantage over other algorithms like TRACK-AND-STOP is an improved behavior regarding exploration: EXPLORATION-BIASED SAMPLING is biased towards exploration in a subtle but natural way that makes it more stable and interpretable. These improvements are made possible by a new analysis of the sample complexity optimization problem, which yields a faster numerical resolution scheme and several quantitative regularity results that we believe are of high independent interest.
Pages: 32
Related papers
50 in total
  • [31] A Tutorial on the Non-Asymptotic Theory of System Identification
    Ziemann, Ingvar
    Tsiamis, Anastasios
    Lee, Bruce
    Jedra, Yassir
    Matni, Nikolai
    Pappas, George J.
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 8921 - 8939
  • [32] Non-asymptotic Results for Singular Values of Gaussian Matrix Products
    Boris Hanin
    Grigoris Paouris
    Geometric and Functional Analysis, 2021, 31 : 268 - 324
  • [33] Best Arm Identification in Sample-path Correlated Bandits
    Prakash, R. Sri
    Karamchandani, Nikhil
    Moharir, Sharayu
    2022 NATIONAL CONFERENCE ON COMMUNICATIONS (NCC), 2022, : 7 - 12
  • [34] Optimal Best Arm Identification with Fixed Confidence in Restless Bandits
    Karthik P.N.
    Tan V.Y.F.
    Mukherjee A.
    Tajer A.
    IEEE Transactions on Information Theory, 2024, 70 (10) : 1 - 1
  • [35] Model-Based Best Arm Identification for Decreasing Bandits
    Takemori, Sho
    Umeda, Yuhei
    Gopalan, Aditya
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [36] Best Arm Identification in Linear Bandits with Linear Dimension Dependency
    Tao, Chao
    Blanco, Saul A.
    Zhou, Yuan
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [37] Secure Best Arm Identification in Multi-armed Bandits
    Ciucanu, Radu
    Lafourcade, Pascal
    Lombard-Platet, Marius
    Soare, Marta
    INFORMATION SECURITY PRACTICE AND EXPERIENCE, ISPEC 2019, 2019, 11879 : 152 - 171
  • [38] Non-asymptotic approach to varying coefficient model
    Klopp, Olga
    Pensky, Marianna
    ELECTRONIC JOURNAL OF STATISTICS, 2013, 7 : 454 - 479
  • [39] Non-asymptotic Results for Singular Values of Gaussian Matrix Products
    Hanin, Boris
    Paouris, Grigoris
    GEOMETRIC AND FUNCTIONAL ANALYSIS, 2021, 31 (02) : 268 - 324
  • [40] Guaranteed non-asymptotic confidence regions in system identification
    Campi, MC
    Weyer, E
    AUTOMATICA, 2005, 41 (10) : 1751 - 1764