Two-armed bandit strategies that discount past and future

Author
Ginebra, J [1]
Affiliation
[1] Univ Politecn Cataluna, Dept Stat & OR, E-08028 Barcelona, Spain
Keywords
myopic strategies; sequential optimization; finite memory; non-stationary; backward induction; m-step ahead
DOI
10.1081/SAC-200033347
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
We explore the small-sample performance of m-step look-ahead strategies that use only the k most recent observations in the Bernoulli two-armed bandit problem. A larger k is not always better. Strategies with small k and m perform almost as well as the optimal strategy, and they outperform it over a non-negligible part of the parameter space. Because they also adapt better to unplanned non-stationarities and outperform the optimal strategy under small prior misspecifications, they will be hard to beat in applications.
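
The simplest case the abstract describes, a one-step look-ahead (m = 1, myopic) strategy with finite memory k, can be sketched roughly as follows. This is an illustration only, not the paper's implementation: the uniform Beta(1, 1) prior on each arm, the choice to count the k most recent observations across both arms together, the random tie-breaking, and the names `posterior_means` and `myopic_finite_memory` are all assumptions made here.

```python
import random
from collections import deque


def posterior_means(window, prior=(1.0, 1.0)):
    """Beta posterior mean of each arm's success probability,
    computed only from the (arm, outcome) pairs in `window`.
    The Beta(1, 1) default prior is an assumption of this sketch."""
    a, b = prior
    means = []
    for arm in (0, 1):
        outcomes = [y for (i, y) in window if i == arm]
        means.append((a + sum(outcomes)) / (a + b + len(outcomes)))
    return means


def myopic_finite_memory(p, horizon, k, seed=0):
    """Play a Bernoulli two-armed bandit with success probabilities `p`
    for `horizon` pulls, always choosing the arm with the larger
    posterior mean given the k most recent observations (m = 1).
    Returns the total number of successes."""
    rng = random.Random(seed)
    window = deque(maxlen=k)  # older observations drop out automatically
    total = 0
    for _ in range(horizon):
        m0, m1 = posterior_means(window)
        if m0 == m1:
            arm = rng.randrange(2)  # break ties at random (assumption)
        else:
            arm = 0 if m0 > m1 else 1
        outcome = 1 if rng.random() < p[arm] else 0
        window.append((arm, outcome))
        total += outcome
    return total
```

Calling `myopic_finite_memory((0.7, 0.4), horizon=50, k=10)` simulates 50 pulls with a memory of 10 observations. Looking m > 1 steps ahead would require backward induction over the next m pulls, which this sketch does not attempt.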
Pages: 609-619
Number of pages: 11