Two-armed bandit strategies that discount past and future

Cited by: 0

Authors:

Ginebra, J [1]

Affiliations:

[1] Univ Politecn Cataluna, Dept Stat & OR, E-08028 Barcelona, Spain

Source:

COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION | 2004 / Vol. 33 / No. 3

Keywords:

myopic strategies; sequential optimization; finite memory; non-stationary; backward induction; m-step ahead;

DOI:

10.1081/SAC-200033347

Chinese Library Classification (CLC) number:

O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];

Subject classification code:

020208 ; 070103 ; 0714 ;

Abstract:

We explore the small-sample performance of m-step look-ahead strategies that use only the k most recent observations, for the Bernoulli two-armed bandit problem. A larger k is not always better. Strategies with small k and m perform almost as well as the optimal strategy, and they dominate it over a non-negligible part of the parameter space. Given that they adapt better to unplanned non-stationarities and outperform the optimal strategy under small prior misspecifications, they will be hard to beat in applications.
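As a rough illustration of the kind of strategy the abstract describes, the sketch below simulates the simplest case only: a myopic (m = 1) rule that looks at the k most recent observations. It is not the paper's implementation. The uniform Beta(1, 1) prior on each arm, the window taken over the last k pulls across both arms, the random tie-breaking, and the names `posterior_mean`, `choose_arm`, and `simulate` are all assumptions made for this example.

```python
import random
from collections import deque


def posterior_mean(successes, pulls, a=1.0, b=1.0):
    # Mean of a Beta(a + successes, b + failures) posterior.
    # The Beta(1, 1) default prior is an assumption of this sketch.
    return (a + successes) / (a + b + pulls)


def choose_arm(window, rng):
    # window: re-iterable collection of (arm, reward) pairs holding
    # the k most recent observations. Older observations are ignored.
    means = []
    for arm in (0, 1):
        rewards = [r for a, r in window if a == arm]
        means.append(posterior_mean(sum(rewards), len(rewards)))
    if means[0] == means[1]:
        return rng.randrange(2)  # assumed tie-break: pick an arm at random
    return 0 if means[0] > means[1] else 1


def simulate(p, horizon, k, rng):
    # p: pair of Bernoulli success probabilities, one per arm.
    # Returns the total number of successes over `horizon` pulls.
    window = deque(maxlen=k)  # finite memory: keeps the last k pulls only
    total = 0
    for _ in range(horizon):
        arm = choose_arm(window, rng)
        reward = 1 if rng.random() < p[arm] else 0
        window.append((arm, reward))
        total += reward
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    for k in (2, 5, 20):
        runs = [simulate((0.6, 0.4), horizon=50, k=k, rng=rng) for _ in range(2000)]
        print(k, sum(runs) / len(runs))
```

The m-step look-ahead strategies with m > 1 would additionally need backward induction over the next m pulls, which this sketch leaves out. Discarding all but the last k observations is what lets such a rule respond when the success probabilities drift over time.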


Pages: 609-619

Page count: 11
