Minimax lower bounds for the two-armed bandit problem

Cited by: 0

Authors:

Kulkarni, SR [1]

Lugosi, G [1]

Affiliations:

[1] Princeton Univ, Dept Elect Engn, Princeton, NJ 08544 USA

Source:

PROCEEDINGS OF THE 36TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5 | 1997

Keywords:

DOI:

Not available

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

We obtain minimax lower bounds on the regret for the classical two-armed bandit problem. We provide a finite-sample minimax version of the well-known log n asymptotic lower bound of Lai and Robbins. Also, in contrast to the log n asymptotic results on the regret, we show that the minimax regret is achieved by mere random guessing under fairly mild conditions on the set of allowable configurations of the two arms. That is, we show that for every allocation rule and for every n, there is a configuration such that the regret at time n is at least (1 - epsilon) times the regret of random guessing, where epsilon is any small positive constant.
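The "regret of random guessing" that the abstract uses as its benchmark can be illustrated with a minimal sketch. It assumes Bernoulli arms with success probabilities p1 and p2, and takes regret to be n times the better arm's mean minus the expected cumulative reward; the paper's exact definitions may differ. The function names `random_guess_regret` and `simulate_random_guess` are hypothetical and do not come from the paper.

```python
import random


def random_guess_regret(p1, p2, n):
    """Expected regret after n pulls when each pull picks an arm uniformly at random.

    Assumption: Bernoulli arms; regret = n * max(p1, p2) - expected total reward.
    Each pull earns (p1 + p2) / 2 on average, so the regret equals n * |p1 - p2| / 2.
    """
    return n * (max(p1, p2) - (p1 + p2) / 2)


def simulate_random_guess(p1, p2, n, trials=2000, seed=0):
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    best = max(p1, p2)
    total_regret = 0.0
    for _ in range(trials):
        reward = 0
        for _ in range(n):
            # Pick one of the two arms with probability 1/2 each.
            p = p1 if rng.random() < 0.5 else p2
            # Draw a Bernoulli reward from the chosen arm.
            reward += 1 if rng.random() < p else 0
        total_regret += n * best - reward
    return total_regret / trials
```

Under these assumptions the random-guessing regret grows linearly in n for a fixed gap |p1 - p2|. The abstract's claim is that, for every allocation rule and every n, some configuration forces regret of at least (1 - epsilon) times this benchmark.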


Pages: 2293 - 2297

Page count: 5

50 entries in total

[1] Finite-time lower bounds for the two-armed bandit problem
Kulkarni, SR
Lugosi, G
[J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2000, 45 (04) : 711 - 714
[2] Finding minimax strategy and minimax risk in a random environment (the two-armed bandit problem)
Kolnogorov, A. V.
[J]. AUTOMATION AND REMOTE CONTROL, 2011, 72 (05) : 1017 - 1027
[3] Finding minimax strategy and minimax risk in a random environment (the two-armed bandit problem)
A. V. Kolnogorov
[J]. Automation and Remote Control, 2011, 72 : 1017 - 1027
[4] Minimax Normal Two-Armed Bandit with Indefinite Control Horizon
Kolnogorov, Alexander
[J]. 2016 INTERNATIONAL CONFERENCE APPLIED MATHEMATICS, COMPUTATIONAL SCIENCE AND SYSTEMS ENGINEERING, 2017, 9
[5] Two-armed bandit problem for parallel data processing systems
Kolnogorov, A. V.
[J]. PROBLEMS OF INFORMATION TRANSMISSION, 2012, 48 (01) : 72 - 84
[6] Two-armed bandit problem for parallel data processing systems
A. V. Kolnogorov
[J]. Problems of Information Transmission, 2012, 48 : 72 - 84
[7] A confirmation of a conjecture on Feldman's two-armed bandit problem
Chen, Zengjing
Lin, Yiwei
Zhang, Jichen
[J]. JOURNAL OF APPLIED PROBABILITY, 2024, 61 (01) : 121 - 136
[8] A Bayesian two-armed bandit model
Wang, Xikui
Liang, You
Porth, Lysa
[J]. APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY, 2019, 35 (03) : 624 - 636
[9] Two-Armed Bandit Problem and Batch Version of the Mirror Descent Algorithm
Kolnogorov, A. V.
Nazin, A. V.
Shiyan, D. N.
[J]. AUTOMATION AND REMOTE CONTROL, 2022, 83 (08) : 1288 - 1307
[10] Two-Armed Bandit Problem and Batch Version of the Mirror Descent Algorithm
A. V. Kolnogorov
A. V. Nazin
D. N. Shiyan
[J]. Automation and Remote Control, 2022, 83 : 1288 - 1307
