Minimax lower bounds for the two-armed bandit problem

Cited by: 0
Authors
Kulkarni, SR [1 ]
Lugosi, G [1 ]
Affiliation
[1] Princeton Univ, Dept Elect Engn, Princeton, NJ 08544 USA
Keywords
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline code
0812 ;
Abstract
We obtain minimax lower bounds on the regret for the classical two-armed bandit problem. We provide a finite-sample minimax version of the well-known log n asymptotic lower bound of Lai and Robbins. Also, in contrast to the log n asymptotic results on the regret, we show that the minimax regret is achieved by mere random guessing under fairly mild conditions on the set of allowable configurations of the two arms. That is, we show that for every allocation rule and for every n, there is a configuration such that the regret at time n is at least (1 - epsilon) times the regret of random guessing, where epsilon is an arbitrarily small positive constant.
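The benchmark quantity in the abstract, the regret of random guessing, has a simple closed form for a two-armed bandit: if the arms have mean rewards p1 and p2 and each pull picks an arm uniformly at random, the expected regret after n pulls is n(p1 - p2)/2. The following sketch (not from the paper; the arm means and horizon are hypothetical example values) computes this benchmark:

```python
# Illustrative sketch (not from the paper): expected regret of the
# "random guessing" allocation rule on a two-armed bandit with
# mean rewards p1 and p2.

def random_guessing_regret(p1: float, p2: float, n: int) -> float:
    """Expected regret at time n when each pull chooses an arm
    uniformly at random: n * |p1 - p2| / 2."""
    best = max(p1, p2)                        # mean of the optimal arm
    mean_per_pull = 0.5 * p1 + 0.5 * p2       # expected reward per random pull
    return n * (best - mean_per_pull)

# Hypothetical example: arms with means 0.9 and 0.5, horizon n = 100.
print(random_guessing_regret(0.9, 0.5, 100))
```

The paper's lower bound says that no allocation rule can uniformly beat this linear-in-n benchmark by more than a factor (1 - epsilon) over all allowable arm configurations.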
Pages: 2293-2297
Page count: 5