The multi-armed bandit, with constraints

Cited by: 0
Authors
Eric V. Denardo
Eugene A. Feinberg
Uriel G. Rothblum
Affiliations
[1] Yale University,Center for Systems Sciences
[2] Stony Brook University,Department of Applied Mathematics and Statistics
[3] Technion—Israel Institute of Technology,Late of the Faculty of Industrial Engineering and Management
Keywords
Optimal Policy; Column Generation; Priority Rule; Initial Randomization; Bandit Problem;
DOI
Not available
Abstract
Presented in this paper is a self-contained analysis of a Markov decision problem that is known as the multi-armed bandit. The analysis covers the cases of linear and exponential utility functions. The optimal policy is shown to have a simple and easily-implemented form. Procedures for computing such a policy are presented, as are procedures for computing the expected utility that it earns, given any starting state. For the case of linear utility, constraints that link the bandits are introduced, and the constrained optimization problem is solved via column generation. The methodology is novel in several respects, which include the use of elementary row operations to simplify arguments.
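The "simple and easily-implemented form" of the optimal policy mentioned in the abstract is a priority (index) rule: at every step, play the arm whose current index is largest. A minimal sketch, assuming a toy deterministic setting with deteriorating arms (nonincreasing reward streams), where the Gittins index of an arm reduces to its next reward and the priority rule can be checked against brute-force enumeration of all interleavings. The arm data and function names below are illustrative, not taken from the paper:

```python
from itertools import permutations

# Hypothetical deteriorating arms: each list is one arm's reward stream,
# consumed left to right. For nonincreasing streams the Gittins index of
# an arm equals its next reward, so the priority rule is simply myopic.
ARMS = [[5, 1], [4, 4], [2, 2]]
BETA = 0.9  # discount factor

def priority_policy(arms, beta=BETA):
    """Play the arm with the highest index (= next reward) at each step."""
    state = [0] * len(arms)  # next unplayed position of each arm
    total = 0.0
    for t in range(sum(len(a) for a in arms)):
        live = [i for i, a in enumerate(arms) if state[i] < len(a)]
        i = max(live, key=lambda j: arms[j][state[j]])
        total += arms[i][state[i]] * beta ** t
        state[i] += 1
    return total

def brute_force(arms, beta=BETA):
    """Max discounted reward over every interleaving of the arms' pulls."""
    labels = [i for i, a in enumerate(arms) for _ in a]
    best = float("-inf")
    for order in set(permutations(labels)):
        state = [0] * len(arms)
        total = 0.0
        for t, i in enumerate(order):
            total += arms[i][state[i]] * beta ** t
            state[i] += 1
        best = max(best, total)
    return best
```

On this instance the priority rule attains the brute-force optimum over all 90 interleavings, which illustrates why such a policy is easy to implement: it needs only a one-step look at each arm's current index, never a global search.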
Pages: 37–62
Page count: 25
Related papers (50 total)
  • [1] The multi-armed bandit, with constraints
    Denardo, Eric V.
    Feinberg, Eugene A.
    Rothblum, Uriel G.
    ANNALS OF OPERATIONS RESEARCH, 2013, 208 (01) : 37 - 62
  • [2] Multi-armed bandit games
    Gursoy, Kemal
    ANNALS OF OPERATIONS RESEARCH, 2024
  • [3] The Assistive Multi-Armed Bandit
    Chan, Lawrence
    Hadfield-Menell, Dylan
    Srinivasa, Siddhartha
    Dragan, Anca
    HRI '19: 2019 14TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION, 2019, : 354 - 363
  • [4] Dynamic Multi-Armed Bandit with Covariates
    Pavlidis, Nicos G.
    Tasoulis, Dimitris K.
    Adams, Niall M.
    Hand, David J.
    ECAI 2008, PROCEEDINGS, 2008, 178 : 777+
  • [5] Scaling Multi-Armed Bandit Algorithms
    Fouche, Edouard
    Komiyama, Junpei
    Boehm, Klemens
    KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2019, : 1449 - 1459
  • [6] The budgeted multi-armed bandit problem
    Madani, O
    Lizotte, DJ
    Greiner, R
    LEARNING THEORY, PROCEEDINGS, 2004, 3120 : 643 - 645
  • [7] The Multi-Armed Bandit With Stochastic Plays
    Lesage-Landry, Antoine
    Taylor, Joshua A.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2018, 63 (07) : 2280 - 2286
  • [8] MULTI-ARMED BANDIT ALLOCATION INDEXES
    JONES, PW
    JOURNAL OF THE OPERATIONAL RESEARCH SOCIETY, 1989, 40 (12) : 1158 - 1159
  • [9] Satisficing in Multi-Armed Bandit Problems
    Reverdy, Paul
    Srivastava, Vaibhav
    Leonard, Naomi Ehrich
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2017, 62 (08) : 3788 - 3803
  • [10] IMPROVING STRATEGIES FOR THE MULTI-ARMED BANDIT
    POHLENZ, S
    MARKOV PROCESS AND CONTROL THEORY, 1989, 54 : 158 - 163