The multi-armed bandit, with constraints

Cited by: 0

Authors:

Eric V. Denardo

Eugene A. Feinberg

Uriel G. Rothblum

Affiliations:

[1] Yale University, Center for Systems Sciences

[2] Stony Brook University, Department of Applied Mathematics and Statistics

[3] Technion - Israel Institute of Technology, Late of the Faculty of Industrial Engineering and Management

Source:

Annals of Operations Research | 2013 / Vol. 208

Keywords:

Optimal Policy; Column Generation; Priority Rule; Initial Randomization; Bandit Problem

DOI:

Not available

Abstract:

Presented in this paper is a self-contained analysis of a Markov decision problem that is known as the multi-armed bandit. The analysis covers the cases of linear and exponential utility functions. The optimal policy is shown to have a simple and easily implemented form. Procedures for computing such a policy are presented, as are procedures for computing the expected utility that it earns, given any starting state. For the case of linear utility, constraints that link the bandits are introduced, and the constrained optimization problem is solved via column generation. The methodology is novel in several respects, which include the use of elementary row operations to simplify arguments.

Pages: 37 - 62

Page count: 25

