The Multi-fidelity Multi-armed Bandit

Cited by: 0
Authors
Kandasamy, Kirthevasan [1 ]
Dasarathy, Gautam [2 ]
Schneider, Jeff [1 ]
Poczos, Barnabas [1 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Rice Univ, Houston, TX 77251 USA
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Numbers
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We study a variant of the classical stochastic K-armed bandit where observing the outcome of each arm is expensive, but cheap approximations to this outcome are available. For example, in online advertising the performance of an ad can be approximated by displaying it for shorter time periods or to narrower audiences. We formalise this task as a multi-fidelity bandit where, at each time step, the forecaster may choose to play an arm at any one of M fidelities. The highest fidelity (the desired outcome) expends cost λ^(M). The mth fidelity (an approximation) expends λ^(m) < λ^(M) and returns a biased estimate of the highest fidelity. We develop MF-UCB, a novel upper confidence bound procedure for this setting, and prove that it naturally adapts to the sequence of available approximations and costs, thus attaining better regret than naive strategies which ignore the approximations. For instance, in the online advertising example above, MF-UCB would use the lower fidelities to quickly eliminate suboptimal ads and reserve the more expensive high-fidelity experiments for a small set of promising candidates. We complement this result with a lower bound and show that MF-UCB is nearly optimal under certain conditions.
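The abstract describes MF-UCB only at a high level. Below is a minimal, hypothetical Python simulation of the general multi-fidelity idea it sketches: cheap, biased low-fidelity pulls screen out poor arms, and expensive high-fidelity pulls are reserved for promising ones. The costs, the bias bounds `zeta`, the escalation threshold `gamma`, and the confidence terms are illustrative assumptions; this is not the paper's exact MF-UCB rule and carries none of its regret guarantees.

```python
import numpy as np

rng = np.random.default_rng(0)

K, M = 10, 3                             # number of arms and fidelities (hypothetical)
costs = np.array([1.0, 10.0, 100.0])     # lambda^(1) < lambda^(2) < lambda^(3) (assumed)
zeta = np.array([0.2, 0.05, 0.0])        # assumed bias bounds |mu^(m) - mu^(M)| <= zeta^(m)
budget, gamma = 5000.0, 0.5              # total cost budget and escalation threshold (assumed)

mu_top = rng.uniform(0.0, 1.0, size=K)   # true highest-fidelity means
mu = mu_top[None, :] + rng.uniform(-1.0, 1.0, (M, K)) * zeta[:, None]  # biased lower-fidelity means

counts = np.zeros((M, K))
sums = np.zeros((M, K))

def ucb(m, k, t):
    """Optimistic estimate of arm k's top-fidelity mean using only fidelity-m data."""
    if counts[m, k] == 0:
        return np.inf
    mean = sums[m, k] / counts[m, k]
    conf = np.sqrt(2.0 * np.log(t) / counts[m, k])
    return mean + conf + zeta[m]         # inflate by the bias allowance of fidelity m

spent, t = 0.0, 1
while spent < budget:
    # Each arm's value is its tightest upper bound across the fidelities observed so far.
    values = [min(ucb(m, k, t) for m in range(M)) for k in range(K)]
    k = int(np.argmax(values))
    # Stay at a cheap fidelity while its confidence interval is still wide; otherwise escalate.
    m = 0
    while (m < M - 1 and counts[m, k] > 0
           and np.sqrt(2.0 * np.log(t) / counts[m, k]) < gamma):
        m += 1
    reward = rng.normal(mu[m, k], 0.1)   # noisy observation at the chosen fidelity
    counts[m, k] += 1
    sums[m, k] += reward
    spent += costs[m]
    t += 1

for m in range(M):
    print(f"fidelity {m + 1} pulls per arm: {counts[m].astype(int)}")
print("true best arm:", int(np.argmax(mu_top)))
```

The one design choice mirrored from the abstract is that an arm's optimistic value pools information from all fidelities (via the bias allowance zeta), so cheap data alone can already rule an arm out, while only arms that survive screening attract the costly high-fidelity pulls.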
Pages: 9