The Multi-fidelity Multi-armed Bandit

Cited by: 0
Authors
Kandasamy, Kirthevasan [1 ]
Dasarathy, Gautam [2 ]
Schneider, Jeff [1 ]
Poczos, Barnabas [1 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Rice Univ, Houston, TX 77251 USA
Keywords
DOI
Not available
CLC number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
We study a variant of the classical stochastic K-armed bandit where observing the outcome of each arm is expensive, but cheap approximations to this outcome are available. For example, in online advertising the performance of an ad can be approximated by displaying it for shorter time periods or to narrower audiences. We formalise this task as a multi-fidelity bandit, where, at each time step, the forecaster may choose to play an arm at any one of M fidelities. The highest fidelity (the desired outcome) expends cost λ^(M). The mth fidelity (an approximation) expends λ^(m) < λ^(M) and returns a biased estimate of the highest fidelity. We develop MF-UCB, a novel upper confidence bound procedure for this setting, and prove that it naturally adapts to the sequence of available approximations and costs, thus attaining better regret than naive strategies which ignore the approximations. For instance, in the above online advertising example, MF-UCB would use the lower fidelities to quickly eliminate suboptimal ads and reserve the large, expensive experiments for a small set of promising candidates. We complement this result with a lower bound and show that MF-UCB is nearly optimal under certain conditions.
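To make the setting concrete, the sketch below shows one way a multi-fidelity upper-confidence-bound strategy could be organised, following the abstract's description: cheap, biased fidelities are used to screen arms, and the expensive highest fidelity is reserved for promising ones. This is not the paper's exact MF-UCB rule; the bias bounds `zeta`, the exploration constant `rho`, the fidelity-escalation threshold, and the toy demo are illustrative assumptions introduced here.

```python
import math

class MultiFidelityUCB:
    """Hypothetical multi-fidelity UCB sketch (not the paper's exact MF-UCB rule).

    Assumes K arms, M fidelities with per-play costs `costs[m]`, and known
    bounds `zeta[m]` on how far fidelity m's mean can lie from the
    highest-fidelity mean (zeta for the highest fidelity is 0)."""

    def __init__(self, num_arms, costs, zeta, rho=2.0):
        self.K = num_arms
        self.M = len(costs)
        self.costs = costs      # cost of playing at each fidelity (last = highest)
        self.zeta = zeta        # assumed bias bound of each fidelity vs. the highest
        self.rho = rho          # exploration constant (assumption)
        self.counts = [[0] * self.M for _ in range(self.K)]
        self.sums = [[0.0] * self.M for _ in range(self.K)]
        self.t = 0

    def _ucb(self, k):
        """Upper confidence bound on arm k's highest-fidelity mean,
        taking the tightest bound over the fidelities observed so far."""
        bound = float("inf")
        for m in range(self.M):
            n = self.counts[k][m]
            if n == 0:
                continue        # no data at this fidelity yet
            mean = self.sums[k][m] / n
            width = math.sqrt(self.rho * math.log(max(self.t, 2)) / n)
            bound = min(bound, mean + width + self.zeta[m])
        return bound            # inf if the arm has never been played

    def select(self):
        """Pick the arm with the largest UCB, then the cheapest fidelity whose
        confidence width still exceeds its bias bound (else the highest)."""
        self.t += 1
        k = max(range(self.K), key=self._ucb)
        for m in range(self.M - 1):
            n = self.counts[k][m]
            if n == 0:
                return k, m
            width = math.sqrt(self.rho * math.log(max(self.t, 2)) / n)
            if width > self.zeta[m]:
                return k, m
        return k, self.M - 1

    def update(self, k, m, reward):
        self.counts[k][m] += 1
        self.sums[k][m] += reward


if __name__ == "__main__":
    import random
    # Toy demo in the spirit of the advertising example: 5 ads, two fidelities
    # (a cheap short trial assumed to be within 0.2 of the true mean, and the
    # expensive full campaign). All numbers here are made up for illustration.
    true_means = [0.2, 0.4, 0.5, 0.3, 0.6]
    policy = MultiFidelityUCB(num_arms=5, costs=[1.0, 10.0], zeta=[0.2, 0.0])
    for _ in range(200):
        k, m = policy.select()
        bias = 0.1 if m == 0 else 0.0   # the cheap fidelity is slightly biased
        reward = true_means[k] - bias + random.gauss(0.0, 0.1)
        policy.update(k, m, reward)
    print("estimated best arm:", max(range(5), key=policy._ucb))
```

The escalation rule mirrors the intuition in the abstract: an arm is only promoted to a more expensive fidelity once the cheaper estimate is resolved to within its own bias, so clearly suboptimal arms are eliminated at low cost.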
Pages: 9
Related papers
50 records in total
  • [11] Multi-armed Bandit with Additional Observations
    Yun, Donggyu
    Proutiere, Alexandre
    Ahn, Sumyeong
    Shin, Jinwoo
    Yi, Yung
    PROCEEDINGS OF THE ACM ON MEASUREMENT AND ANALYSIS OF COMPUTING SYSTEMS, 2018, 2 (01)
  • [12] IMPROVING STRATEGIES FOR THE MULTI-ARMED BANDIT
    POHLENZ, S
    MARKOV PROCESS AND CONTROL THEORY, 1989, 54 : 158 - 163
  • [13] MULTI-ARMED BANDIT ALLOCATION INDEXES
    JONES, PW
    JOURNAL OF THE OPERATIONAL RESEARCH SOCIETY, 1989, 40 (12) : 1158 - 1159
  • [14] THE MULTI-ARMED BANDIT PROBLEM WITH COVARIATES
    Perchet, Vianney
    Rigollet, Philippe
    ANNALS OF STATISTICS, 2013, 41 (02): : 693 - 721
  • [16] ON MULTI-ARMED BANDIT PROBLEM WITH NUISANCE PARAMETER
    孙嘉阳
    Science China Mathematics, 1986, (05) : 464 - 475
  • [17] Multi-armed bandit algorithms and empirical evaluation
    Vermorel, J
    Mohri, M
    MACHINE LEARNING: ECML 2005, PROCEEDINGS, 2005, 3720 : 437 - 448
  • [18] Sustainable Cooperative Coevolution with a Multi-Armed Bandit
    De Rainville, Francois-Michel
    Sebag, Michele
    Gagne, Christian
    Schoenauer, Marc
    Laurendeau, Denis
    GECCO'13: PROCEEDINGS OF THE 2013 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, 2013, : 1517 - 1524
  • [19] Identifying Outlier Arms in Multi-Armed Bandit
    Zhuang, Honglei
    Wang, Chi
    Wang, Yifan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [20] Characterizing Truthful Multi-Armed Bandit Mechanisms
    Babaioff, Moshe
    Sharma, Yogeshwer
    Slivkins, Aleksandrs
    10TH ACM CONFERENCE ON ELECTRONIC COMMERCE - EC 2009, 2009, : 79 - 88