The Multi-fidelity Multi-armed Bandit

Cited by: 0
Authors
Kandasamy, Kirthevasan [1 ]
Dasarathy, Gautam [2 ]
Schneider, Jeff [1 ]
Poczos, Barnabas [1 ]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Rice Univ, Houston, TX 77251 USA
Keywords
DOI
Not available
CLC number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
We study a variant of the classical stochastic K-armed bandit where observing the outcome of each arm is expensive, but cheap approximations to this outcome are available. For example, in online advertising the performance of an ad can be approximated by displaying it for shorter time periods or to narrower audiences. We formalise this task as a multi-fidelity bandit, where, at each time step, the forecaster may choose to play an arm at any one of M fidelities. The highest fidelity (the desired outcome) expends cost λ^(M). The mth fidelity (an approximation) expends λ^(m) < λ^(M) and returns a biased estimate of the highest fidelity. We develop MF-UCB, a novel upper confidence bound procedure for this setting, and prove that it naturally adapts to the sequence of available approximations and costs, thus attaining better regret than naive strategies which ignore the approximations. For instance, in the above online advertising example, MF-UCB would use the lower fidelities to quickly eliminate suboptimal ads and reserve the large, expensive experiments for a small set of promising candidates. We complement this result with a lower bound and show that MF-UCB is nearly optimal under certain conditions.
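To make the setting concrete, the sketch below shows one way a multi-fidelity upper-confidence-bound strategy could be organised, following the abstract's description: cheap, biased fidelities are used to screen arms, and the expensive highest fidelity is reserved for promising ones. This is not the paper's exact MF-UCB rule; the bias bounds `zeta`, the exploration constant `rho`, the fidelity-escalation threshold, and the toy demo are illustrative assumptions introduced here.

```python
import math

class MultiFidelityUCB:
    """Hypothetical multi-fidelity UCB sketch (not the paper's exact MF-UCB rule).

    Assumes K arms, M fidelities with per-play costs `costs[m]`, and known
    bounds `zeta[m]` on how far fidelity m's mean can lie from the
    highest-fidelity mean (zeta for the highest fidelity is 0)."""

    def __init__(self, num_arms, costs, zeta, rho=2.0):
        self.K = num_arms
        self.M = len(costs)
        self.costs = costs      # cost of playing at each fidelity (last = highest)
        self.zeta = zeta        # assumed bias bound of each fidelity vs. the highest
        self.rho = rho          # exploration constant (assumption)
        self.counts = [[0] * self.M for _ in range(self.K)]
        self.sums = [[0.0] * self.M for _ in range(self.K)]
        self.t = 0

    def _ucb(self, k):
        """Upper confidence bound on arm k's highest-fidelity mean,
        taking the tightest bound over the fidelities observed so far."""
        bound = float("inf")
        for m in range(self.M):
            n = self.counts[k][m]
            if n == 0:
                continue        # no data at this fidelity yet
            mean = self.sums[k][m] / n
            width = math.sqrt(self.rho * math.log(max(self.t, 2)) / n)
            bound = min(bound, mean + width + self.zeta[m])
        return bound            # inf if the arm has never been played

    def select(self):
        """Pick the arm with the largest UCB, then the cheapest fidelity whose
        confidence width still exceeds its bias bound (else the highest)."""
        self.t += 1
        k = max(range(self.K), key=self._ucb)
        for m in range(self.M - 1):
            n = self.counts[k][m]
            if n == 0:
                return k, m
            width = math.sqrt(self.rho * math.log(max(self.t, 2)) / n)
            if width > self.zeta[m]:
                return k, m
        return k, self.M - 1

    def update(self, k, m, reward):
        self.counts[k][m] += 1
        self.sums[k][m] += reward


if __name__ == "__main__":
    import random
    # Toy demo in the spirit of the advertising example: 5 ads, two fidelities
    # (a cheap short trial assumed to be within 0.2 of the true mean, and the
    # expensive full campaign). All numbers here are made up for illustration.
    true_means = [0.2, 0.4, 0.5, 0.3, 0.6]
    policy = MultiFidelityUCB(num_arms=5, costs=[1.0, 10.0], zeta=[0.2, 0.0])
    for _ in range(200):
        k, m = policy.select()
        bias = 0.1 if m == 0 else 0.0   # the cheap fidelity is slightly biased
        reward = true_means[k] - bias + random.gauss(0.0, 0.1)
        policy.update(k, m, reward)
    print("estimated best arm:", max(range(5), key=policy._ucb))
```

The escalation rule mirrors the intuition in the abstract: an arm is only promoted to a more expensive fidelity once the cheaper estimate is resolved to within its own bias, so clearly suboptimal arms are eliminated at low cost.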
Pages: 9
Related papers
50 records in total
  • [11] Multi-armed Bandit with Additional Observations
    Yun, Donggyu
    Proutiere, Alexandre
    Ahn, Sumyeong
    Shin, Jinwoo
    Yi, Yung
    PROCEEDINGS OF THE ACM ON MEASUREMENT AND ANALYSIS OF COMPUTING SYSTEMS, 2018, 2 (01)
  • [12] IMPROVING STRATEGIES FOR THE MULTI-ARMED BANDIT
    POHLENZ, S
    MARKOV PROCESS AND CONTROL THEORY, 1989, 54 : 158 - 163
  • [13] MULTI-ARMED BANDIT ALLOCATION INDEXES
    JONES, PW
    JOURNAL OF THE OPERATIONAL RESEARCH SOCIETY, 1989, 40 (12) : 1158 - 1159
  • [14] THE MULTI-ARMED BANDIT PROBLEM WITH COVARIATES
    Perchet, Vianney
    Rigollet, Philippe
    ANNALS OF STATISTICS, 2013, 41 (02): : 693 - 721
  • [16] ON MULTI-ARMED BANDIT PROBLEM WITH NUISANCE PARAMETER
    孙嘉阳
    Science China Mathematics, 1986, (05) : 464 - 475
  • [17] Multi-armed bandit algorithms and empirical evaluation
    Vermorel, J
    Mohri, M
    MACHINE LEARNING: ECML 2005, PROCEEDINGS, 2005, 3720 : 437 - 448
  • [18] Sustainable Cooperative Coevolution with a Multi-Armed Bandit
    De Rainville, Francois-Michel
    Sebag, Michele
    Gagne, Christian
    Schoenauer, Marc
    Laurendeau, Denis
    GECCO'13: PROCEEDINGS OF THE 2013 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, 2013, : 1517 - 1524
  • [19] Identifying Outlier Arms in Multi-Armed Bandit
    Zhuang, Honglei
    Wang, Chi
    Wang, Yifan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [20] Characterizing Truthful Multi-Armed Bandit Mechanisms
    Babaioff, Moshe
    Sharma, Yogeshwer
    Slivkins, Aleksandrs
    10TH ACM CONFERENCE ON ELECTRONIC COMMERCE - EC 2009, 2009, : 79 - 88