Bayesian Optimization for Policy Search via Online-Offline Experimentation

Cited by: 0
Authors
Letham, Benjamin [1]
Bakshy, Eytan [1]
Affiliations
[1] Facebook, Menlo Park, CA 94025, USA
Keywords
Bayesian optimization; multi-task Gaussian process; policy search; A/B testing; multi-fidelity optimization; MULTIVARIATE; ALGORITHM;
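DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract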
Online field experiments are the gold-standard way of evaluating changes to real-world interactive machine learning systems. Yet our ability to explore complex, multi-dimensional policy spaces, such as those found in recommendation and ranking problems, is often constrained by the limited number of experiments that can be run simultaneously. To alleviate these constraints, we augment online experiments with an offline simulator and apply multi-task Bayesian optimization to tune live machine learning systems. We describe practical issues that arise in these types of applications, including biases that arise from using a simulator and assumptions for the multi-task kernel. We measure empirical learning curves which show substantial gains from including data from biased offline experiments, and show how these learning curves are consistent with theoretical results for multi-task Gaussian process generalization. We find that improved kernel inference is a significant driver of multi-task generalization. Finally, we show several examples of Bayesian optimization efficiently tuning a live machine learning system by combining offline and online experiments.
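The idea of borrowing strength from a biased offline simulator can be illustrated with a small multi-task Gaussian process. The sketch below is not the authors' implementation: it assumes an intrinsic coregionalization form for the multi-task kernel (a task covariance matrix multiplied by a squared-exponential kernel over the policy parameter), a zero-mean prior, fixed hyperparameters, and synthetic data. All names (`icm_kernel`, `posterior`, the matrix `B`, the 0.3 simulator bias) are hypothetical and chosen for illustration only.

```python
import math

def rbf(x1, x2, lengthscale=0.3):
    """Squared-exponential kernel over a 1-D policy parameter."""
    return math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)

# Assumed task covariance: index 0 = offline simulator, index 1 = online test.
# The off-diagonal entry sets how strongly the two tasks are correlated.
B = [[1.0, 0.9],
     [0.9, 1.0]]

def icm_kernel(a, b):
    """k((x, t), (x', t')) = B[t][t'] * rbf(x, x')."""
    (x1, t1), (x2, t2) = a, b
    return B[t1][t2] * rbf(x1, x2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def posterior(train, y, query, noise=1e-2):
    """Zero-mean GP posterior mean and latent variance at one (x, task) point."""
    n = len(train)
    K = [[icm_kernel(train[i], train[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    k_star = [icm_kernel(query, p) for p in train]
    mean = sum(k * a for k, a in zip(k_star, chol_solve(L, y)))
    var = icm_kernel(query, query) - sum(
        k * v for k, v in zip(k_star, chol_solve(L, k_star)))
    return mean, var

# Synthetic data: many cheap offline runs (with an assumed constant bias of 0.3)
# and only two online observations.
offline_x = [i / 10 for i in range(11)]
online_x = [0.2, 0.8]
y_offline = [math.sin(2 * math.pi * x) + 0.3 for x in offline_x]
y_online = [math.sin(2 * math.pi * x) for x in online_x]

train_multi = [(x, 0) for x in offline_x] + [(x, 1) for x in online_x]
train_online = [(x, 1) for x in online_x]

query = (0.5, 1)  # predict the online outcome at an untested policy value
mean_multi, var_multi = posterior(train_multi, y_offline + y_online, query)
mean_online_only, var_online_only = posterior(train_online, y_online, query)

print("online only  : mean=%.3f var=%.3f" % (mean_online_only, var_online_only))
print("with offline : mean=%.3f var=%.3f" % (mean_multi, var_multi))
```

Under these assumptions, conditioning on the offline runs lowers the posterior variance for the online task at the untested point, which is the mechanism a Bayesian optimization loop would exploit when choosing the next online experiment. In practice the task covariance and kernel hyperparameters would be inferred from data rather than fixed as here.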
Pages: 30
Related papers
50 in total
  • [1] Bayesian optimization for policy search via online-offline experimentation
    Letham, Benjamin
    Bakshy, Eytan
    Journal of Machine Learning Research, 2019, 20
  • [2] ONLINE-OFFLINE COM
    FURLONG, JD
    JOURNAL OF MICROGRAPHICS, 1981, 14 (12) : 38 - 39
  • [3] Integrated online-offline methods for audio segmentation
    Thornburg, HD
    Smith, JO
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS, 2001 : 4026 - 4026
  • [4] Price discrimination under online-offline competition
    Guo, Wen-Chung
    Lai, Fu-Chuan
    ECONOMICS LETTERS, 2022, 216
  • [5] Young children's online-offline balance
    Radesky, Jenny S.
    ACTA PAEDIATRICA, 2021, 110 (03) : 748 - 749
  • [6] A Study on the Online-Offline and Blended Learning Methods
    Sharma D.
    Sood A.K.
    Darius P.S.H.
    Gundabattini E.
    Darius Gnanaraj S.
    Joseph Jeyapaul A.
    Journal of The Institution of Engineers (India): Series B, 2022, 103 (04) : 1373 - 1382
  • [7] Chinese Trans Women in Japan and Their Embodied Search for Gender Identity in the Online-Offline Continuum
    Wang, Xinyu Promio
    TRANSFERS-INTERDISCIPLINARY JOURNAL OF MOBILITY STUDIES, 2022, 12 (03) : 28 - 46
  • [8] ALICE : online-offline processing for Run 3
    Rohr, David
    EIGHTH ANNUAL CONFERENCE ON LARGE HADRON COLLIDER PHYSICS, LHCP2020, 2021
  • [9] Rethinking the online-offline connection in the study of religion online INTRODUCTION
    Campbell, Heidi A.
    Loevheim, Mia
    INFORMATION COMMUNICATION & SOCIETY, 2011, 14 (08) : 1083 - 1096
  • [10] An Online-Offline Combined Big Data Mining Platform
    Zhang, Weishan
    Lv, Hao
    Xu, Liang
    Liu, Yan
    Liu, Xin
    Lu, Qinghua
    Li, Zhongwei
    Zhou, Jiehan
    2017 IEEE 15TH INTL CONF ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING, 15TH INTL CONF ON PERVASIVE INTELLIGENCE AND COMPUTING, 3RD INTL CONF ON BIG DATA INTELLIGENCE AND COMPUTING AND CYBER SCIENCE AND TECHNOLOGY CONGRESS(DASC/PICOM/DATACOM/CYBERSCI, 2017, : 1220 - 1225