Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring

Cited by: 0

Authors:

Wan, Runzhe [1]

Liu, Yu [1]

McQueen, James [1]

Hains, Doug [1]

Song, Rui [1]

Affiliations:

[1] Amazon, Seattle, WA 98109 USA

Source:

PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023 | 2023

Keywords:

Sequential Decision Making; A/B testing; Reinforcement learning;

DOI:

10.1145/3580305.3599818

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

With the growing need for online A/B testing to support innovation in industry, the opportunity cost of running an experiment becomes non-negligible. Therefore, there is an increasing demand for an efficient continuous monitoring service that allows early stopping when appropriate. Classic statistical methods focus on hypothesis testing and are mostly developed for traditional high-stakes problems such as clinical trials, while experiments at online service companies typically have very different features and focuses. Motivated by these real needs, in this paper we introduce a novel framework that we developed at Amazon to maximize customer experience and control opportunity cost. We formulate the problem as a Bayesian optimal sequential decision-making problem that has a unified utility function. We extensively discuss practical design choices and considerations. We further introduce how to solve for the optimal decision rule via Reinforcement Learning and how to scale the solution. We show the effectiveness of this novel approach compared with existing methods via a large-scale meta-analysis of experiments at Amazon.


Pages: 5016-5027

Number of pages: 12

50 items in total

[22] Quantum reinforcement learning during human decision-making
Li, Ji-An
Dong, Daoyi
Wei, Zhengde
Liu, Ying
Pan, Yu
Nori, Franco
Zhang, Xiaochu
NATURE HUMAN BEHAVIOUR, 2020, 4 (03) : 294 - 307
[23] Reinforcement learning for decision-making under deep uncertainty
Pei, Zhihao
Rojas-Arevalo, Angela M.
de Haan, Fjalar J.
Lipovetzky, Nir
Moallemi, Enayat A.
JOURNAL OF ENVIRONMENTAL MANAGEMENT, 2024, 359
[24] Application of Reinforcement Learning in Multiagent Intelligent Decision-Making
Han, Xiaoyu
COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2022, 2022
[25] Learning Structural Weight Uncertainty for Sequential Decision-Making
Zhang, Ruiyi
Li, Chunyuan
Chen, Changyou
Carin, Lawrence
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84
[26] Organizational decision-making and the returns to experimentation
Hall, Todd A.
Hasan, Sharique
JOURNAL OF ORGANIZATION DESIGN, 2022, 11 (04) : 129 - 144
[28] SEQUENTIAL BAYESIAN LEARNING IN LINEAR NETWORKS WITH RANDOM DECISION MAKING
Wang, Yunlong
Djuric, Petar M.
2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
[29] A Dual Decision-Making Continuous Reinforcement Learning Method Based on Sim2Real
Xiao, Wenwen
Wang, Xinzhi
Luo, Xiangfeng
Xie, Shaorong
INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2024, 34 (03) : 467 - 488
[30] Cost-Aware Bayesian Sequential Decision-Making for Search and Classification
Wang, Y.
Hussein, I. I.
Brown, D. R., III
Erwin, R. S.
IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS, 2012, 48 (03) : 2566 - 2581
