Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring

Authors
Wan, Runzhe [1]
Liu, Yu [1]
McQueen, James [1]
Hains, Doug [1]
Song, Rui [1]
Affiliations
[1] Amazon, Seattle, WA 98109 USA
Keywords
Sequential Decision Making; A/B testing; Reinforcement learning;
DOI
10.1145/3580305.3599818
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
With the growing need for online A/B testing to support innovation in industry, the opportunity cost of running an experiment becomes non-negligible. Therefore, there is an increasing demand for an efficient continuous monitoring service that allows early stopping when appropriate. Classic statistical methods focus on hypothesis testing and are mostly developed for traditional high-stakes problems such as clinical trials, while experiments at online service companies typically have very different features and focuses. Motivated by these real needs, in this paper we introduce a novel framework that we developed at Amazon to maximize customer experience and control opportunity cost. We formulate the problem as a Bayesian optimal sequential decision-making problem that has a unified utility function. We extensively discuss practical design choices and considerations. We further introduce how to solve for the optimal decision rule via Reinforcement Learning and how to scale the solution. We show the effectiveness of this novel approach compared with existing methods via a large-scale meta-analysis of experiments at Amazon.
Pages: 5016-5027
Number of pages: 12