Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring

Cited by: 0
Authors
Wan, Runzhe [1]
Liu, Yu [1]
McQueen, James [1]
Hains, Doug [1]
Song, Rui [1]
Affiliations
[1] Amazon, Seattle, WA 98109 USA
Keywords
Sequential decision making; A/B testing; Reinforcement learning
DOI
10.1145/3580305.3599818
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the growing need for online A/B testing to support innovation in industry, the opportunity cost of running an experiment becomes non-negligible. Therefore, there is an increasing demand for an efficient continuous monitoring service that allows early stopping when appropriate. Classic statistical methods focus on hypothesis testing and are mostly developed for traditional high-stakes problems such as clinical trials, while experiments at online service companies typically have very different features and focuses. Motivated by these real needs, in this paper we introduce a novel framework that we developed at Amazon to maximize customer experience and control opportunity cost. We formulate the problem as a Bayesian optimal sequential decision-making problem with a unified utility function. We extensively discuss practical design choices and considerations. We further describe how to solve for the optimal decision rule via reinforcement learning and how to scale the solution. We show the effectiveness of this novel approach compared with existing methods via a large-scale meta-analysis of experiments at Amazon.
Pages: 5016-5027
Number of pages: 12
Related papers
50 in total
  • [41] A Multiple-Attribute Decision-Making Approach to Reinforcement Learning
    Shi, Haobin
    Xu, Meng
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2020, 12 (04) : 695 - 708
  • [42] Historical Decision-Making Regularized Maximum Entropy Reinforcement Learning
    Dong, Botao
    Huang, Longyang
    Pang, Ning
    Chen, Hongtian
    Zhang, Weidong
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024,
  • [43] Research on Decision-Making in Emotional Agent Based on Reinforcement Learning
    Feng Chao
    Chen Lin
    Jiang Kui
    Wei Zhonglin
    Zhai Bing
    2016 2ND IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2016, : 1191 - 1194
  • [44] SPACECRAFT DECISION-MAKING AUTONOMY USING DEEP REINFORCEMENT LEARNING
    Harris, Andrew
    Teil, Thibaud
    Schaub, Hanspeter
    SPACEFLIGHT MECHANICS 2019, VOL 168, PTS I-IV, 2019, 168 : 1757 - 1775
  • [45] Reinforcement learning applied to a situation awareness decision-making model
    Costa, Renato D.
    Hirata, Celso M.
    INFORMATION SCIENCES, 2025, 704
  • [46] BAYESIAN MODEL OF DECISION-MAKING AS A RESULT OF LEARNING FROM EXPERIENCE
    SHUBERT, BO
    ANNALS OF MATHEMATICAL STATISTICS, 1969, 40 (06) : 2127 - &
  • [47] Meta-Learning Hypothesis Spaces for Sequential Decision-making
    Kassraie, Parnian
    Rothfuss, Jonas
    Krause, Andreas
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022, : 10802 - 10824
  • [48] Dyna-PPO reinforcement learning with Gaussian process for the continuous action decision-making in autonomous driving
    Wu, Guanlin
    Fang, Wenqi
    Wang, Ji
    Ge, Pin
    Cao, Jiang
    Ping, Yang
    Gou, Peng
    APPLIED INTELLIGENCE, 2023, 53 (13) : 16893 - 16907
  • [50] Bayesian Sequential Learning and Decision Making in Bike-Sharing Systems
    Aktekin, Tevfik
    Kim, Bumsoo
    Novoa, Luis J.
    Zafari, Babak
    APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY, 2024, 40 (06) : 1675 - 1688