Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring

Cited by: 0

Authors:

Wan, Runzhe [1]

Liu, Yu [1]

McQueen, James [1]

Hains, Doug [1]

Song, Rui [1]

Affiliations:

[1] Amazon, Seattle, WA 98109 USA

Source:

PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023 | 2023

Keywords:

Sequential Decision Making; A/B testing; Reinforcement learning;

DOI:

10.1145/3580305.3599818

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812;

Abstract:

With the growing need for online A/B testing to support innovation in industry, the opportunity cost of running an experiment has become non-negligible. There is therefore an increasing demand for an efficient continuous monitoring service that allows early stopping when appropriate. Classic statistical methods focus on hypothesis testing and were mostly developed for traditional high-stakes problems such as clinical trials, whereas experiments at online service companies typically have very different features and focuses. Motivated by these real needs, in this paper we introduce a novel framework that we developed at Amazon to maximize customer experience and control opportunity cost. We formulate the problem as a Bayesian optimal sequential decision-making problem with a unified utility function. We discuss practical design choices and considerations extensively. We further describe how to solve for the optimal decision rule via Reinforcement Learning and how to scale the solution. We show the effectiveness of this novel approach compared with existing methods via a large-scale meta-analysis of experiments at Amazon.

Pages: 5016 - 5027

Page count: 12

50 items in total

[41] A Multiple-Attribute Decision-Making Approach to Reinforcement Learning
Shi, Haobin
Xu, Meng
IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2020, 12 (04) : 695 - 708
[42] Historical Decision-Making Regularized Maximum Entropy Reinforcement Learning
Dong, Botao
Huang, Longyang
Pang, Ning
Chen, Hongtian
Zhang, Weidong
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024,
[43] Research on Decision-Making in Emotional Agent Based on Reinforcement Learning
Feng Chao
Chen Lin
Jiang Kui
Wei Zhonglin
Zhai Bing
2016 2ND IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2016, : 1191 - 1194
[44] SPACECRAFT DECISION-MAKING AUTONOMY USING DEEP REINFORCEMENT LEARNING
Harris, Andrew
Teil, Thibaud
Schaub, Hanspeter
SPACEFLIGHT MECHANICS 2019, VOL 168, PTS I-IV, 2019, 168 : 1757 - 1775
[45] Reinforcement learning applied to a situation awareness decision-making model
Costa, Renato D.
Hirata, Celso M.
INFORMATION SCIENCES, 2025, 704
[46] BAYESIAN MODEL OF DECISION-MAKING AS A RESULT OF LEARNING FROM EXPERIENCE
SHUBERT, BO
ANNALS OF MATHEMATICAL STATISTICS, 1969, 40 (06) : 2127 - &
[47] Meta-Learning Hypothesis Spaces for Sequential Decision-making
Kassraie, Parnian
Rothfuss, Jonas
Krause, Andreas
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022, : 10802 - 10824
[48] Dyna-PPO reinforcement learning with Gaussian process for the continuous action decision-making in autonomous driving
Wu, Guanlin
Fang, Wenqi
Wang, Ji
Ge, Pin
Cao, Jiang
Ping, Yang
Gou, Peng
APPLIED INTELLIGENCE, 2023, 53 (13) : 16893 - 16907
[49] Dyna-PPO reinforcement learning with Gaussian process for the continuous action decision-making in autonomous driving
Guanlin Wu
Wenqi Fang
Ji Wang
Pin Ge
Jiang Cao
Yang Ping
Peng Gou
Applied Intelligence, 2023, 53 : 16893 - 16907
[50] Bayesian Sequential Learning and Decision Making in Bike-Sharing Systems
Aktekin, Tevfik
Kim, Bumsoo
Novoa, Luis J.
Zafari, Babak
APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY, 2024, 40 (06) : 1675 - 1688
