False Discovery in A/B Testing

Cited by: 8

Authors:

Berman, Ron [1]

Van den Bulte, Christophe [1]

Affiliations:

[1] Univ Penn, Wharton Sch, Marketing, Philadelphia, PA 19104 USA

Source:

MANAGEMENT SCIENCE | 2022 / Volume 68 / Issue 09

Keywords:

statistics; design of experiments; decision analysis; inference; A/B testing; false discovery rate; STATISTICAL SIGNIFICANCE; POWER CALCULATIONS; DESIGN;

DOI:

10.1287/mnsc.2021.4207

Chinese Library Classification number:

C93 [Management];

Discipline classification codes:

12 ; 1201 ; 1202 ; 120202 ;

Abstract:

We investigate what fraction of all significant results in website A/B testing is actually null effects (i.e., the false discovery rate (FDR)). Our data consist of 4,964 effects from 2,766 experiments conducted on a commercial A/B testing platform. Using three different methods, we find that the FDR ranges between 28% and 37% for tests conducted at 10% significance and between 18% and 25% for tests at 5% significance (two sided). These high FDRs stem mostly from the high fraction of true null effects, about 70%, rather than from low power. Using our estimates, we also assess the potential of various A/B test designs to reduce the FDR. The two main implications are that decision makers should expect one in five interventions achieving significance at 5% confidence to be ineffective when deployed in the field and that analysts should consider using two-stage designs with multiple variations rather than basic A/B tests.
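The abstract's point that a high share of true nulls, rather than low power, drives the FDR can be illustrated with the textbook relation FDR = π0·α / (π0·α + (1 − π0)·power). The sketch below is only an illustration under simplifying assumptions: it is not one of the paper's three estimation methods, it ignores sign errors in two-sided tests, and the power values in the loop are hypothetical placeholders rather than estimates from the paper. Only the 70% null share and the 5% and 10% significance levels come from the abstract.

```python
def false_discovery_rate(pi0: float, alpha: float, power: float) -> float:
    """Expected share of significant results that are true nulls.

    pi0   -- fraction of tested effects that are truly null
    alpha -- significance level (probability a true null is significant)
    power -- probability a true non-null effect is significant
    """
    for name, value in (("pi0", pi0), ("alpha", alpha), ("power", power)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    false_positives = pi0 * alpha          # nulls that reach significance
    true_positives = (1.0 - pi0) * power   # real effects that reach significance
    total_significant = false_positives + true_positives
    if total_significant == 0.0:
        return 0.0  # nothing is ever significant, so no false discoveries
    return false_positives / total_significant


if __name__ == "__main__":
    PI0 = 0.70  # share of true nulls reported in the abstract
    # Hypothetical power levels, chosen only to show the sensitivity.
    for alpha in (0.05, 0.10):
        for power in (0.3, 0.5, 0.8):
            fdr = false_discovery_rate(PI0, alpha, power)
            print(f"alpha={alpha:.2f} power={power:.1f} FDR={fdr:.3f}")
```

Holding power fixed and lowering `pi0` shrinks the result much faster than holding `pi0` fixed and raising power, which is consistent with the abstract's attribution of the high FDR to the large fraction of null effects.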


Pages: 6762-6782

Number of pages: 21

50 items in total

[21] FALSE POSITIVE AND FALSE NEGATIVE REACTIONS IN HLA-B-27 ANTIGEN TESTING
LARSEN, AE
TRANSFUSION, 1979, 19 (02) : 219 - 221
[22] ONLINE RULES FOR CONTROL OF FALSE DISCOVERY RATE AND FALSE DISCOVERY EXCEEDANCE
Javanmard, Adel
Montanari, Andrea
ANNALS OF STATISTICS, 2018, 46 (02) : 526 - 554
[23] Screening-Assisted Dynamic Multiple Testing with False Discovery Rate Control
MUSHTAQ Iram
ZHOU Qin
ZI Xuemin
Journal of Systems Science & Complexity, 2023, 36 (02) : 716 - 754
[24] Structure-Adaptive Sequential Testing for Online False Discovery Rate Control
Gang, Bowen
Sun, Wenguang
Wang, Weinan
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2023, 118 (541) : 732 - 745
[25] Power and stability comparisons of multiple testing procedures with false discovery rate control
Li, Dongmei
JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2015, 85 (14) : 2808 - 2822
[26] QuTE: decentralized multiple testing on sensor networks with false discovery rate control
Ramdas, Aaditya
Chen, Jianbo
Wainwright, Martin J.
Jordan, Michael I.
2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2017,
[27] Screening-Assisted Dynamic Multiple Testing with False Discovery Rate Control
Mushtaq, Iram
Zhou, Qin
Zi, Xuemin
JOURNAL OF SYSTEMS SCIENCE & COMPLEXITY, 2023, 36 (02) : 716 - 754
[28] Screening-Assisted Dynamic Multiple Testing with False Discovery Rate Control
Iram Mushtaq
Qin Zhou
Xuemin Zi
Journal of Systems Science and Complexity, 2023, 36 : 716 - 754
[29] Normalization, testing, and false discovery rate estimation for RNA-sequencing data
Li, Jun
Witten, Daniela M.
Johnstone, Iain M.
Tibshirani, Robert
BIOSTATISTICS, 2012, 13 (03) : 523 - 538
[30] A review of modern multiple hypothesis testing, with particular attention to the false discovery proportion
Farcomeni, Alessio
STATISTICAL METHODS IN MEDICAL RESEARCH, 2008, 17 (04) : 347 - 388
