False Discovery in A/B Testing

Cited by: 8
Authors
Berman, Ron [1 ]
Van den Bulte, Christophe [1 ]
Affiliations
[1] Univ Penn, Wharton Sch, Marketing, Philadelphia, PA 19104 USA
Keywords
statistics; design of experiments; decision analysis; inference; A/B testing; false discovery rate; statistical significance; power calculations; design
DOI
10.1287/mnsc.2021.4207
Chinese Library Classification number
C93 [Management]
Discipline classification codes
12; 1201; 1202; 120202
Abstract
We investigate what fraction of all significant results in website A/B testing is actually null effects (i.e., the false discovery rate (FDR)). Our data consist of 4,964 effects from 2,766 experiments conducted on a commercial A/B testing platform. Using three different methods, we find that the FDR ranges between 28% and 37% for tests conducted at 10% significance and between 18% and 25% for tests at 5% significance (two sided). These high FDRs stem mostly from the high fraction of true null effects, about 70%, rather than from low power. Using our estimates, we also assess the potential of various A/B test designs to reduce the FDR. The two main implications are that decision makers should expect one in five interventions achieving significance at 5% confidence to be ineffective when deployed in the field and that analysts should consider using two-stage designs with multiple variations rather than basic A/B tests.
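To make the relationship in the abstract concrete, the sketch below computes the FDR from the textbook identity FDR = π0·α / (π0·α + (1 − π0)·power). This is a simplified illustration, not one of the three estimation methods used in the paper. The null fraction of 0.70 comes from the abstract; the power values (0.50 and 1.00) are hypothetical assumptions chosen only for illustration, and the function name is ours.

```python
def false_discovery_rate(pi0: float, alpha: float, power: float) -> float:
    """Expected share of significant results that are true nulls.

    pi0   : fraction of tested effects that are truly null
    alpha : significance level of each test
    power : probability of detecting a true non-null effect
    """
    false_positives = pi0 * alpha           # nulls that reach significance
    true_positives = (1.0 - pi0) * power    # real effects that reach significance
    return false_positives / (false_positives + true_positives)


PI0 = 0.70  # "about 70%" true nulls, as stated in the abstract

# Power is NOT given in the abstract; both values below are assumptions.
for assumed_power in (0.50, 1.00):
    for alpha in (0.05, 0.10):
        fdr = false_discovery_rate(PI0, alpha, assumed_power)
        print(f"power={assumed_power:.2f}, alpha={alpha:.2f}: FDR = {fdr:.1%}")
```

Under these assumptions, even perfect power leaves the FDR at roughly 10% for tests at the 5% level, which illustrates why a high share of true nulls, rather than low power, can dominate the false discovery rate.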
Pages: 6762-6782
Number of pages: 21