A/B testing: A systematic literature review

Cited by: 4
Authors
Quin, Federico [1]
Weyns, Danny [1, 2]
Galster, Matthias [3]
Silva, Camila Costa [3]
Affiliations
[1] Distrinet, KU Leuven, Celestijnenlaan 200A, B-3000 Leuven, Belgium
[2] Linnaeus Univ, Univ Platsen 1, S-35252 Vaxjo, Sweden
[3] Univ Canterbury, 69 Creyke Rd, Christchurch 8140, New Zealand
Keywords
A/B testing; systematic literature review; A/B test engineering; SOFTWARE; EXPERIMENTATION; KNOWLEDGE; PRODUCT; MODELS;
DOI
10.1016/j.jss.2024.112011
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
A/B testing, also referred to as online controlled experimentation or continuous experimentation, is a form of hypothesis testing where two variants of a piece of software are compared in the field from an end user's point of view. A/B testing is widely used in practice to enable data-driven decision making for software development. While a few studies have explored different facets of research on A/B testing, no comprehensive study has been conducted on the state of the art in A/B testing. Such a study is crucial to provide a systematic overview of the field and to drive future research forward. To address this gap, this paper reports the results of a systematic literature review that analyzed primary studies. The research questions focused on the subject of A/B testing, how A/B tests are designed and executed, what roles stakeholders have in this process, and the open challenges in the area. Analysis of the extracted data shows that the main targets of A/B testing are algorithms, visual elements, and workflows and processes. Single classic A/B tests are the dominant type of test, primarily based on hypothesis tests. Stakeholders have three main roles in the design of A/B tests: concept designer, experiment architect, and setup technician. The primary types of data collected during the execution of A/B tests are product/system data, user-centric data, and spatio-temporal data. The dominant uses of the test results are feature selection, feature rollout, continued feature development, and subsequent A/B test design. Stakeholders have two main roles during A/B test execution: experiment coordinator and experiment assessor. The main reported open problems are related to the enhancement of proposed approaches and their usability.
From our study we derived three interesting lines for future research: strengthening the adoption of statistical methods in A/B testing, improving the process of A/B testing, and enhancing the automation of A/B testing.
Pages: 28