A/B testing: A systematic literature review

Cited by: 4

Authors:

Quin, Federico [1]

Weyns, Danny [1,2]

Galster, Matthias [3]

Silva, Camila Costa [3]

Affiliations:

[1] Distrinet, KU Leuven, Celestijnenlaan 200A, B-3000 Leuven, Belgium

[2] Linnaeus Univ, Univ Platsen 1, S-35252 Vaxjo, Sweden

[3] Univ Canterbury, 69 Creyke Rd, Christchurch 8140, New Zealand

Source:

JOURNAL OF SYSTEMS AND SOFTWARE | 2024 / Vol. 211

Keywords:

A/B testing; systematic literature review; A/B test engineering; SOFTWARE; EXPERIMENTATION; KNOWLEDGE; PRODUCT; MODELS

DOI:

10.1016/j.jss.2024.112011

Chinese Library Classification (CLC) number:

TP31 [Computer software]

Subject classification codes:

081202; 0835

Abstract:

A/B testing, also referred to as online controlled experimentation or continuous experimentation, is a form of hypothesis testing in which two variants of a piece of software are compared in the field from an end user's point of view. A/B testing is widely used in practice to enable data-driven decision making for software development. While a few studies have explored different facets of research on A/B testing, no comprehensive study has been conducted on the state of the art in A/B testing. Such a study is crucial to provide a systematic overview of the field and to drive future research forward. To address this gap, this paper reports the results of a systematic literature review that analyzed primary studies. The research questions focused on the subject of A/B testing, how A/B tests are designed and executed, what roles stakeholders have in this process, and the open challenges in the area. Analysis of the extracted data shows that the main targets of A/B testing are algorithms, visual elements, and workflow and processes. Single classic A/B tests are the dominant type of test, primarily based on hypothesis tests. Stakeholders have three main roles in the design of A/B tests: concept designer, experiment architect, and setup technician. The primary types of data collected during the execution of A/B tests are product/system data, user-centric data, and spatio-temporal data. The dominant uses of the test results are feature selection, feature rollout, continued feature development, and subsequent A/B test design. Stakeholders have two main roles during A/B test execution: experiment coordinator and experiment assessor. The main reported open problems are related to the enhancement of proposed approaches and their usability. From our study we derived three interesting lines for future research: strengthening the adoption of statistical methods in A/B testing, improving the process of A/B testing, and enhancing the automation of A/B testing.
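
As an illustration of the "single classic A/B test based on hypothesis tests" that the abstract mentions, the following is a minimal sketch of a two-sided two-proportion z-test on conversion counts. It is not taken from the paper: the choice of test, the function name `two_proportion_z_test`, and its parameters are assumptions made only for this example.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rate between variants A and B.

    Hypothetical helper for illustration only. conv_* are conversion counts and
    n_* are the numbers of users exposed to each variant.
    Assumes the pooled rate is strictly between 0 and 1 (otherwise the
    standard error is zero and the statistic is undefined).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis "A and B convert equally".
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts: 100/1000 conversions for A versus 150/1000 for B.
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.3f}, p = {p:.5f}")
```

In this sketch, a p-value below a significance level chosen in advance (for example 0.05) would lead to rejecting the null hypothesis that both variants convert at the same rate.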


Pages: 28
