Equivalence hypothesis testing in experimental software engineering

Cited by: 5

Authors:

Javier Dolado, Jose [1]

Carmen Otero, Mari [2]

Harman, Mark [3]

Affiliations:

[1] UPV EHU Univ Basque Country, Fac Informat, San Sebastian, Spain

[2] UPV EHU Univ Basque Country, Escuela Univ Ingn Vitoria Gasteiz, Vitoria, Spain

[3] UCL, CREST, London WC1E 6BT, England

Source:

SOFTWARE QUALITY JOURNAL | 2014, Vol. 22, Issue 02

Keywords:

Equivalence hypothesis testing; Bioequivalence analysis; Program comprehension; Side-effect free programs; Crossover design; Experimental software engineering; CONFIDENCE-INTERVALS; MODEL VALIDATION; POWER; DIFFERENCE;

DOI:

10.1007/s11219-013-9196-0

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

This article introduces the application of equivalence hypothesis testing (EHT) to the field of Empirical Software Engineering. Equivalence testing (also known as bioequivalence testing in pharmacological studies) is a statistical approach that answers the question "is product T equivalent to some other reference product R within some range?" The "null hypothesis significance test" approach traditionally used in Empirical Software Engineering seeks to assess evidence for differences between T and R, not equivalence. In this paper, we explain how EHT can be applied in Software Engineering, thereby extending it from its current application within pharmacological studies to Empirical Software Engineering. We illustrate the application of EHT to Empirical Software Engineering by re-examining the behavior of experts and novices when handling code with side effects compared to side-effect free code, a study previously investigated using traditional statistical testing. We also review two other previously published datasets from software engineering experiments: one compared the comprehension of UML and OML specifications, and the other studied the differences between the specification methods UML-B and B. The application of EHT allows us to draw additional conclusions beyond the previous results. EHT has an important application in Empirical Software Engineering, which motivates its wider adoption and use: EHT can be used to assess the statistical confidence with which we can claim that two software engineering methods, algorithms, or techniques are equivalent.
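The question "is T equivalent to R within some range" is commonly framed as two one-sided tests (TOST): one test rejects the claim that T is worse than R by at least the margin, the other rejects the claim that T is better by at least the margin. The sketch below is illustrative only and is not the analysis used in the paper. It assumes two independent samples and a large-sample normal approximation (the study itself uses a crossover design, which needs a different model), and the function name `tost_equivalent`, the `margin` parameter, and the sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def tost_equivalent(t_sample, r_sample, margin, alpha=0.05):
    """Two one-sided tests (TOST) for equivalence of two means.

    Assumptions (not from the paper): independent samples, a
    large-sample normal approximation, and a symmetric equivalence
    range (-margin, +margin) on the difference of means.
    Returns (p_value, is_equivalent).
    """
    diff = mean(t_sample) - mean(r_sample)
    # Standard error of the difference for unpooled variances.
    # Note: se is zero if both samples are constant, which this sketch
    # does not handle.
    se = sqrt(stdev(t_sample) ** 2 / len(t_sample)
              + stdev(r_sample) ** 2 / len(r_sample))
    z = NormalDist()
    # Test 1: H0 says diff <= -margin; reject when diff is well above -margin.
    p_lower = 1.0 - z.cdf((diff + margin) / se)
    # Test 2: H0 says diff >= +margin; reject when diff is well below +margin.
    p_upper = z.cdf((diff - margin) / se)
    # Equivalence is claimed only if both one-sided tests reject.
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha


# Hypothetical measurements for a technique T and a reference R.
t_scores = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0]
r_scores = [10.0, 10.1, 9.9, 10.0, 10.1, 9.9]
p, equivalent = tost_equivalent(t_scores, r_scores, margin=0.5)
print(f"p = {p:.4g}, equivalent within 0.5: {equivalent}")
```

Unlike a traditional significance test, the null hypothesis here is non-equivalence, so a small p-value supports the claim that T and R differ by less than the chosen margin. The conclusion depends on that margin, which has to be justified before looking at the data.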

Pages: 215-238

Page count: 24
