Cost Measures Matter for Mutation Testing Study Validity

Times cited: 7
Authors
Guizzo, Giovani [1]
Sarro, Federica [1]
Harman, Mark [1]
Affiliations
[1] UCL, Dept Comp Sci, London, England
Funding
European Research Council;
Keywords
Software Testing; Mutation Testing; Mutation Analysis; Cost Reduction; Number of Mutants; Execution Time; Mutant Reduction;
DOI
10.1145/3368089.3409742
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Mutation testing research has often used the number of mutants as a surrogate measure for the true execution cost of generating and executing mutants. This poses a potential threat to the validity of the scientific findings reported in the literature. Out of 75 works surveyed in this paper, we found that 54 (72%) are vulnerable to this threat. To investigate the magnitude of the threat, we conducted an empirical evaluation using 10 real-world programs. The results reveal that: i) percentages of randomly sampled mutants differ from the true execution time, on average, by 44%, varying in difference from 19% to 91%; ii) errors arising from using the surrogate correlate with program size (rho = 0.74) and number of mutants (rho = 0.76), making the problem more pernicious for more realistic programs; iii) scientific findings concerning sampling strategies would have approximately 37% rank disagreement, indicating potentially dramatic impact on experiment validity. To investigate whether this threat matters in practice, we reproduced a seminal study on Selective Mutation (widely relied upon for more than two decades). The impact is stark: an inconclusive scientific finding using the surrogate is transformed to an unequivocal finding when using the true execution cost.
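The gap between the surrogate and the true cost described in the abstract can be illustrated with a minimal sketch. It is not the authors' experimental setup: the function name `surrogate_vs_true_cost`, the log-normal distribution of per-mutant times, and the 10% sampling rate are all assumptions chosen for illustration. The sketch samples a fraction of mutants at random and compares that fraction (the surrogate) with the share of total execution time the sampled mutants actually account for.

```python
import random


def surrogate_vs_true_cost(mutant_times, sample_fraction, rng):
    """Compare the two cost measures for one random sample of mutants.

    Returns (surrogate, true_cost), both as fractions of the full mutant set:
    surrogate  = sampled mutants / all mutants
    true_cost  = execution time of sampled mutants / execution time of all mutants
    """
    n = len(mutant_times)
    k = max(1, round(n * sample_fraction))
    sampled = rng.sample(range(n), k)  # random sampling without replacement
    surrogate = k / n
    true_cost = sum(mutant_times[i] for i in sampled) / sum(mutant_times)
    return surrogate, true_cost


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical per-mutant execution times in seconds (made-up, skewed data).
    mutant_times = [rng.lognormvariate(0.0, 1.5) for _ in range(200)]

    surrogate, true_cost = surrogate_vs_true_cost(mutant_times, 0.10, rng)
    relative_error = abs(surrogate - true_cost) / true_cost
    print(f"surrogate cost: {surrogate:.1%}")
    print(f"true cost:      {true_cost:.1%}")
    print(f"relative error: {relative_error:.1%}")
```

When every mutant takes the same time to run, the two measures coincide; the more skewed the per-mutant times, the further the sampled percentage can drift from the time actually spent.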
Pages: 1127-1139
Page count: 13