Cost Measures Matter for Mutation Testing Study Validity

Cited by: 7
Authors
Guizzo, Giovani [1]
Sarro, Federica [1]
Harman, Mark [1]
Affiliations
[1] UCL, Dept Comp Sci, London, England
Funding
European Research Council
Keywords
Software Testing; Mutation Testing; Mutation Analysis; Cost Reduction; Number of Mutants; Execution Time; Mutant Reduction
DOI
10.1145/3368089.3409742
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Mutation testing research has often used the number of mutants as a surrogate measure for the true execution cost of generating and executing mutants. This poses a potential threat to the validity of the scientific findings reported in the literature. Out of 75 works surveyed in this paper, we found that 54 (72%) are vulnerable to this threat. To investigate the magnitude of the threat, we conducted an empirical evaluation using 10 real-world programs. The results reveal that: i) percentages of randomly sampled mutants differ from the true execution time, on average, by 44%, varying in difference from 19% to 91%; ii) errors arising from using the surrogate correlate with program size (rho = 0.74) and number of mutants (rho = 0.76), making the problem more pernicious for more realistic programs; iii) scientific findings concerning sampling strategies would have approximately 37% rank disagreement, indicating potentially dramatic impact on experiment validity. To investigate whether this threat matters in practice, we reproduced a seminal study on Selective Mutation (widely relied upon for more than two decades). The impact is stark: an inconclusive scientific finding using the surrogate is transformed to an unequivocal finding when using the true execution cost.
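The gap between the surrogate and the true cost can be illustrated with a minimal sketch. It is not the authors' experimental procedure: the function names `cost_shares` and `random_sample_shares` and the per-mutant execution times are hypothetical, chosen only to show that a given percentage of mutants need not correspond to the same percentage of execution time.

```python
import random


def cost_shares(exec_times, selected):
    """Return (share of mutants, share of execution time) for a selection.

    exec_times: per-mutant execution times (hypothetical values, any unit).
    selected:   indices of the mutants kept by a reduction strategy.
    """
    mutant_share = len(selected) / len(exec_times)
    time_share = sum(exec_times[i] for i in selected) / sum(exec_times)
    return mutant_share, time_share


def random_sample_shares(exec_times, fraction, seed=0):
    """Randomly keep `fraction` of the mutants and report both cost measures."""
    rng = random.Random(seed)
    k = round(len(exec_times) * fraction)
    selected = rng.sample(range(len(exec_times)), k)
    return cost_shares(exec_times, selected)


if __name__ == "__main__":
    # Hypothetical, deliberately skewed times: nine cheap mutants, one expensive.
    times = [0.1] * 9 + [9.1]

    # Keeping only the nine cheap mutants: a large share of the mutants,
    # but a small share of the total execution time.
    mutants, time_cost = cost_shares(times, list(range(9)))
    print(f"mutants kept: {mutants:.0%}, execution time kept: {time_cost:.0%}")

    # A random 50% sample: the time share depends on whether the
    # expensive mutant happens to be drawn.
    mutants, time_cost = random_sample_shares(times, 0.5, seed=1)
    print(f"mutants kept: {mutants:.0%}, execution time kept: {time_cost:.0%}")
```

When all mutants take the same time the two shares coincide, so the surrogate is only exact under that assumption.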
Pages: 1127-1139
Number of pages: 13