Syntactic Versus Semantic Similarity of Artificial and Real Faults in Mutation Testing Studies

Cited by: 4
Authors
Ojdanic, Milos [1]
Garg, Aayush [1]
Khanfir, Ahmed [1]
Degiovanni, Renzo [1]
Papadakis, Mike [1]
Le Traon, Yves [2]
Affiliations
[1] Univ Luxembourg, Interdisciplinary Ctr Secur Reliabil & Trust SnT, L-1359 Esch Sur Alzette, Luxembourg
[2] Univ Luxembourg, L-1359 Esch Sur Alzette, Luxembourg
Keywords
Fault injection; fault seeding; machine learning; mutation testing; semantic model; syntactic distance; BUGS;
DOI
10.1109/TSE.2023.3277564
Chinese Library Classification number
TP31 [Computer Software];
Subject classification code
081202; 0835;
Abstract
Fault seeding is typically used in empirical studies to evaluate and compare test techniques. Central to these techniques lies the hypothesis that artificially seeded faults involve some form of realistic properties and thus provide realistic experimental results. In an attempt to strengthen realism, a recent line of research uses machine learning techniques, such as deep learning and natural language processing, to seed faults that look (syntactically) like real ones, implying that fault realism is related to syntactic similarity. This raises the question of whether seeding syntactically similar faults indeed results in semantically similar faults and, more generally, whether syntactically dissimilar faults are (semantically) far away from the real ones. We answer this question by employing four state-of-the-art fault-seeding techniques that operate in fundamentally different ways (PiTest, a popular mutation testing tool; IBIR, a tool with manually crafted fault patterns; DeepMutation, a learning-based fault-seeding framework; and μBERT, a mutation testing tool based on the pre-trained language model CodeBERT), and demonstrate that syntactic similarity does not reflect semantic similarity. We also show that 65.11%, 76.44%, 61.39% and 9.76% of the real faults of Defects4J V2 are semantically resembled by PiTest, IBIR, μBERT and DeepMutation faults, respectively.
Pages: 3922-3938
Page count: 17