Syntactic Versus Semantic Similarity of Artificial and Real Faults in Mutation Testing Studies

Cited by: 4

Authors:

Ojdanic, Milos [1]

Garg, Aayush [1]

Khanfir, Ahmed [1]

Degiovanni, Renzo [1]

Papadakis, Mike [1]

Le Traon, Yves [2]

Affiliations:

[1] Univ Luxembourg, Interdisciplinary Ctr Secur Reliabil & Trust SnT, L-1359 Esch Sur Alzette, Luxembourg

[2] Univ Luxembourg, L-1359 Esch Sur Alzette, Luxembourg

Source:

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING | 2023 / Vol. 49 / No. 07

Keywords:

Fault injection; fault seeding; machine learning; mutation testing; semantic model; syntactic distance; BUGS;

DOI:

10.1109/TSE.2023.3277564

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification codes:

081202 ; 0835 ;

Abstract:

Fault seeding is typically used in empirical studies to evaluate and compare test techniques. Central to these techniques lies the hypothesis that artificially seeded faults involve some form of realistic properties and thus provide realistic experimental results. In an attempt to strengthen realism, a recent line of research uses machine learning techniques, such as deep learning and Natural Language Processing, to seed faults that (syntactically) look like real ones, implying that fault realism is related to syntactic similarity. This raises the question of whether seeding syntactically similar faults indeed results in semantically similar faults and, more generally, whether syntactically dissimilar faults are (semantically) far away from the real ones. We answer this question by employing four state-of-the-art fault-seeding techniques that operate in fundamentally different ways (PiTest, a popular mutation testing tool; IBIR, a tool with manually crafted fault patterns; DeepMutation, a learning-based fault-seeding framework; and μBERT, a mutation testing tool based on the pre-trained language model CodeBERT), and demonstrate that syntactic similarity does not reflect semantic similarity. We also show that 65.11%, 76.44%, 61.39% and 9.76% of the real faults of Defects4J V2 are semantically resembled by PiTest, IBIR, μBERT and DeepMutation faults, respectively.

Pages: 3922-3938

Number of pages: 17
