Validating Unsupervised Machine Learning Techniques for Software Defect Prediction With Generic Metamorphic Testing

Cited by: 0
Authors
Chan, Pak Yuen Patrick [1]
Keung, Jacky [1]
Affiliations
[1] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Software; Predictive models; Machine learning; Adaptation models; Testing; Data models; Machine learning algorithms; Numerical models; Feature extraction; Mathematical models; Clustering methods; Defect detection; Clustering; machine learning; metamorphic relation; metamorphic testing; software defect prediction; validation
DOI
10.1109/ACCESS.2024.3494044
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In software defect prediction, unsupervised models are often used when labelled datasets are scarce, but validating them without prior knowledge of the data is challenging. To address this, we propose an approach that uses generic metamorphic testing (MT) to validate such models without requiring expert-derived metamorphic relations. Our method identifies five categories of generic metamorphic relations, subdivided into twenty-one individual relations, all formulated through generic Data Mutation Operators. This framework generates follow-up datasets from the source datasets and trains a software defect prediction model on each. By comparing the predictions of the source and follow-up models and identifying inconsistencies, we assess each model's sensitivity to the generic metamorphic relations as a form of validation. The approach was evaluated on twenty software defect prediction models, incorporating fourteen different machine learning algorithms and twenty high-dimensional public datasets. The results confirmed the robustness of our generic MT model, which was effective in validating software defect prediction models regardless of whether they were supervised or unsupervised. Models using Agglomerative clustering and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) did not violate any metamorphic relation, and nineteen models did not significantly violate the generic metamorphic relation "Shrinkage and Expansion". Our findings suggest that employing generic metamorphic relations, especially "Shrinkage and Expansion", can universally enhance the validation of defect prediction models. Furthermore, our model can be applied to continuously monitor software defect prediction models.
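The workflow described in the abstract (mutate the source dataset, refit the model, compare predictions) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy median-threshold predictor `fit_predict`, the mutation operators `scale` and `permute`, the `violation_rate` measure, and the sample metric values are all hypothetical. The abstract does not define the twenty-one relations, so the uniform scaling used here is only loosely suggested by the name "Shrinkage and Expansion".

```python
import random
from statistics import median


def fit_predict(rows):
    """Toy unsupervised predictor (hypothetical, not the paper's model):
    flag a module as defect-prone (1) when the sum of its metric values
    exceeds the dataset median, otherwise mark it clean (0)."""
    scores = [sum(r) for r in rows]
    cut = median(scores)
    return [1 if s > cut else 0 for s in scores]


def scale(rows, factor):
    """Hypothetical data mutation operator: multiply every value by factor."""
    return [[v * factor for v in r] for r in rows]


def permute(rows, seed):
    """Hypothetical data mutation operator: shuffle the row order.
    Returns the mutated rows and, for each new position, the source index."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    return [rows[i] for i in order], order


def violation_rate(source_pred, follow_pred):
    """Fraction of modules whose predicted label changed after mutation."""
    diffs = sum(a != b for a, b in zip(source_pred, follow_pred))
    return diffs / len(source_pred)


# Made-up source dataset: one row of software metrics per module.
source = [[10, 2, 1], [250, 14, 9], [40, 3, 2], [900, 30, 21], [75, 6, 4]]
source_pred = fit_predict(source)

# Follow-up datasets built by uniform shrinkage (0.5) and expansion (2.0).
shrink_rate = violation_rate(source_pred, fit_predict(scale(source, 0.5)))
expand_rate = violation_rate(source_pred, fit_predict(scale(source, 2.0)))

# Follow-up dataset built by row permutation; predictions are realigned
# to the source order before comparison.
shuffled, order = permute(source, seed=0)
shuffled_pred = fit_predict(shuffled)
realigned = [0] * len(source)
for position, source_index in enumerate(order):
    realigned[source_index] = shuffled_pred[position]
permute_rate = violation_rate(source_pred, realigned)

print(source_pred, shrink_rate, expand_rate, permute_rate)
```

A rate of zero means the toy predictor is insensitive to that mutation; a non-zero rate would count as a violation of the corresponding relation. No labels are needed at any point, which is why this style of check suits unsupervised models.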
Pages: 165155 - 165172
Page count: 18