Validating Unsupervised Machine Learning Techniques for Software Defect Prediction With Generic Metamorphic Testing

Authors
Chan, Pak Yuen Patrick [1 ]
Keung, Jacky [1 ]
Affiliation
[1] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Software; Predictive models; Machine learning; Adaptation models; Testing; Data models; Machine learning algorithms; Numerical models; Feature extraction; Mathematical models; Clustering methods; Defect detection; Clustering; machine learning; metamorphic relation; metamorphic testing; software defect prediction; validation;
DOI
10.1109/ACCESS.2024.3494044
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In software defect prediction, unsupervised models are often used when labelled datasets are scarce, although validating such models without prior knowledge of the data is challenging. To address this, we propose a novel approach that uses generic metamorphic testing (MT) to validate such models effectively, without the need for expert-derived metamorphic relations. Our method identifies five categories of generic metamorphic relations, further divided into twenty-one individual generic metamorphic relations, all formulated through generic Data Mutation Operators. This framework enables us to generate follow-up datasets from the source datasets and to train a software defect prediction model on each. By comparing the predictions of the source and follow-up software defect prediction models and identifying inconsistencies, we can assess a model's sensitivity to the generic metamorphic relations as a form of validation. The approach was evaluated across twenty software defect prediction models, incorporating fourteen different machine learning algorithms and twenty high-dimensional public datasets. The evaluation confirmed the robustness of our generic MT model, which was substantially effective in validating software defect prediction models regardless of whether they were supervised or unsupervised. Software defect prediction models using the Agglomerative clustering and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithms did not violate any metamorphic relation, and nineteen software defect prediction models did not significantly violate the generic metamorphic relation "Shrinkage and Expansion". Our findings suggest that employing generic metamorphic relations, especially "Shrinkage and Expansion", can universally enhance the validation of defect prediction models. Furthermore, our model can be applied to continuously monitor software defect prediction models.
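The workflow described in the abstract (mutate the source dataset, predict on both versions, count inconsistencies) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy median-threshold predictor, the function names, the sample metric values, and in particular the reading of a scaling-style mutation as "multiply every feature by a positive constant". The paper's actual Data Mutation Operators and models may differ.

```python
from statistics import median


def predict_defect_prone(rows):
    """Toy unsupervised predictor (hypothetical, not the paper's model).

    Each metric column is min-max normalised; a module is flagged as
    defect-prone when its normalised total exceeds the median total.
    """
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]

    def norm(v, a, b):
        return 0.0 if b == a else (v - a) / (b - a)

    totals = [sum(norm(v, a, b) for v, a, b in zip(r, lo, hi)) for r in rows]
    cut = median(totals)
    return [t > cut for t in totals]


def scale_features(rows, factor):
    """Assumed data mutation operator: multiply every metric by `factor`."""
    return [[v * factor for v in r] for r in rows]


def violation_rate(source_pred, follow_pred):
    """Fraction of modules whose prediction changed after the mutation."""
    diffs = sum(s != f for s, f in zip(source_pred, follow_pred))
    return diffs / len(source_pred)


# Made-up metric vectors (e.g. size, complexity, churn) for five modules.
source = [[10, 2, 1], [200, 15, 9], [35, 4, 2], [120, 11, 7], [60, 5, 3]]
follow_up = scale_features(source, 2.0)

source_pred = predict_defect_prone(source)
follow_pred = predict_defect_prone(follow_up)
rate = violation_rate(source_pred, follow_pred)
print(source_pred, rate)
```

Because the toy predictor normalises each column before thresholding, uniform scaling should leave its predictions unchanged, so a non-zero violation rate here would point to a fault in the pipeline rather than in the data. No labels are needed for this check, which is what makes the idea applicable to unsupervised models.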
Pages: 165155 - 165172
Page count: 18