Validating Unsupervised Machine Learning Techniques for Software Defect Prediction With Generic Metamorphic Testing

Authors
Chan, Pak Yuen Patrick [1]
Keung, Jacky [1]
Affiliations
[1] City University of Hong Kong, Department of Computer Science, Hong Kong, People's Republic of China
Source
IEEE Access | 2024 / Vol. 12
Keywords
Software; Predictive models; Machine learning; Adaptation models; Testing; Data models; Machine learning algorithms; Numerical models; Feature extraction; Mathematical models; Clustering methods; Defect detection; Clustering; machine learning; metamorphic relation; metamorphic testing; software defect prediction; validation
DOI
10.1109/ACCESS.2024.3494044
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In software defect prediction, unsupervised models are often used when labelled datasets are scarce, but they are difficult to validate because no prior knowledge of the data is available. To address this, we propose a novel approach that uses generic metamorphic testing (MT) to validate such models without requiring expert-derived metamorphic relations. Our method identifies five categories of generic metamorphic relations, further divided into twenty-one individual generic metamorphic relations, all formulated through generic Data Mutation Operators. With this framework, we generate follow-up datasets from the source datasets and train a software defect prediction model on each. By comparing the predictions of the source and follow-up models and identifying inconsistencies, we assess a model's sensitivity to the generic metamorphic relations as a form of validation. The approach was evaluated on twenty software defect prediction models, covering fourteen machine learning algorithms, and on twenty high-dimensional public datasets. The results confirm the robustness of our generic MT model, which was substantially effective in validating software defect prediction models regardless of whether they were supervised or unsupervised. Models using the Agglomerative clustering and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithms did not violate any metamorphic relation, and nineteen of the twenty models did not significantly violate the generic metamorphic relation "Shrinkage and Expansion". Our findings suggest that generic metamorphic relations, especially "Shrinkage and Expansion", can broadly improve the validation of defect prediction models. Furthermore, our model can be applied to continuously monitor software defect prediction models.
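The workflow described in the abstract (mutate a source dataset, predict on both versions, count inconsistencies) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two toy predictors, the metric values, the function names, and in particular the reading of "Shrinkage and Expansion" as multiplying every feature by a positive constant. Row permutation is included as a second hypothetical mutation operator.

```python
import random
from statistics import median

# Hypothetical module metrics: [lines of code, cyclomatic complexity, fan-out].
SOURCE = [[120, 4, 2], [900, 35, 12], [45, 1, 0], [300, 10, 5], [1500, 60, 20]]

def predict_normalised(rows):
    """Toy unsupervised predictor (assumption, not the paper's model):
    min-max normalise each metric, sum per module, flag scores above the median."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    scores = [
        sum((v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi))
        for row in rows
    ]
    cut = median(scores)
    return [s > cut for s in scores]

def predict_fixed_threshold(rows):
    """Toy predictor with an absolute cut-off on the first metric; it is
    deliberately sensitive to rescaling so that a violation can be observed."""
    return [row[0] > 500 for row in rows]

def scale_features(rows, factor):
    """Assumed 'Shrinkage and Expansion' operator: multiply every value by factor > 0."""
    return [[v * factor for v in row] for row in rows]

def permute_rows(rows, seed):
    """Assumed permutation operator: shuffle modules, return new rows and the order used."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    return [rows[i] for i in order], order

def violation_rate(source_pred, follow_up_pred):
    """Fraction of modules whose prediction changed between source and follow-up."""
    changed = sum(a != b for a, b in zip(source_pred, follow_up_pred))
    return changed / len(source_pred)

def check_scaling(predict, rows, factor=2.0):
    """Scaling should leave every module's prediction unchanged."""
    return violation_rate(predict(rows), predict(scale_features(rows, factor)))

def check_permutation(predict, rows, seed=0):
    """Shuffling modules should only reorder the predictions, not change them."""
    follow_up, order = permute_rows(rows, seed)
    expected = [predict(rows)[i] for i in order]
    return violation_rate(expected, predict(follow_up))

for name, model in [("normalised", predict_normalised),
                    ("fixed threshold", predict_fixed_threshold)]:
    print(name, "scaling:", check_scaling(model, SOURCE),
          "permutation:", check_permutation(model, SOURCE))
```

In this sketch a violation rate of zero means the predictor is insensitive to the mutation. The fixed-threshold predictor changes one prediction under scaling, which is the kind of inconsistency the comparison step is meant to surface; the paper's own operators and significance criteria may differ.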
Pages: 165155-165172
Page count: 18