A neighborhood rough sets-based ensemble method, with application to software fault prediction

Cited by: 0
Authors
Jiang, Feng [1]
Hu, Qiang [1]
Yang, Zhiyong [1]
Liu, Jinhuan [2]
Du, Junwei [2]
Affiliations
[1] Qingdao Univ Sci & Technol, Coll Informat Sci & Technol, Qingdao 266061, Peoples R China
[2] Qingdao Univ Sci & Technol, Sch Data Sci, Qingdao 266061, Peoples R China
Keywords
Ensemble learning; Software fault prediction; Neighborhood rough sets; Reduct; Neighborhood approximate reduct; Imbalanced data; SYSTEM;
DOI
10.1016/j.eswa.2024.125919
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Software fault prediction (SFP) aims to detect fault-prone software modules, which is beneficial for allocating software testing resources and improving software quality. Recently, ensemble learning (EL)-based SFP methods have attracted much attention. Although many EL algorithms have been applied to SFP, they are still insufficient to generate multiple accurate and diverse base learners. Therefore, this paper presents a multi-modal EL algorithm (called NRSEL) based on neighborhood rough sets. In NRSEL, the technique of neighborhood approximate reduct (NAR) is used to implement the perturbation of attribute space, and the bootstrap sampling technique is used to implement the perturbation of sample space. As a novel technique for the perturbation of attribute space, NAR stems from the concept of approximate reduct in rough sets. We also consider the application of NRSEL to SFP, and employ a hybrid scheme (called SMOTE-NRSEL) to handle the problem of imbalanced data in SFP. We compare SMOTE-NRSEL with existing EL algorithms using 20 public datasets. Experimental results indicate that SMOTE-NRSEL is effective for SFP. Compared with the baseline algorithms, on average, SMOTE-NRSEL improves the AUC, F1-score, and MCC by 3.09%, 3.18%, and 7.5%, respectively. Moreover, the results of three statistical tests (the paired t-test, Friedman test, and Nemenyi test) indicate that SMOTE-NRSEL is significantly better than the baseline algorithms in most cases. This paper shows that NAR is a good choice for the perturbation of attribute space. With the help of NAR and the multi-modal perturbation strategy based on it, SMOTE-NRSEL can generate accurate and diverse base learners. The code is available at https://github.com/jiangfeng0278/NRSEL.
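The two perturbations named in the abstract can be illustrated with a minimal sketch. It is not the authors' NRSEL implementation (see the repository linked above for that). Everything here is an assumption made for illustration: the function names, the greedy search used to find an approximate reduct, the neighborhood radius `delta`, the tolerance `eps`, the 1-nearest-neighbor base learner, majority voting, and the made-up toy data. The SMOTE oversampling step of SMOTE-NRSEL is omitted.

```python
import random
from collections import Counter

def dist(a, b, attrs):
    # Euclidean distance restricted to the chosen attribute indices
    return sum((a[j] - b[j]) ** 2 for j in attrs) ** 0.5

def dependency(X, y, attrs, delta):
    # Fraction of samples whose delta-neighborhood (measured on attrs)
    # contains only samples of the same class
    pure = 0
    for i, xi in enumerate(X):
        if all(y[k] == y[i] for k, xk in enumerate(X)
               if dist(xi, xk, attrs) <= delta):
            pure += 1
    return pure / len(X)

def approximate_reduct(X, y, delta, eps, rng):
    # Assumed stand-in for NAR: greedy forward selection from a random
    # seed attribute, stopping once the subset retains at least
    # (1 - eps) of the dependency of the full attribute set
    all_attrs = list(range(len(X[0])))
    target = (1 - eps) * dependency(X, y, all_attrs, delta)
    chosen = [rng.choice(all_attrs)]
    while (dependency(X, y, chosen, delta) < target
           and len(chosen) < len(all_attrs)):
        rest = [a for a in all_attrs if a not in chosen]
        best = max(rest, key=lambda a: dependency(X, y, chosen + [a], delta))
        chosen.append(best)
    return sorted(chosen)

def fit_ensemble(X, y, n_learners=5, delta=0.35, eps=0.1, seed=0):
    rng = random.Random(seed)
    members = []
    for _ in range(n_learners):
        # Sample-space perturbation: bootstrap sample with replacement
        idx = [rng.randrange(len(X)) for _ in X]
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        # Attribute-space perturbation: one approximate reduct per member
        attrs = approximate_reduct(Xb, yb, delta, eps, rng)
        members.append((Xb, yb, attrs))
    return members

def predict(members, x):
    # Each member votes with a 1-nearest-neighbor rule on its own attributes
    votes = Counter()
    for Xb, yb, attrs in members:
        nearest = min(range(len(Xb)), key=lambda i: dist(x, Xb[i], attrs))
        votes[yb[nearest]] += 1
    return votes.most_common(1)[0][0]

# Made-up toy data: attribute 0 separates the classes, attribute 1 is noise
X_toy = [[0.0, 0.3], [0.1, 0.1], [0.05, 0.2], [0.15, 0.0],
         [1.0, 0.25], [0.9, 0.05], [0.95, 0.15], [1.1, 0.3]]
y_toy = [0, 0, 0, 0, 1, 1, 1, 1]
ensemble = fit_ensemble(X_toy, y_toy)
```

Calling `predict(ensemble, [0.05, 0.1])` returns the majority vote of the members. Because each member sees a different bootstrap sample and may start its reduct search from a different attribute, the members can end up with different attribute subsets, which is the source of diversity the abstract describes.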
Pages: 17