A training sample selection method for predicting software defects

Cited by: 0

Author:

Cong Jin

Affiliation:

[1] Central China Normal University, School of Computer

Source:

Applied Intelligence | 2023 / Vol. 53

Keywords:

Software defect prediction; Sample contribution; Sample selection; Predictive performance

DOI:

Not available

Abstract:

Software defect prediction (SDP) is an important method for analyzing software quality and reducing development cost. Data from the software life cycle have been widely used to predict the defect-proneness of software modules, and although many machine learning-based SDP models have been proposed, their predictive performance is not always satisfactory. Traditional machine learning-based classifiers usually assume that all samples contribute equally to the training of an SDP model, which is not true. In fact, different training samples affect the performance of the SDP model differently, and the performance of machine learning-based SDP models depends heavily on the quality of the training samples. To address this shortcoming of traditional machine learning-based classifiers, this paper makes the following contributions: (1) Inspired by clustering algorithms, a method for calculating the contribution of each training sample to the SDP model is proposed. It not only considers the relationship between the contributions of the training samples to the SDP model but also analyzes the influence of the distance between a sample and the category boundary on the performance of the SDP model, which distinguishes it from existing methods for calculating sample contribution. (2) A sample selection (SS) method is proposed to improve the performance of the SDP model. It first calculates the contribution of each training sample from several nearest neighbors of the sample and the label information of these neighbors, and then performs SS according to the Hoeffding probability inequality and the contribution of each sample. Experimental results are given to confirm the validity of the proposed SDP model. Both direct observation and statistical tests of the experimental results show that the SS method is very effective in improving the predictive performance of the SDP model.
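The abstract names two ingredients without giving formulas: a per-sample contribution computed from nearest neighbors and their labels, and a selection step based on the Hoeffding inequality. The sketch below is only an illustration of how such a pipeline might be wired together, not the paper's method. The score (fraction of the k nearest neighbors sharing the sample's label), the threshold rule (mean score minus the Hoeffding radius), and all function names are assumptions introduced here. The Hoeffding bound is also applied as if the scores were independent, which is a simplification.

```python
import math
import numpy as np

def knn_label_agreement(X, y, k=3):
    """Hypothetical contribution score (NOT the paper's formula): the fraction
    of each sample's k nearest neighbours (Euclidean) that share its label."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    dist = (diff ** 2).sum(axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    nn = np.argsort(dist, axis=1)[:, :k]  # indices of the k nearest neighbours
    return (y[nn] == y[:, None]).mean(axis=1)

def hoeffding_radius(n, delta=0.05):
    """Two-sided Hoeffding radius for the mean of n independent values in [0, 1]:
    with probability at least 1 - delta, |sample mean - true mean| <= radius."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))

def select_samples(X, y, k=3, delta=0.05):
    """Assumed selection rule: keep samples whose score is not below the mean
    score minus the Hoeffding radius. Returns (kept indices, all scores)."""
    scores = knn_label_agreement(X, y, k)
    threshold = scores.mean() - hoeffding_radius(len(scores), delta)
    return np.flatnonzero(scores >= threshold), scores
```

Under these assumptions, a sample sitting inside the opposite class's cluster gets a score near zero and is filtered out, while samples surrounded mostly by same-label neighbors are kept. The parameters `k` and `delta` are illustrative defaults, not values taken from the paper.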

Citation:

Pages: 12015-12031

Page count: 16

[1] A training sample selection method for predicting software defects
Jin, Cong
[J]. APPLIED INTELLIGENCE, 2023, 53 (10) : 12015 - 12031