The limited impact of individual developer data on software defect prediction

Cited by: 0

Authors:

Robert M. Bell

Thomas J. Ostrand

Elaine J. Weyuker

Affiliations:

[1] AT&T Labs - Research

Source:

Empirical Software Engineering | 2013 / Vol. 18

Keywords:

Software faults; Buggy file ratio; Fault-prone; Prediction; Fault-percentile average; Regression model; Empirical study; D.2.5;

DOI:

Not available

Abstract:

Previous research has provided evidence that a combination of static code metrics and software history metrics can be used to predict with surprising success which files in the next release of a large system will have the largest numbers of defects. In contrast, very little research exists to indicate whether information about individual developers can profitably be used to improve predictions. We investigate whether files in a large system that are modified by an individual developer consistently contain either more or fewer faults than the average of all files in the system. The goal of the investigation is to determine whether information about which particular developer modified a file is able to improve defect predictions. We also extend earlier research evaluating use of counts of the number of developers who modified a file as predictors of the file’s future faultiness. We analyze change reports filed for three large systems, each containing 18 releases, with a combined total of nearly 4 million LOC and over 11,000 files. A buggy file ratio is defined for programmers, measuring the proportion of faulty files in Release R out of all files modified by the programmer in Release R-1. We assess the consistency of the buggy file ratio across releases for individual programmers both visually and within the context of a fault prediction model. Buggy file ratios for individual programmers often varied widely across all the releases that they participated in. A prediction model that takes account of the history of faulty files that were changed by individual developers shows improvement over the standard negative binomial model of less than 0.13% according to one measure, and no improvement at all according to another measure. In contrast, augmenting a standard model with counts of cumulative developers changing files in prior releases produced up to a 2% improvement in the percentage of faults detected in the top 20% of predicted faulty files. 
The cumulative number of developers interacting with a file can be a useful variable for defect prediction. However, the study indicates that adding information to a model about which particular developer modified a file is not likely to improve defect predictions.
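
The two quantities the abstract relies on can be illustrated with a minimal sketch. It assumes one reading of the abstract's definitions: the buggy file ratio is the share of files a developer modified in Release R-1 that are faulty in Release R, and the top-20% measure is the share of all actual faults that fall in the 20% of files ranked highest by the model. The function names, data shapes, and sample values are hypothetical and are not taken from the paper.

```python
def buggy_file_ratio(modified_prev, faulty_next):
    """Per-developer buggy file ratio (illustrative reading of the abstract).

    modified_prev: dict mapping developer -> set of files modified in Release R-1
    faulty_next:   set of files that have at least one fault in Release R
    Returns a dict mapping developer -> ratio, or None when the developer
    modified no files (the ratio is undefined in that case).
    """
    ratios = {}
    for developer, files in modified_prev.items():
        if not files:
            ratios[developer] = None
            continue
        ratios[developer] = len(files & faulty_next) / len(files)
    return ratios


def faults_in_top_fraction(predicted, actual, fraction=0.20):
    """Share of actual faults found in the top `fraction` of predicted files.

    predicted: dict mapping file -> predicted fault count (any ranking score)
    actual:    dict mapping file -> actual fault count
    """
    total_faults = sum(actual.values())
    if total_faults == 0:
        return 0.0
    # Rank files from most to least fault-prone according to the model.
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    # Always inspect at least one file, even for very small inputs.
    top_k = max(1, int(len(ranked) * fraction))
    return sum(actual.get(f, 0) for f in ranked[:top_k]) / total_faults


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    modified = {"dev_a": {"a.c", "b.c", "c.c", "d.c"}, "dev_b": {"e.c"}}
    faulty = {"a.c", "e.c"}
    print(buggy_file_ratio(modified, faulty))

    predicted = {"f1": 5.0, "f2": 1.0, "f3": 0.5, "f4": 0.2, "f5": 0.1}
    actual = {"f1": 6, "f2": 2, "f3": 0, "f4": 2, "f5": 0}
    print(faults_in_top_fraction(predicted, actual))
```

Computing `buggy_file_ratio` once per release pair and comparing the values for the same developer across releases would show the kind of release-to-release variation the abstract describes.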

Pages: 478-505

Page count: 27

50 entries in total

[1] The limited impact of individual developer data on software defect prediction
Bell, Robert M.
Ostrand, Thomas J.
Weyuker, Elaine J.
[J]. EMPIRICAL SOFTWARE ENGINEERING, 2013, 18 (03) : 478 - 505
[2] Periodic Developer Metrics in Software Defect Prediction
Kini, Seldag Ozcan
Tosun, Ayse
[J]. 2018 IEEE 18TH INTERNATIONAL WORKING CONFERENCE ON SOURCE CODE ANALYSIS AND MANIPULATION (SCAM), 2018, : 72 - 81
[3] Developer Micro Interaction Metrics for Software Defect Prediction
Lee, Taek
Nam, Jaechang
Han, Donggyun
Kim, Sunghun
In, Hoh Peter
[J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2016, 42 (11) : 1015 - 1035
[4] Impact of Data Sampling on Feature Selection Techniques for Software Defect Prediction
Gao, Kehan
Khoshgoftaar, Taghi M.
Napolitano, Amri
[J]. PROCEEDINGS 18TH ISSAT INTERNATIONAL CONFERENCE ON RELIABILITY & QUALITY IN DESIGN, 2012, : 91 - +
[5] Software Defect Prediction with Skewed Data
Seliya, Naeem
Khoshgoftaar, Taghi M.
[J]. 16TH ISSAT INTERNATIONAL CONFERENCE ON RELIABILITY AND QUALITY IN DESIGN, 2010, : 403 - +
[6] Impact of the Distribution Parameter of Data Sampling Approaches on Software Defect Prediction Models
Bennin, Kwabena Ebo
Keung, Jacky
Monden, Akito
[J]. 2017 24TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC 2017), 2017, : 630 - 635
[7] Impact of Types of Change on Software Defect Prediction
Erdem, Atakan
[J]. INTELLIGENT COMPUTING, VOL 2, 2021, 284 : 273 - 283
[8] Evaluating the Impact of Data Transformation Techniques on the Performance and Interpretability of Software Defect Prediction Models
Zhao, Yu
Huang, Zhiqiu
Gong, Lina
Zhu, Yi
Yu, Qiao
Gao, Yuxiang
[J]. IET SOFTWARE, 2023, 2023
[9] Early Software Defect Prediction: Right-Shifting Software Effort Data into a Defect Curve
Okumoto, Kazuhira
[J]. 2022 IEEE INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING WORKSHOPS (ISSREW 2022), 2022, : 43 - 48
[10] Software Defect Prediction Based on Stability Test Data
Okumoto, Kazu
[J]. 2011 INTERNATIONAL CONFERENCE ON QUALITY, RELIABILITY, RISK, MAINTENANCE, AND SAFETY ENGINEERING (ICQR2MSE), 2011, : 385 - 387