The limited impact of individual developer data on software defect prediction

Authors
Robert M. Bell
Thomas J. Ostrand
Elaine J. Weyuker
Affiliations
[1] AT&T Labs - Research
Source
Empirical Software Engineering, 2013, 18 (03)
Keywords
Software faults; Buggy file ratio; Fault-prone; Prediction; Fault-percentile average; Regression model; Empirical study; D.2.5
Abstract
Previous research has provided evidence that a combination of static code metrics and software history metrics can be used to predict with surprising success which files in the next release of a large system will have the largest numbers of defects. In contrast, very little research exists to indicate whether information about individual developers can profitably be used to improve predictions. We investigate whether files in a large system that are modified by an individual developer consistently contain either more or fewer faults than the average of all files in the system. The goal of the investigation is to determine whether information about which particular developer modified a file is able to improve defect predictions. We also extend earlier research evaluating use of counts of the number of developers who modified a file as predictors of the file’s future faultiness. We analyze change reports filed for three large systems, each containing 18 releases, with a combined total of nearly 4 million LOC and over 11,000 files. A buggy file ratio is defined for programmers, measuring the proportion of faulty files in Release R out of all files modified by the programmer in Release R-1. We assess the consistency of the buggy file ratio across releases for individual programmers both visually and within the context of a fault prediction model. Buggy file ratios for individual programmers often varied widely across all the releases that they participated in. A prediction model that takes account of the history of faulty files that were changed by individual developers shows improvement over the standard negative binomial model of less than 0.13% according to one measure, and no improvement at all according to another measure. In contrast, augmenting a standard model with counts of cumulative developers changing files in prior releases produced up to a 2% improvement in the percentage of faults detected in the top 20% of predicted faulty files. 
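The buggy file ratio defined above can be illustrated with a minimal sketch. It assumes that a file counts as faulty in Release R when it has at least one recorded fault in that release; the function name, the data layout, and the file names are hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional, Set


def buggy_file_ratio(
    modified_prev: Set[str], faults_next: Dict[str, int]
) -> Optional[float]:
    """Share of the files a programmer modified in Release R-1
    that are faulty in Release R.

    modified_prev: files the programmer changed in Release R-1.
    faults_next:   fault counts per file in Release R (missing = 0).
    Returns None when the programmer modified no files in Release R-1.
    """
    if not modified_prev:
        return None
    # Assumption: "faulty" means at least one fault recorded in Release R.
    faulty = sum(1 for f in modified_prev if faults_next.get(f, 0) > 0)
    return faulty / len(modified_prev)


# Hypothetical data for one programmer and one pair of releases.
modified_in_r_minus_1 = {"a.c", "b.c", "c.c", "d.c"}
faults_in_r = {"a.c": 2, "c.c": 1, "e.c": 5}

ratio = buggy_file_ratio(modified_in_r_minus_1, faults_in_r)
print(ratio)  # 0.5: two of the four modified files are faulty in Release R
```

Computing this ratio for the same programmer over successive release pairs gives the series whose consistency the study examines.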
The cumulative number of developers interacting with a file can be a useful variable for defect prediction. However, the study indicates that adding information to a model about which particular developer modified a file is not likely to improve defect predictions.
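One of the measures mentioned in the abstract, the percentage of faults contained in the top 20% of predicted faulty files, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name, the rounding of the cutoff, the tie-breaking by input order, and the sample numbers are all hypothetical.

```python
from typing import Sequence


def faults_in_top_fraction(
    predicted: Sequence[float], actual: Sequence[int], fraction: float = 0.2
) -> float:
    """Percentage of all actual faults found in the files ranked
    highest by predicted fault count.

    predicted: predicted fault count per file.
    actual:    observed fault count per file, in the same order.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    total = sum(actual)
    if total == 0:
        return 0.0
    # Assumption: the cutoff is the file count times the fraction, rounded.
    k = round(len(predicted) * fraction)
    # Rank files from highest to lowest prediction (stable for ties).
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    captured = sum(actual[i] for i in order[:k])
    return 100.0 * captured / total


# Hypothetical predictions and observed faults for ten files.
predicted = [5.0, 0.1, 3.2, 0.4, 0.2, 0.3, 1.1, 0.0, 0.6, 0.5]
actual = [4, 0, 1, 0, 0, 1, 2, 0, 0, 2]

print(faults_in_top_fraction(predicted, actual))  # 50.0
```

Comparing this percentage for a baseline model and for a model augmented with developer information is one way to express an improvement such as the "up to 2%" figure reported above.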
Pages: 478 - 505
Page count: 27
Related papers
50 in total
  • [1] The limited impact of individual developer data on software defect prediction
    Bell, Robert M.
    Ostrand, Thomas J.
    Weyuker, Elaine J.
    [J]. EMPIRICAL SOFTWARE ENGINEERING, 2013, 18 (03) : 478 - 505
  • [2] Periodic Developer Metrics in Software Defect Prediction
    Kini, Seldag Ozcan
    Tosun, Ayse
    [J]. 2018 IEEE 18TH INTERNATIONAL WORKING CONFERENCE ON SOURCE CODE ANALYSIS AND MANIPULATION (SCAM), 2018, : 72 - 81
  • [3] Developer Micro Interaction Metrics for Software Defect Prediction
    Lee, Taek
    Nam, Jaechang
    Han, Donggyun
    Kim, Sunghun
    In, Hoh Peter
    [J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2016, 42 (11) : 1015 - 1035
  • [4] Impact of Data Sampling on Feature Selection Techniques for Software Defect Prediction
    Gao, Kehan
    Khoshgoftaar, Taghi M.
    Napolitano, Amri
    [J]. PROCEEDINGS 18TH ISSAT INTERNATIONAL CONFERENCE ON RELIABILITY & QUALITY IN DESIGN, 2012, : 91 - +
  • [5] Software Defect Prediction with Skewed Data
    Seliya, Naeem
    Khoshgoftaar, Taghi M.
    [J]. 16TH ISSAT INTERNATIONAL CONFERENCE ON RELIABILITY AND QUALITY IN DESIGN, 2010, : 403 - +
  • [6] Impact of the Distribution Parameter of Data Sampling Approaches on Software Defect Prediction Models
    Bennin, Kwabena Ebo
    Keung, Jacky
    Monden, Akito
    [J]. 2017 24TH ASIA-PACIFIC SOFTWARE ENGINEERING CONFERENCE (APSEC 2017), 2017, : 630 - 635
  • [7] Impact of Types of Change on Software Defect Prediction
    Erdem, Atakan
    [J]. INTELLIGENT COMPUTING, VOL 2, 2021, 284 : 273 - 283
  • [8] Evaluating the Impact of Data Transformation Techniques on the Performance and Interpretability of Software Defect Prediction Models
    Zhao, Yu
    Huang, Zhiqiu
    Gong, Lina
    Zhu, Yi
    Yu, Qiao
    Gao, Yuxiang
    [J]. IET SOFTWARE, 2023, 2023
  • [9] Early Software Defect Prediction: Right-Shifting Software Effort Data into a Defect Curve
    Okumoto, Kazuhira
    [J]. 2022 IEEE INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING WORKSHOPS (ISSREW 2022), 2022, : 43 - 48
  • [10] Software Defect Prediction Based on Stability Test Data
    Okumoto, Kazu
    [J]. 2011 INTERNATIONAL CONFERENCE ON QUALITY, RELIABILITY, RISK, MAINTENANCE, AND SAFETY ENGINEERING (ICQR2MSE), 2011, : 385 - 387