The Use of Summation to Aggregate Software Metrics Hinders the Performance of Defect Prediction Models

Cited by: 54
Authors
Zhang, Feng [1 ]
Hassan, Ahmed E. [1 ]
McIntosh, Shane [2 ]
Zou, Ying [3 ]
Affiliations
[1] Queens Univ, Sch Comp, Kingston, ON K7L 3N6, Canada
[2] McGill Univ, Dept Elect & Comp Engn, Montreal, PQ H3A 0G4, Canada
[3] Queens Univ, Dept Elect & Comp Engn, Kingston, ON K7L 3N6, Canada
Keywords
Defect prediction; aggregation scheme; software metrics; OBJECT-ORIENTED METRICS; EMPIRICAL-ANALYSIS; FAULTS; CODE; COMPLEXITY;
DOI
10.1109/TSE.2016.2599161
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Defect prediction models help software organizations to anticipate where defects will appear in the future. When training a defect prediction model, historical defect data is often mined from a Version Control System (VCS, e.g., Subversion), which records software changes at the file level. Software metrics, on the other hand, are often calculated at the class or method level (e.g., McCabe's Cyclomatic Complexity). To address the disagreement in granularity, the class- and method-level software metrics are aggregated to the file level, often using summation (i.e., the McCabe value of a file is the sum of the McCabe values of all methods within the file). A recent study shows that summation significantly inflates the correlation between source lines of code (SLOC) and cyclomatic complexity (CC) in Java projects. While there are many other aggregation schemes (e.g., central tendency, dispersion), they have remained unexplored in the scope of defect prediction. In this study, we set out to investigate how different aggregation schemes impact defect prediction models.
Through an analysis of 11 aggregation schemes using data collected from 255 open source projects, we find that: (1) aggregation schemes can significantly alter correlations among metrics, as well as the correlations between metrics and the defect count; (2) when constructing models to predict defect proneness, applying only the summation scheme (i.e., the most commonly used aggregation scheme in the literature) achieves the best performance (the best among the 12 studied configurations) in only 11 percent of the studied projects, while applying all of the studied aggregation schemes achieves the best performance in 40 percent of the studied projects; (3) when constructing models to predict defect rank or count, applying only the summation scheme and applying all of the studied aggregation schemes achieve similar performance, with both coming closest to the best performance more often than the other studied aggregation schemes; and (4) when constructing models for effort-aware defect prediction, the mean or median aggregation schemes yield performance values that are significantly closer to the best performance than any of the other studied aggregation schemes. Broadly speaking, the performance of defect prediction models is often underestimated due to our community's tendency to use only the summation aggregation scheme. Given the potential benefit of applying additional aggregation schemes, we advise that future defect prediction models should explore a variety of aggregation schemes.
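To make the granularity problem concrete, the following is a minimal Python sketch of lifting method-level metric values to the file level under several aggregation schemes. It is an illustration only, not the authors' tooling: the file names, the complexity values, the `aggregate` helper, and the choice of four schemes (sum, mean, median, population standard deviation) are all assumptions, and the abstract does not enumerate the 11 schemes actually studied.

```python
from statistics import mean, median, pstdev

# Hypothetical method-level cyclomatic complexity values, grouped by file.
# These numbers are made up for illustration and are not from the study.
method_cc = {
    "Parser.java": [1, 2, 3, 10],
    "Util.java": [4, 4, 4, 4],
}

# An illustrative subset of aggregation schemes: summation, two measures of
# central tendency, and one measure of dispersion.
SCHEMES = {
    "sum": sum,
    "mean": mean,
    "median": median,
    "stdev": pstdev,
}


def aggregate(values_by_file, schemes=SCHEMES):
    """Return {file: {scheme_name: aggregated_value}} for every file."""
    return {
        file: {name: fn(values) for name, fn in schemes.items()}
        for file, values in values_by_file.items()
    }


file_metrics = aggregate(method_cc)
for file, metrics in file_metrics.items():
    print(file, metrics)
```

In this toy input both files have the same sum and the same mean, yet their medians and dispersions differ, which is the kind of information a summation-only model never sees. Using all schemes as separate file-level features corresponds loosely to the "all schemes" configuration mentioned in the abstract.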
Pages: 476-491
Page count: 16