An In-Depth Study of the Potentially Confounding Effect of Class Size in Fault Prediction

Cited by: 58

Authors:

Zhou, Yuming [1]

Xu, Baowen [1]

Leung, Hareton [2]

Chen, Lin [1]

Affiliations:

[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing, Jiangsu, Peoples R China

[2] Hong Kong Polytech Univ, Hong Kong, Hong Kong, Peoples R China

Source:

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY | 2014 / Vol. 23 / No. 01

Funding:

National Natural Science Foundation of China;

Keywords:

Measurement; Theory; Metrics; class size; confounding effect; fault; prediction; ORIENTED DESIGN METRICS; DEFECT-PRONE CLASSES; EMPIRICAL VALIDATION; SOFTWARE QUALITY; QUANTITATIVE-ANALYSIS; COHESION METRICS; MEDIATION; COMPLEXITY; MODELS; CODE;

DOI:

10.1145/2556777

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Background. The extent of the potentially confounding effect of class size in the fault prediction context is not clear, nor is the method to remove the potentially confounding effect, or the influence of this removal on the performance of fault-proneness prediction models. Objective. We aim to provide an in-depth understanding of the effect of class size on the true associations between object-oriented metrics and fault-proneness. Method. We first employ statistical methods to examine the extent of the potentially confounding effect of class size in the fault prediction context. After that, we propose a linear regression-based method to remove the potentially confounding effect. Finally, we empirically investigate whether this removal could improve the prediction performance of fault-proneness prediction models. Results. Based on open-source software systems, we found: (a) the confounding effect of class size on the associations between object-oriented metrics and fault-proneness in general exists; (b) the proposed linear regression-based method can effectively remove the confounding effect; and (c) after removing the confounding effect, the prediction performance of fault prediction models with respect to both ranking and classification can in general be significantly improved. Conclusion. We should remove the confounding effect of class size when building fault prediction models.
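The abstract names a linear regression-based method for removing the size effect but does not spell out the procedure. One common reading, assumed here and not confirmed by this record, is residualization: regress each metric on class size and keep the residual as the size-adjusted metric. The function name, variable names, and toy data below are hypothetical and serve only as an illustration.

```python
import numpy as np

def remove_size_effect(metric, size):
    """Return the part of `metric` not linearly explained by `size`.

    Assumption: the adjustment is an ordinary least squares fit of
    metric = intercept + slope * size, with the residual kept as the
    size-adjusted metric. The paper's exact procedure may differ.
    """
    metric = np.asarray(metric, dtype=float)
    size = np.asarray(size, dtype=float)
    # np.polyfit returns coefficients from highest degree to lowest.
    slope, intercept = np.polyfit(size, metric, deg=1)
    return metric - (intercept + slope * size)

# Hypothetical toy data: lines of code and a coupling metric per class.
size = np.array([50, 120, 300, 80, 500, 220])
coupling = np.array([3, 6, 14, 5, 21, 9])

adjusted = remove_size_effect(coupling, size)
# By construction, the residual is linearly uncorrelated with size.
corr_with_size = np.corrcoef(adjusted, size)[0, 1]
```

Under this assumption, the adjusted metric would replace the raw metric as a predictor in the fault-proneness model, so that any remaining association with faults cannot be attributed to a linear dependence on class size.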


Pages: 51

50 items in total

[1] Does class size matter? An in-depth assessment of the effect of class size in software defect prediction
Amjed Tahir
Kwabena E. Bennin
Xun Xiao
Stephen G. MacDonell
[J]. Empirical Software Engineering, 2021, 26
[2] Does class size matter? An in-depth assessment of the effect of class size in software defect prediction
Tahir, Amjed
Bennin, Kwabena E.
Xiao, Xun
MacDonell, Stephen G.
[J]. EMPIRICAL SOFTWARE ENGINEERING, 2021, 26 (05)
[3] Examining the Potentially Confounding Effect of Class Size on the Associations between Object-Oriented Metrics and Change-Proneness
Zhou, Yuming
Leung, Hareton
Xu, Baowen
[J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2009, 35 (05) : 607 - 623
[4] EFFECT OF CLASS SIZE ON IDENTIFICATION OF POTENTIALLY DISTURBED CHILDREN
SALVIA, JA
SCHULTZ, EW
CHAPIN, NS
[J]. EXCEPTIONAL CHILDREN, 1974, 40 (07) : 517 - 519
[5] The confounding effect of class size on the validity of object-oriented metrics
Emam, KE
Benlarbi, S
Goel, N
Rai, SN
[J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2001, 27 (07) : 630 - 650
[6] The confounding effect of class size on the validity of object-oriented metrics
Evanco, WM
[J]. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2003, 29 (07) : 670 - 672
[7] An In-Depth Study of the Efficiency of Risk Evaluation Formulas for Multi-Fault Localization
Ju, Xiaolin
Chen, Xiang
Yang, Yibiao
Jiang, Shujuan
Qian, Junyan
Xu, Baowen
[J]. 2017 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY COMPANION (QRS-C), 2017, : 304 - 310
[8] Exploring Error Bits for Memory Failure Prediction: An In-Depth Correlative Study
Yu, Qiao
Zhang, Wengui
Cardoso, Jorge
Kao, Odej
[J]. 2023 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN, ICCAD, 2023,
[9] Revisiting the size effect in software fault prediction models
Tahir, Amjed
Bennin, Kwabena E.
MacDonell, Stephen G.
Marsland, Stephen
[J]. PROCEEDINGS OF THE 12TH ACM/IEEE INTERNATIONAL SYMPOSIUM ON EMPIRICAL SOFTWARE ENGINEERING AND MEASUREMENT (ESEM 2018), 2018,
[10] Location prediction in large-scale social networks: an in-depth benchmarking study
Nur Al Hasan Haldar
Jianxin Li
Mark Reynolds
Timos Sellis
Jeffrey Xu Yu
[J]. The VLDB Journal, 2019, 28 : 623 - 648
