Revisiting Code Smell Severity Prioritization using learning to rank techniques

Cited by: 3

Authors:

Liu, Lei [1]

Lin, Guancheng [2]

Zhu, Lin [3]

Yang, Zhen [4]

Song, Peilin [1]

Wang, Xin [5]

Hu, Wenhua [6]

Affiliations:

[1] Xi An Jiao Tong Univ, Fac Elect & Informat Engn, Minist Educ, Key Lab Phys Elect & Devices, Xian 710049, Shaanxi, Peoples R China

[2] Wuhan Univ, Sch Cyber Sci & Engn, Wuhan 430072, Hubei, Peoples R China

[3] Wuhan Qingchuan Univ, Sch Comp, Wuhan 430204, Hubei, Peoples R China

[4] Shandong Univ, Sch Comp Sci & Technol, Qingdao 266237, Shandong, Peoples R China

[5] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Hubei, Peoples R China

[6] Wuhan Univ Technol, Sch Comp Sci & Artificial Intelligence, Wuhan 430070, Hubei, Peoples R China

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2024 / Vol. 249

Funding:

China Postdoctoral Science Foundation; National Natural Science Foundation of China;

Keywords:

Code smell; Severity prioritization; Learning to rank; Empirical study; COMPANY DEFECT PREDICTION;

DOI:

10.1016/j.eswa.2024.123483

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Code Smell Severity Prioritization (CSSP) is crucial in helping software developers minimize software maintenance costs and enhance software quality, particularly when faced with limited refactoring resources. Traditional code smell prioritization methods rely heavily on manual and semi-automatic approaches based on developer experience, often demanding considerable time and effort from experienced experts. Leveraging automated machine learning techniques can effectively overcome these limitations. However, most existing machine learning-based CSSP works have only considered limited pointwise Learning To Rank (LTR) algorithms and have used inappropriate metrics (e.g., Accuracy, Spearman, and MAE) to assess the performance of models. To address these limitations, we make a comprehensive comparison of 41 pointwise, 4 pairwise, and 4 listwise LTR algorithms for CSSP on four code smell severity datasets. Furthermore, we propose the adoption of Severity@20% and Cumulative Lift Chart (CLC) as the primary evaluation metrics to assess CSSP models more effectively. The results show that: (1) The ordinal Bagging (O-Bagging) algorithm demonstrates the highest performance for CSSP, achieving superior results in Severity@20% and CLC. (2) The ordinal classification method can help the top-performing base classification algorithms Bagging and XGBoost achieve better performance for CSSP tasks. (3) A higher (lower) Accuracy, higher (lower) Spearman, and lower (higher) MAE do not reliably indicate better (worse) performance for CSSP. This further underscores that Accuracy, Spearman, and MAE are unsuitable metrics for evaluating CSSP models' effectiveness. To summarize, our study suggests that developers employ the O-Bagging algorithm for CSSP, with Severity@20% and CLC serving as the primary evaluation metrics.
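The abstract names Severity@20% as a primary metric but does not define it. The following is a minimal sketch under one plausible reading, which is an assumption and not necessarily the paper's formula: rank instances by predicted score, inspect the top 20%, and report the share of total true severity captured there. The function name `severity_at_fraction`, its parameters, and the sample data are all hypothetical.

```python
from typing import Sequence


def severity_at_fraction(predicted_scores: Sequence[float],
                         true_severity: Sequence[float],
                         fraction: float = 0.2) -> float:
    """Share of total true severity found in the top `fraction` of the ranking.

    Assumed definition for illustration only; the paper may define
    Severity@20% differently.
    """
    if len(predicted_scores) != len(true_severity):
        raise ValueError("inputs must have the same length")
    total = sum(true_severity)
    if not predicted_scores or total == 0:
        return 0.0
    # Rank instances by predicted score, highest first.
    order = sorted(range(len(predicted_scores)),
                   key=lambda i: predicted_scores[i], reverse=True)
    # Inspect at least one instance, even for very small lists.
    k = max(1, int(len(order) * fraction))
    return sum(true_severity[i] for i in order[:k]) / total


# Hypothetical data: ten code instances with severity levels 0 to 3.
scores = [0.9, 0.1, 0.8, 0.3, 0.2, 0.7, 0.4, 0.5, 0.6, 0.05]
severity = [3, 0, 2, 1, 0, 3, 0, 1, 0, 0]
result = severity_at_fraction(scores, severity)
```

Under this reading, the metric rewards a model for placing the most severe instances at the top of the list, which is the property the abstract argues Accuracy, Spearman, and MAE do not reliably capture.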

Pages: 19
