Pre-trained Model Based Feature Envy Detection

Cited: 1
Authors
Ma, Wenhao [1 ]
Yu, Yaoxiang [1 ]
Ruan, Xiaoming [1 ]
Cai, Bo [1 ]
Affiliations
[1] Wuhan Univ, Sch Cyber Sci & Engn, Key Lab Aerosp Informat Secur & Trusted Comp, Minist Educ, Wuhan, Peoples R China
Keywords
Feature Envy; Deep Learning; Software Refactoring; Pre-trained Model; Code Smell
DOI
10.1109/MSR59073.2023.00065
Chinese Library Classification
TP31 [Computer Software]
Subject Classification Codes
081202; 0835
Abstract
Code smells slow down the development of software systems and make them harder to maintain. Existing research aims to develop automatic detection algorithms that reduce the labor and time costs of the detection process. Deep learning techniques have recently been shown to recognize code smells better than metric-based heuristic detection algorithms. As large-scale pre-trained models for Programming Languages (PL), such as CodeT5, have lately achieved top results on a variety of downstream tasks, some researchers have begun to explore the use of pre-trained models to extract the contextual semantics of code for code smell detection. However, little research has employed the contextual semantic relationships between code snippets, as captured by pre-trained models, to identify code smells. In this paper, we investigate the use of the pre-trained model CodeT5 to extract semantic relationships between code snippets to detect feature envy, one of the most common code smells. In addition, to examine how semantic relationships extracted by pre-trained models of different architectures perform on feature envy detection, we compare CodeT5 with two other pre-trained models, CodeBERT and CodeGPT. In an experimental evaluation on ten open-source projects, our approach improves the F-measure by 29.32% on feature envy detection and by 16.57% on moving destination recommendation. Using the semantic relationships extracted by several pre-trained models to detect feature envy outperforms the state of the art, which suggests that this kind of semantic relationship is a promising signal for feature envy detection. To enable future research on feature envy detection, we have made all the code and datasets used in this article open source.
Pages: 430-440
Page count: 11
相关论文
共 50 条
  • [21] BSTC: A Fake Review Detection Model Based on a Pre-Trained Language Model and Convolutional Neural Network
    Lu, Junwen
    Zhan, Xintao
    Liu, Guanfeng
    Zhan, Xinrong
    Deng, Xiaolong
    [J]. ELECTRONICS, 2023, 12 (10)
  • [22] Comparing Pre-Trained Language Model for Arabic Hate Speech Detection
    Daouadi, Kheir Eddine
    Boualleg, Yaakoub
    Guehairia, Oussama
    [J]. COMPUTACION Y SISTEMAS, 2024, 28 (02): : 681 - 693
  • [23] Interpretability of Entity Matching Based on Pre-trained Language Model
    Liang, Zheng
    Wang, Hong-Zhi
    Dai, Jia-Jia
    Shao, Xin-Yue
    Ding, Xiao-Ou
    Mu, Tian-Yu
    [J]. Ruan Jian Xue Bao/Journal of Software, 2023, 34 (03): : 1087 - 1108
  • [24] Android Malware Detection Through a Pre-trained Model for Code Understanding
    Garcia-Soto, Eva
    Martin, Alejandro
    Huertas-Tato, Javier
    Camacho, David
    [J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING & AMBIENT INTELLIGENCE (UCAMI 2022), 2023, 594 : 1055 - 1060
  • [25] Glomerulosclerosis detection with pre-trained CNNs ensemble
    Santos, Justino
    Silva, Romuere
    Oliveira, Luciano
    Santos, Washington
    Aldeman, Nayze
    Duarte, Angelo
    Veras, Rodrigo
    [J]. COMPUTATIONAL STATISTICS, 2024, 39 (02) : 561 - 581
  • [26] Glomerulosclerosis detection with pre-trained CNNs ensemble
    Justino Santos
    Romuere Silva
    Luciano Oliveira
    Washington Santos
    Nayze Aldeman
    Angelo Duarte
    Rodrigo Veras
    [J]. Computational Statistics, 2024, 39 : 561 - 581
  • [27] Smart Edge-based Fake News Detection using Pre-trained BERT Model
    Guo, Yuhang
    Lamaazi, Hanane
    Mizouni, Rabeb
    [J]. 2022 18TH INTERNATIONAL CONFERENCE ON WIRELESS AND MOBILE COMPUTING, NETWORKING AND COMMUNICATIONS (WIMOB), 2022,
  • [28] BERT-Log: Anomaly Detection for System Logs Based on Pre-trained Language Model
    Chen, Song
    Liao, Hai
    [J]. APPLIED ARTIFICIAL INTELLIGENCE, 2022, 36 (01)
  • [29] Zero-Shot Out-of-Distribution Detection Based on the Pre-trained Model CLIP
    Esmaeilpour, Sepideh
    Liu, Bing
    Robertson, Eric
    Shu, Lei
    [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 6568 - 6576
  • [30] An integrated model based on deep learning classifiers and pre-trained transformer for phishing URL detection
    Do, Nguyet Quang
    Selamat, Ali
    Fujita, Hamido
    Krejcar, Ondrej
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2024, 161 : 269 - 285