Pre-trained Model Based Feature Envy Detection

Cited: 1
Authors
Ma, Wenhao [1 ]
Yu, Yaoxiang [1 ]
Ruan, Xiaoming [1 ]
Cai, Bo [1 ]
Affiliations
[1] Wuhan Univ, Sch Cyber Sci & Engn, Key Lab Aerosp Informat Secur & Trusted Comp, Minist Educ, Wuhan, Peoples R China
Keywords
Feature Envy; Deep Learning; Software Refactoring; Pre-trained Model; Code Smell
DOI
10.1109/MSR59073.2023.00065
Chinese Library Classification
TP31 [Computer Software]
Subject Classification Codes
081202; 0835
Abstract
Code smells slow down the development of software systems and make them harder to maintain. Existing research aims to develop automatic detection algorithms that reduce the labor and time costs of the detection process. Deep learning techniques have recently been shown to recognize code smells better than metric-based heuristic detection algorithms. As large-scale pre-trained models for Programming Languages (PL), such as CodeT5, have lately achieved top results on a variety of downstream tasks, some researchers have begun to explore the use of pre-trained models to extract the contextual semantics of code for code smell detection. However, little research has employed the contextual semantic relationships between code snippets, as captured by pre-trained models, to identify code smells. In this paper, we investigate the use of the pre-trained model CodeT5 to extract semantic relationships between code snippets to detect feature envy, one of the most common code smells. In addition, to examine how semantic relationships extracted by pre-trained models of different architectures perform on feature envy detection, we compare CodeT5 with two other pre-trained models, CodeBERT and CodeGPT. In an experimental evaluation on ten open-source projects, our approach improves the F-measure by 29.32% on feature envy detection and by 16.57% on moving destination recommendation. Using the semantic relationships extracted by several pre-trained models to detect feature envy outperforms the state of the art, which suggests that this kind of semantic relationship is a promising signal for feature envy detection. To enable future research on feature envy detection, we have made all the code and datasets used in this article open source.
Pages: 430-440
Page count: 11
相关论文
共 50 条
  • [21] BSTC: A Fake Review Detection Model Based on a Pre-Trained Language Model and Convolutional Neural Network
    Lu, Junwen
    Zhan, Xintao
    Liu, Guanfeng
    Zhan, Xinrong
    Deng, Xiaolong
    [J]. ELECTRONICS, 2023, 12 (10)
  • [22] Comparing Pre-Trained Language Model for Arabic Hate Speech Detection
    Daouadi, Kheir Eddine
    Boualleg, Yaakoub
    Guehairia, Oussama
    [J]. COMPUTACION Y SISTEMAS, 2024, 28 (02): : 681 - 693
  • [23] Interpretability of Entity Matching Based on Pre-trained Language Model
    Liang, Zheng
    Wang, Hong-Zhi
    Dai, Jia-Jia
    Shao, Xin-Yue
    Ding, Xiao-Ou
    Mu, Tian-Yu
    [J]. Ruan Jian Xue Bao/Journal of Software, 2023, 34 (03): : 1087 - 1108
  • [24] Android Malware Detection Through a Pre-trained Model for Code Understanding
    Garcia-Soto, Eva
    Martin, Alejandro
    Huertas-Tato, Javier
    Camacho, David
    [J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING & AMBIENT INTELLIGENCE (UCAMI 2022), 2023, 594 : 1055 - 1060
  • [25] Glomerulosclerosis detection with pre-trained CNNs ensemble
    Santos, Justino
    Silva, Romuere
    Oliveira, Luciano
    Santos, Washington
    Aldeman, Nayze
    Duarte, Angelo
    Veras, Rodrigo
    [J]. COMPUTATIONAL STATISTICS, 2024, 39 (02) : 561 - 581
  • [26] Glomerulosclerosis detection with pre-trained CNNs ensemble
    Justino Santos
    Romuere Silva
    Luciano Oliveira
    Washington Santos
    Nayze Aldeman
    Angelo Duarte
    Rodrigo Veras
    [J]. Computational Statistics, 2024, 39 : 561 - 581
  • [27] Smart Edge-based Fake News Detection using Pre-trained BERT Model
    Guo, Yuhang
    Lamaazi, Hanane
    Mizouni, Rabeb
    [J]. 2022 18TH INTERNATIONAL CONFERENCE ON WIRELESS AND MOBILE COMPUTING, NETWORKING AND COMMUNICATIONS (WIMOB), 2022,
  • [28] BERT-Log: Anomaly Detection for System Logs Based on Pre-trained Language Model
    Chen, Song
    Liao, Hai
    [J]. APPLIED ARTIFICIAL INTELLIGENCE, 2022, 36 (01)
  • [29] Zero-Shot Out-of-Distribution Detection Based on the Pre-trained Model CLIP
    Esmaeilpour, Sepideh
    Liu, Bing
    Robertson, Eric
    Shu, Lei
    [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 6568 - 6576
  • [30] An integrated model based on deep learning classifiers and pre-trained transformer for phishing URL detection
    Do, Nguyet Quang
    Selamat, Ali
    Fujita, Hamido
    Krejcar, Ondrej
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2024, 161 : 269 - 285