Pre-trained Model Based Feature Envy Detection

Cited by: 5

Authors:

Ma, Wenhao [1]

Yu, Yaoxiang [1]

Ruan, Xiaoming [1]

Cai, Bo [1]

Affiliations:

[1] Wuhan Univ, Sch Cyber Sci & Engn, Key Lab Aerosp Informat Secur & Trusted Comp, Minist Educ, Wuhan, Peoples R China

Source:

2023 IEEE/ACM 20TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES, MSR | 2023

Keywords:

Feature Envy; Deep Learning; Software Refactoring; Pre-trained Model; Code Smell; CODE;

DOI:

10.1109/MSR59073.2023.00065

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Code smells slow down software system development and make systems harder to maintain. Existing research aims to develop automatic detection algorithms to reduce the labor and time costs of the detection process. Deep learning techniques have recently been shown to recognize code smells more accurately than metric-based heuristic detection algorithms. As large-scale pre-trained models for Programming Languages (PL), such as CodeT5, have lately achieved top results in a variety of downstream tasks, some researchers have begun to explore the use of pre-trained models to extract the contextual semantics of code to detect code smells. However, little research has used the contextual semantic relationships between code snippets obtained by pre-trained models to identify code smells. In this paper, we investigate the use of the pre-trained model CodeT5 to extract semantic relationships between code snippets to detect feature envy, which is one of the most common code smells. In addition, to investigate how the semantic relationships extracted by pre-trained models of different architectures perform in detecting feature envy, we compare CodeT5 with two other pre-trained models, CodeBERT and CodeGPT. We performed our experimental evaluation on ten open-source projects; our approach improves F-measure by 29.32% on feature envy detection and 16.57% on moving destination recommendation. Using semantic relations extracted by several pre-trained models to detect feature envy outperforms the state of the art, which shows that using these semantic relations to detect feature envy is promising. To enable future research on feature envy detection, we have made all the code and datasets used in this article open source.


Pages: 430-440

Page count: 11

50 records in total

[1] Software Vulnerabilities Detection Based on a Pre-trained Language Model
Xu, Wenlin
Li, Tong
Wang, Jinsong
Duan, Haibo
Tang, Yahui
2023 IEEE 22ND INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS, TRUSTCOM, BIGDATASE, CSE, EUC, ISCI 2023, 2024, : 904 - 911
[2] Web-FTP: A Feature Transferring-Based Pre-Trained Model for Web Attack Detection
Guo, Zhenyu
Shang, Qinghua
Li, Xin
Li, Chengyi
Zhang, Zijian
Zhang, Zhuo
Hu, Jingjing
An, Jincheng
Huang, Chuanming
Chen, Yang
Cai, Yuguang
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2025, 37 (03) : 1495 - 1507
[3] Continual Learning with Bayesian Model Based on a Fixed Pre-trained Feature Extractor
Yang, Yang
Cui, Zhiying
Xu, Junjie
Zhong, Changhong
Wang, Ruixuan
Zheng, Wei-Shi
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT V, 2021, 12905 : 397 - 406
[4] Continual learning with Bayesian model based on a fixed pre-trained feature extractor
Yang, Yang
Cui, Zhiying
Xu, Junjie
Zhong, Changhong
Zheng, Wei-Shi
Wang, Ruixuan
Visual Intelligence, 1 (1):
[5] Data Augmentation Based on Pre-trained Language Model for Event Detection
Zhang, Meng
Xie, Zhiwen
Liu, Jin
CCKS 2021 - EVALUATION TRACK, 2022, 1553 : 59 - 68
[6] Detection of Chinese Deceptive Reviews Based on Pre-Trained Language Model
Weng, Chia-Hsien
Lin, Kuan-Cheng
Ying, Jia-Ching
APPLIED SCIENCES-BASEL, 2022, 12 (07):
[7] Pre-trained convolutional neural networks as feature extractors for tuberculosis detection
Lopes, U. K.
Valiati, J. F.
COMPUTERS IN BIOLOGY AND MEDICINE, 2017, 89 : 135 - 143
[8] Style Change Detection: Method Based On Pre-trained Model And Similarity Recognition
Foshan University, Foshan, China
CEUR Workshop Proc., (2526-2531):
[9] Detection of Unstructured Sensitive Data Based on a Pre-Trained Model and Lattice Transformer
Jin, Feng
Wu, Shaozhi
Liu, Xingang
Su, Han
Tian, Miao
2024 7TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND BIG DATA, ICAIBD 2024, 2024, : 180 - 185
[10] Feature Mixture on Pre-Trained Model for Few-Shot Learning
Wang, Shuo
Lu, Jinda
Xu, Haiyang
Hao, Yanbin
He, Xiangnan
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 4104 - 4115
