Pre-trained Model Based Feature Envy Detection

Cited by: 1
Authors
Ma, Wenhao [1]
Yu, Yaoxiang [1]
Ruan, Xiaoming [1]
Cai, Bo [1]
Affiliations
[1] Wuhan Univ, Sch Cyber Sci & Engn, Key Lab Aerosp Informat Secur & Trusted Comp, Minist Educ, Wuhan, Peoples R China
Keywords
Feature Envy; Deep Learning; Software Refactoring; Pre-trained Model; Code Smell; CODE;
DOI
10.1109/MSR59073.2023.00065
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Code smells slow down software development and make systems harder to maintain. Existing research aims to develop automatic detection algorithms that reduce the labor and time costs of detection. Deep learning techniques have recently been shown to recognize code smells more accurately than metric-based heuristic detection algorithms. As large-scale pre-trained models for programming languages (PL), such as CodeT5, have recently achieved top results on a variety of downstream tasks, some researchers have begun to explore using pre-trained models to extract the contextual semantics of code for code smell detection. However, little research has used the contextual semantic relationships between code snippets, as obtained by pre-trained models, to identify code smells. In this paper, we investigate the use of the pre-trained model CodeT5 to extract semantic relationships between code snippets in order to detect feature envy, one of the most common code smells. In addition, to investigate how semantic relationships extracted by pre-trained models of different architectures perform on feature envy detection, we compare CodeT5 with two other pre-trained models, CodeBERT and CodeGPT. We evaluated our approach on ten open-source projects; it improves F-measure by 29.32% on feature envy detection and by 16.57% on moving destination recommendation. Using semantic relationships extracted by several pre-trained models to detect feature envy outperforms the state of the art, which shows that using such semantic relationships to detect feature envy is promising. To enable future research on feature envy detection, we have made all the code and datasets used in this article open source.
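The abstract reports its results as F-measure. As a reading aid, the sketch below computes the standard F-measure (the harmonic mean of precision and recall) from detection counts. It is a minimal illustration and not the authors' evaluation code: the function name f_measure and the counts are hypothetical, and the abstract does not say whether the reported improvements are absolute or relative.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Return the F-measure (F1): the harmonic mean of precision and recall.

    tp, fp, fn are counts of true positives, false positives, and
    false negatives for a detector (e.g. methods flagged as feature envy).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0  # no correct detections at all
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only; not taken from the paper.
score = f_measure(tp=8, fp=2, fn=2)
print(f"F-measure: {score:.2f}")
```

With these made-up counts, precision and recall are both 0.8, so the F-measure is also about 0.8.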
Pages: 430-440
Page count: 11