How to Improve Deep Learning for Software Analytics (a case study with code smell detection)

Cited by: 6
Authors
Yedida, Rahul [1]
Menzies, Tim [1]
Affiliations
[1] NC State Univ, Dept Comp Sci, Raleigh, NC 27695 USA
Funding
U.S. National Science Foundation
Keywords
code smell detection; deep learning; autoencoders; networks
DOI
10.1145/3524842.3528458
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To reduce technical debt and make code more maintainable, it is important to be able to warn programmers about code smells. State-of-the-art code smell detectors use deep learners, usually without exploring alternatives. One promising alternative is GHOST (from TSE'21), which combines hyper-parameter optimization of feedforward neural networks with a novel oversampling technique called "fuzzy sampling". The TSE'21 study that proposed fuzzy sampling was somewhat limited in that the method was tested on defect prediction and nothing else. Like defect prediction datasets, code smell detection datasets have a class imbalance (the problem that motivated fuzzy sampling). Hence, in this work we test whether fuzzy sampling is useful for code smell detection. The results show that fuzzy oversampling achieves better than state-of-the-art results on code smell detection. For example, we achieved 99+% AUC on all our datasets for "feature envy", and on 8/10 datasets for "misplaced class". While our specific results refer to code smell detection, they suggest lessons for other kinds of analytics: (a) try better preprocessing before trying complex learners; (b) include simpler learners as a baseline in software analytics; (c) try "fuzzy sampling" as one such baseline. To support others trying to reproduce, extend, or refute this work, all our code and data are available online at https://github.com/yrahul3910/code-smell-detection.
Pages: 156-166
Page count: 11