Machine Learning Techniques for Code Smells Detection: A Systematic Mapping Study

Cited by: 30
Authors
Caram, Frederico Luiz [1]
De Oliveira Rodrigues, Bruno Rafael [1]
Campanelli, Amadeu Silveira [1]
Parreiras, Fernando Silva [1]
Affiliations
[1] FUMEC Univ, LAIS Lab Adv Informat Syst, Av Afonso Pena 3880, BR-30130009 Belo Horizonte, MG, Brazil
Keywords
Machine learning; code smells; refactoring; bad smells; refactoring opportunities; evolution
DOI
10.1142/S021819401950013X
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Code smells, or bad smells, are an accepted means of identifying design flaws in source code. Although they have been explored by researchers, their interpretation by programmers is rather subjective. One way to deal with this subjectivity is to use machine learning techniques. This paper gives an overview of the machine learning techniques and code smells found in the literature, aiming to determine which methods and practices are used when applying machine learning to code smell identification and which machine learning techniques have been used for that purpose. A mapping study was used to identify the techniques used for each smell. We found that Bloaters were the main kind of smell studied, addressed by 35% of the papers. The most commonly used technique was Genetic Algorithms (GA), used by 22.22% of the papers. Regarding the smells addressed by each technique, there was a high level of redundancy: the smells are covered by a wide range of algorithms. Nevertheless, Feature Envy stood out, being targeted by 63% of the techniques. In terms of performance, the best average was achieved by Decision Tree, followed by Random Forest, Semi-supervised and Support Vector Machine Classifier techniques. Five of the 25 analyzed smells were not handled by any machine learning technique. Most techniques focus on several code smells, and in general no technique outperforms the others, except for a few specific smells. We also found a lack of comparable results due to the heterogeneity of the data sources and of the reported results. We recommend further empirical studies that assess the performance of these techniques on a standardized dataset, to improve the reliability of comparisons and replicability.
Pages: 285-316
Page count: 32
Related papers
50 in total
  • [1] Machine learning techniques for code smells detection: an empirical experiment on a highly imbalanced setup
    Luiz, Frederico Caram
    de Oliveira Rodrigues, Bruno Rafael
    Parreiras, Fernando Silva
    [J]. PROCEEDINGS OF THE XV BRAZILIAN SYMPOSIUM ON INFORMATION SYSTEMS, SBSI 2019: Complexity on Modern Information Systems, 2019,
  • [2] Improving accuracy of code smells detection using machine learning with data balancing techniques
    Khleel, Nasraldeen Alnor Adam
    Nehez, Karoly
    [J]. JOURNAL OF SUPERCOMPUTING, 2024, 80 (14): : 21048 - 21093
  • [3] Severity classification of software code smells using machine learning techniques: A comparative study
    Abdou, Ashraf
    Darwish, Nagy
    [J]. JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS, 2024, 36 (01)
  • [4] A systematic mapping study on architectural smells detection
    Mumtaz, Haris
    Singh, Paramvir
    Blincoe, Kelly
    [J]. JOURNAL OF SYSTEMS AND SOFTWARE, 2021, 173
  • [5] Detecting Code Smells using Machine Learning Techniques: Are We There Yet?
    Di Nucci, Dario
    Palomba, Fabio
    Tamburri, Damian A.
    Serebrenik, Alexander
    De Lucia, Andrea
    [J]. 2018 25TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION AND REENGINEERING (SANER 2018), 2018, : 612 - 621
  • [6] Python code smells detection using conventional machine learning models
    Sandouka, Rana
    Aljamaan, Hamoud
    [J]. PeerJ Computer Science, 2023, 9
  • [7] Machine Learning Techniques in Optical Networks: A Systematic Mapping Study
    Villa, Genesis
    Tipantuna, Christian
    Guaman, Danny S.
    Arevalo, German V.
    Arguero, Berenice
    [J]. IEEE ACCESS, 2023, 11 : 98714 - 98750
  • [8] Advanced Machine Learning techniques for fake news (online disinformation) detection: A systematic mapping study
    Choras, Michal
    Demestichas, Konstantinos
    Gielczyk, Agata
    Herrero, Alvaro
    Ksieniewicz, Pawel
    Remoundou, Konstantina
    Urda, Daniel
    Wozniak, Michal
    [J]. APPLIED SOFT COMPUTING, 2021, 101
  • [9] Python code smells detection using conventional machine learning models
    Sandouka, Rana
    Aljamaan, Hamoud
    [J]. PEERJ COMPUTER SCIENCE, 2023, 9
  • [10] Code Smells Enabled by Artificial Intelligence: A Systematic Mapping
    Zaidi, Moayid Ali
    Colomo-Palacios, Ricardo
    [J]. COMPUTATIONAL SCIENCE AND ITS APPLICATIONS, ICCSA 2019, PT IV, 2019, 11622 : 418 - 427