Python code smells detection using conventional machine learning models

Cited by: 5
Authors
Sandouka, Rana [1]
Aljamaan, Hamoud [1]
Affiliations
[1] King Fahd Univ Petr & Minerals, Informat & Comp Sci Dept, Dhahran, Saudi Arabia
Keywords
Python; Code smell; Detection; Machine learning; Large class; Long method
DOI
10.7717/peerj-cs.1370
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Code smells are poor design or implementation choices that hinder code maintenance and reduce software quality, so detecting them is an important part of software development. Recent studies have used machine learning algorithms for code smell detection, but most of them rely on code smell datasets built from Java code. This article proposes a Python code smell dataset for the Large Class and Long Method code smells. The dataset contains 1,000 samples for each code smell, with 18 features extracted from the source code. We also investigated the detection performance of six machine learning models as baselines for Python code smell detection. The baselines were evaluated using Accuracy and the Matthews correlation coefficient (MCC). The results indicate that the Random Forest ensemble performed best for Python Large Class detection, achieving the highest MCC of 0.77, while the decision tree performed best for Python Long Method detection, achieving the highest MCC of 0.89.
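The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes binary labels (1 = smelly, 0 = not smelly); the function names and the toy label vectors are hypothetical and are not taken from the article or its dataset.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = smelly, 0 = clean)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of samples classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in the range [-1, 1].

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(*counts))
print("MCC:", mcc(*counts))
```

Unlike Accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when one class is much more frequent than the other.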
Pages: 21
Related papers
50 in total
  • [1] Python code smells detection using conventional machine learning models
    Sandouka, Rana
    Aljamaan, Hamoud
    [J]. PeerJ Computer Science, 2023, 9
  • [2] Machine Learning Techniques For Python Source Code Vulnerability Detection
    Farasat, Talaya
    Posegga, Joachim
    [J]. PROCEEDINGS OF THE FOURTEENTH ACM CONFERENCE ON DATA AND APPLICATION SECURITY AND PRIVACY, CODASPY 2024, 2024 : 151 - 153
  • [3] Detecting Code Smells in Python Programs
    Chen, Zhifei
    Chen, Lin
    Ma, Wanwangying
    Xu, Baowen
    [J]. 2016 INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, TESTING AND EVOLUTION (SATE 2016), 2016 : 18 - 23
  • [4] A Severity Assessment of Python Code Smells
    Gupta, Aakanshi
    Gandhi, Rashmi
    Jatana, Nishtha
    Jatain, Divya
    Panda, Sandeep Kumar
    Ramesh, Janjhyam Venkata Naga
    [J]. IEEE ACCESS, 2023, 11 : 119146 - 119160
  • [5] The Raise of Machine Learning Hyperparameter Constraints in Python Code
    Rak-amnouykit, Ingkarat
    Milanova, Ana
    Baudart, Guillaume
    Hirzel, Martin
    Dolby, Julian
    [J]. PROCEEDINGS OF THE 31ST ACM SIGSOFT INTERNATIONAL SYMPOSIUM ON SOFTWARE TESTING AND ANALYSIS, ISSTA 2022, 2022 : 580 - 592
  • [6] An extensive study of the effects of different deep learning models on code vulnerability detection in Python code
    Wang, Rongcun
    Xu, Senlei
    Ji, Xingyu
    Tian, Yuan
    Gong, Lina
    Wang, Ke
    [J]. AUTOMATED SOFTWARE ENGINEERING, 2024, 31 (01)
  • [7] Machine learning using Stata/Python
    Cerulli, Giovanni
    [J]. STATA JOURNAL, 2022, 22 (04) : 772 - 810
  • [8] Raising a Model for Fake News Detection Using Machine Learning in Python
    Rolong Agudelo, Gerardo Ernesto
    Salcedo Parra, Octavio Jose
    Baron Velandia, Julio
    [J]. CHALLENGES AND OPPORTUNITIES IN THE DIGITAL ERA, 2018, 11195 : 596 - 604
  • [9] Univariate machine learning models applied in photovoltaic power prediction using Python
    Bahanni, Caouthar
    Mabrouki, Mustapha
    [J]. ENERGY SOURCES PART A-RECOVERY UTILIZATION AND ENVIRONMENTAL EFFECTS, 2023, 45 (01) : 589 - 607
  • [10] An Introduction to Machine Learning in Python
    Clevert, D. -A.
    [J]. TOXICOLOGY LETTERS, 2023, 384 : S5 - S5