A study of dealing class imbalance problem with machine learning methods for code smell severity detection using PCA-based feature selection technique

Cited by: 0
Authors
Rajwant Singh Rao
Seema Dewangan
Alok Mishra
Manjari Gupta
Affiliations
[1] Guru Ghasidas Vishwavidyalaya, Department of Computer Science and Information Technology
[2] Norwegian University of Science and Technology, Faculty of Engineering
[3] Banaras Hindu University, (Computer Science), DST
Abstract
Detecting code smells can be highly helpful for reducing maintenance costs and raising source code quality. Code smells help developers and researchers understand several types of design flaws. High-severity code smells can cause significant problems for the software and make the system harder to maintain. Assessing the severity of the code smells detected in software is therefore essential, because it allows refactoring efforts to be prioritized. Class imbalance further increases the difficulty of code smell severity detection. In this study, four code smell severity datasets (Data class, God class, Feature envy, and Long method) are used to detect code smell severity. To address class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) is applied. The relevant features of each dataset are chosen using a feature selection technique based on principal component analysis (PCA). Severity is predicted using five machine learning techniques: K-nearest neighbor, Random forest, Decision tree, Multi-layer Perceptron, and Logistic Regression. The Random forest and Decision tree approaches achieved a severity accuracy score of 0.99 on the Long method code smell. The severity classification models are compared on accuracy and three other performance measures (Precision, Recall, and F-measure). Performance with and without SMOTE is also compared and presented. The results are promising and may pave the way for further studies in this area.
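The abstract names SMOTE as the class balancing step but does not describe the implementation used. As a rough illustration of the idea only, the sketch below creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The function names `smote` and `balance`, the default `k=5`, the fixed seed, and the toy data are all assumptions made for this example and are not taken from the study.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points from one class (hypothetical helper)."""
    if len(minority) < 2:
        raise ValueError("need at least two samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest same-class neighbours of the base point (Euclidean distance)
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # New point lies on the segment between base and the chosen neighbour
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, k=5, seed=0):
    """Oversample every class up to the size of the largest class."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        if len(rows) < target:
            extra = smote(rows, target - len(rows), k=k, seed=seed)
            X_out.extend(extra)
            y_out.extend([label] * len(extra))
    return X_out, y_out


# Toy data (invented): six samples of class 0, three of class 1
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.1], [0.2, 0.3],
     [5.0, 5.0], [5.2, 5.1], [5.1, 5.3]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]
X_bal, y_bal = balance(X, y)
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the data used to measure accuracy, Precision, Recall, and F-measure.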