A study of dealing class imbalance problem with machine learning methods for code smell severity detection using PCA-based feature selection technique

Authors
Rajwant Singh Rao
Seema Dewangan
Alok Mishra
Manjari Gupta
Affiliations
[1] Guru Ghasidas Vishwavidyalaya, Department of Computer Science and Information Technology
[2] Norwegian University of Science and Technology, Faculty of Engineering
[3] Banaras Hindu University, (Computer Science), DST
Abstract
Detecting code smells can help reduce maintenance costs and improve source code quality. Code smells help developers and researchers understand several types of design flaws. Code smells with high severity can cause significant problems for the software and can make the system harder to maintain. Assessing the severity of the code smells detected in software is therefore essential, because it helps prioritize refactoring efforts. Class imbalance makes code smell severity detection more difficult. In this study, four code smell severity datasets (Data class, God class, Feature envy, and Long method) are used to detect code smell severity. To address the class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) is applied. The relevant features of each dataset are chosen using a feature selection technique based on principal component analysis (PCA). The severity of code smells is determined using five machine learning techniques: K-nearest neighbor, Random forest, Decision tree, Multi-layer Perceptron, and Logistic Regression. The study obtained a severity accuracy score of 0.99 with the Random forest and Decision tree approaches on the Long method code smell. The severity classification models are compared on accuracy and three other performance measures (Precision, Recall, and F-measure). Performance with and without SMOTE is also compared and presented. The results are promising and may pave the way for further studies in this area.
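The abstract names SMOTE as the class balancing step but does not describe the implementation used in the study. As a rough illustration only, the sketch below shows the core idea of SMOTE in plain Python: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. The function names `smote` and `balance`, the parameter defaults, and the toy feature values and labels are all assumptions made for this example; they are not taken from the paper or its datasets.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, k=5, seed=0):
    """Oversample every class up to the size of the largest class."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        missing = target - len(rows)
        if missing > 0:
            X_out.extend(smote(rows, missing, k=k, seed=seed))
            y_out.extend([label] * missing)
    return X_out, y_out


# Hypothetical toy data: two made-up metrics per row, labels 0 and 1
# standing in for two severity levels (6 rows versus 2 rows).
X = [[10, 1], [12, 2], [11, 1], [13, 2], [14, 1], [12, 1], [90, 9], [95, 8]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = balance(X, y, k=1)
print(len(X_bal), y_bal.count(0), y_bal.count(1))
```

After balancing, both classes have the same number of rows, and every synthetic row lies between two original minority rows. In a pipeline like the one the abstract describes, oversampling of this kind would normally be applied to the training split only, so that synthetic rows do not leak into the evaluation data; whether the study did so is not stated in the abstract.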