Performance Analysis of Machine Learning Algorithms on Imbalanced Datasets Using SMOTE Technique

Authors
Kumar, Bala Santhosh [1]
Yadav, Pasupula Praveen [1]
Prasad, P. Penchala [1]
Affiliations
[1] G Pulla Reddy Engn Coll, Comp Sci & Engn Dept, Kurnool, India
Keywords
Machine Learning; SMOTE; Accuracy
DOI
10.1007/978-981-97-8031-0_15
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper investigates how the Synthetic Minority Over-sampling Technique (SMOTE) affects the performance of several machine learning algorithms on an imbalanced dataset. Imbalanced datasets, in which one class is much more prevalent than the other, are common in real-world applications. The imbalance can bias a model: the majority class dominates its predictions, and the minority class is often misclassified. To address this problem, we applied SMOTE to generate synthetic samples for the minority class. We evaluated several popular machine learning approaches, including logistic regression, decision trees, ensemble learning, support vector machines, neural networks, and an AutoML approach, on both the original imbalanced dataset and the SMOTE-augmented dataset. The experimental results show that SMOTE significantly improves the accuracy of the machine learning algorithms on imbalanced datasets. Our research highlights the importance of accounting for class imbalance when assessing the performance of machine learning algorithms and demonstrates the effectiveness of SMOTE in addressing it. The results can help practitioners working with imbalanced datasets choose an appropriate machine learning algorithm and decide whether to use SMOTE to improve their model's performance.
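The abstract does not describe the authors' implementation, so the following is only a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its k nearest minority-class neighbours. The function name `smote`, its parameters, and the toy Gaussian data are assumptions made for illustration; they are not taken from the paper. The sketch uses only the Python standard library and picks base samples at random rather than iterating over every minority sample.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic new points interpolated between minority samples.

    Hypothetical helper for illustration only; not the authors' code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest neighbours among the other minority samples.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Interpolate: base + gap * (neighbour - base), with gap in [0, 1).
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy imbalanced data (assumed for the example): 90 majority vs 10 minority points.
data_rng = random.Random(42)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(90)]
minority = [(data_rng.gauss(3, 0.5), data_rng.gauss(3, 0.5)) for _ in range(10)]

# Generate enough synthetic minority points to match the majority class size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints "90 90"
```

In practice, oversampling of this kind is usually applied to the training split only, so that the test set keeps the original class distribution; whether the paper followed this protocol is not stated in the abstract.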
Pages: 147-156
Number of pages: 10