Performance Analysis of Machine Learning Algorithms on Imbalanced Datasets Using SMOTE Technique

Cited by: 0
Authors
Kumar, Bala Santhosh [1 ]
Yadav, Pasupula Praveen [1 ]
Prasad, P. Penchala [1 ]
Affiliations
[1] G Pulla Reddy Engn Coll, Comp Sci & Engn Dept, Kurnool, India
Keywords
Machine Learning; SMOTE; Accuracy;
DOI
10.1007/978-981-97-8031-0_15
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper investigates how the Synthetic Minority Over-sampling Technique (SMOTE) affects the performance of several machine learning algorithms on imbalanced datasets. Imbalanced datasets, in which one class is much more prevalent than the other, are common in real-world applications. The imbalance can bias a model: the majority class dominates its predictions, and the minority class is often misclassified. To address this problem, we applied the SMOTE algorithm to generate synthetic data for the minority class. We evaluated several popular machine learning algorithms, including logistic regression, decision trees, ensemble learning, support vector machines, neural networks, and an AutoML approach, on both the original imbalanced dataset and the SMOTE-augmented dataset. The experimental results demonstrate that using SMOTE significantly improves the accuracy of the machine learning algorithms on imbalanced datasets. In conclusion, our research highlights the importance of considering how imbalanced datasets affect the performance of machine learning algorithms and demonstrates the effectiveness of SMOTE in addressing this issue. Our results can help practitioners working with imbalanced datasets choose an appropriate machine learning algorithm and decide whether to use SMOTE to improve model performance.
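The oversampling step the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the commonly described SMOTE idea of interpolating between a minority sample and one of its k nearest minority neighbours, and the function name `smote`, its parameters, and the toy data are hypothetical. In practice, oversampling is usually applied to the training split only, so that evaluation data contains no synthetic points.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority neighbours (hypothetical helper)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        # Place the new point at a random position on the connecting segment.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data (assumed for illustration): 4 minority vs. 20 majority samples.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority = [(float(x), float(y)) for x in range(4, 9) for y in range(4, 8)]

# Generate enough synthetic samples to match the majority class size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
```

Because each synthetic point lies on a segment between two real minority samples, the new data stays inside the region the minority class already occupies rather than duplicating existing rows.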
Pages: 147-156
Number of pages: 10
Related papers
50 in total
  • [31] Performance Comparison of Machine Learning Algorithms for Imbalanced Class Classification in Hydraulic System
    Joo, Yohan
    Kim, Kyutae
    Jeong, Jongpil
    PROCEEDINGS OF THE 2020 14TH INTERNATIONAL CONFERENCE ON UBIQUITOUS INFORMATION MANAGEMENT AND COMMUNICATION (IMCOM), 2020,
  • [32] Disease Inference on Medical Datasets Using Machine Learning and Deep Learning Algorithms
    Chinnaswamy, Arunkumar
    Srinivasan, Ramakrishnan
    Gaurang, Desai Prutha
    COMPUTATIONAL VISION AND BIO-INSPIRED COMPUTING, 2020, 1108 : 902 - 908
  • [33] Machine Learning with Variational AutoEncoder for Imbalanced Datasets in Intrusion Detection
    Lin, Ying-Dar
    Liu, Zi-Qiang
    Hwang, Ren-Hung
    Nguyen, Van-Linh
    Lin, Po-Ching
    Lai, Yuan-Cheng
    IEEE Access, 2022, 10 : 15247 - 15260
  • [34] Financial Fraud Detection and Prediction in Listed Companies Using SMOTE and Machine Learning Algorithms
    Zhao, Zhihong
    Bai, Tongyuan
    ENTROPY, 2022, 24 (08)
  • [35] Robust predictive framework for diabetes classification using optimized machine learning on imbalanced datasets
    Abousaber, Inam
    Abdallah, Haitham F.
    El-Ghaish, Hany
    FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2025, 7
  • [36] Machine Learning With Variational AutoEncoder for Imbalanced Datasets in Intrusion Detection
    Lin, Ying-Dar
    Liu, Zi-Qiang
    Hwang, Ren-Hung
    Van-Linh Nguyen
    Lin, Po-Ching
    Lai, Yuan-Cheng
    IEEE ACCESS, 2022, 10 : 15247 - 15260
  • [37] Imbalanced data learning using SMOTE and deep learning architecture with optimized features
    Suja A. Alex
    Neural Computing and Applications, 2025, 37 (2) : 967 - 984
  • [38] Prediction of breast cancer using machine learning algorithms on different datasets
    Yavuz, Omer Cagri
    Calp, M. Hanefi
    Erkengel, Hazel Ceren
    INGENIERIA SOLIDARIA, 2023, 19 (01):
  • [39] Using Machine Learning to Predict Effective Compression Algorithms for Heterogeneous Datasets
    Burtchell, Brandon Alexander
    Burtscher, Martin
    2024 DATA COMPRESSION CONFERENCE, DCC, 2024, : 183 - 192
  • [40] Comparison of Machine Learning Algorithms on Different Datasets
    Uysal, Elif
    Ozturk, Ali
    2018 26TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2018,