Experimental Comparison of Sampling Techniques for Imbalanced Datasets Using Various Classification Models

Cited by: 5
Authors
Pattanayak, Sanjibani Sudha [1]
Rout, Minakhi [1]
Affiliation
[1] Siksha O Anusandhan Univ, ITER, Bhubaneswar 751030, Odisha, India
Keywords
Sampling techniques; SMOTE; MWMOTE; SVM; RBF; MLP
DOI
10.1007/978-981-10-6875-1_2
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
An imbalanced dataset is one in which the number of samples per class is highly uneven. This makes classification difficult, because the result may be biased toward the dominant (majority) class. Yet misclassifying minority-class samples, which are usually the samples of interest, is far more costly. Among the various approaches studied to address this problem, sampling techniques have been successfully adopted to preprocess imbalanced datasets. This paper presents an experimental comparison of two pioneering sampling techniques, SMOTE and MWMOTE, using the classification models SVM, RBF, and MLP.
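To illustrate the kind of oversampling the abstract refers to, the following is a minimal sketch of SMOTE-style interpolation, not the authors' implementation. It assumes plain Euclidean distance and a hypothetical toy minority set; the function name `smote_like_oversample` and its parameters are illustrative only. MWMOTE, which additionally weights minority samples before generating synthetic ones, is not shown.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: 4 minority points against 12 majority points.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority_count = 12
new_points = smote_like_oversample(minority, majority_count - len(minority))
balanced_minority = minority + new_points
print(len(balanced_minority))  # 12: minority now matches the majority count
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region the minority class already occupies. The balanced set would then be passed to a classifier such as an SVM or MLP.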
Pages: 13-22
Page count: 10