Experimental Comparison of Sampling Techniques for Imbalanced Datasets Using Various Classification Models

Cited by: 5
Authors
Pattanayak, Sanjibani Sudha [1]
Rout, Minakhi [1]
Affiliations
[1] Siksha O Anusandhan Univ, ITER, Bhubaneswar 751030, Odisha, India
Keywords
Sampling techniques; SMOTE; MWMOTE; SVM; RBF; MLP
DOI
10.1007/978-981-10-6875-1_2
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
An imbalanced dataset is one in which the number of samples per class is highly uneven. This makes classification difficult, because the result may be biased toward the dominant class. Yet misclassifying minority-class samples, which are usually the samples of interest, is far more costly. Among the many approaches studied for this problem, sampling techniques have been successfully adopted to preprocess imbalanced datasets. This paper presents an experimental comparison of two pioneering sampling techniques, SMOTE and MWMOTE, using the classification models SVM, RBF, and MLP.
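The abstract names SMOTE but does not describe it. As a rough illustration only, the sketch below shows the general idea usually associated with SMOTE: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is not the authors' code. The function name `smote_sketch`, its parameters, and the toy data are all hypothetical, and MWMOTE's weighting scheme is not shown.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the starting point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest other minority samples (Euclidean distance).
        others = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(others)]
        # Interpolate: gap in [0, 1) places the new point on the segment.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class: the four corners of the unit square (assumed data).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(minority, n_new=6, k=2)
```

In a comparison like the one the abstract describes, the synthetic points would be appended to the training set before fitting each classifier; the test set would normally be left unsampled so that the evaluation reflects the original class distribution.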
Pages: 13-22
Page count: 10
Related papers
50 in total
  • [41] Imbalanced data classification: Using transfer learning and active sampling
    Liu, Yang
    Yang, Guoping
    Qiao, Shaojie
    Liu, Meiqi
    Qu, Lulu
    Han, Nan
    Wu, Tao
    Yuan, Guan
    Peng, Yuzhong
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 117
  • [42] Comparison of machine learning techniques to handle imbalanced COVID-19 CBC datasets
    Dorn, Marcio
    Grisci, Bruno Iochins
    Narloch, Pedro Henrique
    Feltes, Bruno Cesar
    Avila, Eduardo
    Kahmann, Alessandro
    Alho, Clarice Sampaio
    PEERJ COMPUTER SCIENCE, 2021, 7 : 1 - 34
  • [43] Using Experimental Design to Determine the Re-Sampling Strategy for Developing a Classification Model for Imbalanced Data
    Tong, Lee-Ing
    Chang, Yung-Chia
    Lin, Shan-Hui
    PROCEEDINGS OF THE EIGHTH INTERNATIONAL CONFERENCE ON INFORMATION AND MANAGEMENT SCIENCES, 2009, 8 : 646 - 648
  • [44] Hyperspectral Imbalanced Datasets Classification Using Filter-Based Forest Methods
    Khosravi, Iman
    Jouybari-Moghaddam, Yaser
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2019, 12 (12) : 4766 - 4772
  • [45] Three-Stage Sampling Algorithm for Highly Imbalanced Multi-Classification Time Series Datasets
    Wang, Haoming
    SYMMETRY-BASEL, 2023, 15 (10)
  • [46] Fuzzy Aggregation for Rule Selection in Imbalanced Datasets Classification using Choquet Integral
    Abdellatif, Safa
    Ben Hassine, Mohamed Ali
    Ben Yahia, Sadok
    Bouzeghoub, Amel
    2018 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2018
  • [47] Binary classification for imbalanced datasets using twin hyperspheres based on conformal method
    Zheng, Jian
    Li, Lin
    Wang, Shiyan
    Yan, Huyong
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2024, 27 (08) : 11299 - 11315
  • [48] Using Voronoi diagrams to improve classification performances when modeling imbalanced datasets
    Young, William A., II
    Nykl, Scott L.
    Weckman, Gary R.
    Chelberg, David M.
    NEURAL COMPUTING & APPLICATIONS, 2015, 26 (05) : 1041 - 1054
  • [49] Using Voronoi diagrams to improve classification performances when modeling imbalanced datasets
    William A. Young
    Scott L. Nykl
    Gary R. Weckman
    David M. Chelberg
    Neural Computing and Applications, 2015, 26 : 1041 - 1054
  • [50] Using a Many-Objective Optimization Algorithm to Select Sampling Approaches for Imbalanced Datasets
    Miranda, Pericles B. C.
    Morais, Romero F. A. B.
    Silva, Ricardo M. A.
    2018 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2018 : 2324 - 2330