A New Hybrid Sampling Approach for Classification of Imbalanced Datasets

Cited by: 0
Author
Hanskunatai, Anantaporn [1]
Affiliation
[1] King Mongkuts Inst Technol Ladkrabang, Dept Comp Sci, Adv Artificial Intelligence Res Lab, Bangkok 10520, Thailand
Keywords
imbalanced dataset; SMOTE; DBSCAN; hybrid sampling; decision tree; naive bayes;
DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract
We live in a data-driven era. Many organizations around the world, in sectors including banking, industry, commerce, and medicine, aim to extract knowledge from huge amounts of data. However, most real-world datasets suffer from class imbalance. This paper presents a new algorithm for handling imbalanced classification. The proposed technique is a hybrid sampling approach that combines a well-known oversampling algorithm, SMOTE, with an undersampling technique that removes ambiguous instances from the majority class. The experimental results show that the new hybrid sampling method yields better predictive performance in terms of F-measure than other sampling techniques. In addition, compared with the original dataset, it improves F-measure by up to 59.73% with decision tree learning and by up to 412.26% with a naive Bayes classifier.
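The hybrid idea described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not give the order of the two steps or the exact rule for calling an instance ambiguous (the keywords suggest DBSCAN is involved). The sketch below assumes a simple nearest-neighbour vote as a stand-in for the ambiguity test, removes ambiguous majority instances first, and then oversamples the minority class with SMOTE-style interpolation. All function names, the parameter `k`, and the toy data are hypothetical.

```python
import math
import random

def smote_like(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points, each interpolated between a minority
    point and one of its k nearest minority neighbours (the SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority points to interpolate")
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = minority[:i] + minority[i + 1:]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

def drop_ambiguous_majority(majority, minority, k=3):
    """Assumed ambiguity rule (NOT the paper's): drop a majority point when
    at least half of its k nearest neighbours belong to the minority class."""
    labelled = [(p, 0) for p in majority] + [(p, 1) for p in minority]
    kept = []
    for i, p in enumerate(majority):
        others = labelled[:i] + labelled[i + 1:]
        nearest = sorted(others, key=lambda item: math.dist(p, item[0]))[:k]
        minority_votes = sum(label for _, label in nearest)
        if 2 * minority_votes < len(nearest):
            kept.append(p)
    return kept

def hybrid_sample(majority, minority, k=3, seed=0):
    """Undersample ambiguous majority points, then oversample the minority
    class until both classes have the same size (assumed ordering)."""
    kept = drop_ambiguous_majority(majority, minority, k)
    n_new = max(0, len(kept) - len(minority))
    synthetic = smote_like(minority, n_new, k, random.Random(seed))
    return kept, minority + synthetic

# Toy 2-D data: the last two majority points sit inside the minority region.
majority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2),
            (0.2, 0.4), (0.4, 0.1), (2.0, 2.1), (2.1, 1.9)]
minority = [(2.0, 2.0), (2.2, 2.1), (1.9, 2.2)]

balanced_majority, balanced_minority = hybrid_sample(majority, minority)
print(len(balanced_majority), len(balanced_minority))
```

In practice one would likely use library implementations (for example SMOTE from imbalanced-learn and DBSCAN from scikit-learn) rather than these hand-written helpers; the stdlib-only version is used here only to keep the sketch self-contained.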
Pages: 67-71
Number of pages: 5
Related papers
50 in total
  • [1] A New Hybrid Under-sampling Approach to Imbalanced Classification Problems
    Peng, Chun-Yang
    Park, You-Jin
    [J]. APPLIED ARTIFICIAL INTELLIGENCE, 2022, 36 (01)
  • [2] A Hybrid Sampling SVM Approach to Imbalanced Data Classification
    Wang, Qiang
    [J]. ABSTRACT AND APPLIED ANALYSIS, 2014,
  • [3] ARCID: A New Approach to Deal with Imbalanced Datasets Classification
    Abdellatif, Safa
    Ben Hassine, Mohamed Ali
    Ben Yahia, Sadok
    Bouzeghoub, Amel
    [J]. SOFSEM 2018: THEORY AND PRACTICE OF COMPUTER SCIENCE, 2018, 10706 : 569 - 580
  • [4] CLUS: A New Hybrid Sampling Classification for Imbalanced Data
    Prachuabsupakij, Wanthanee
    [J]. PROCEEDINGS OF THE 2015 12TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING (JCSSE), 2015, : 281 - 286
  • [5] A Hybrid Approach Handling Imbalanced Datasets
    Soda, Paolo
    [J]. IMAGE ANALYSIS AND PROCESSING - ICIAP 2009, PROCEEDINGS, 2009, 5716 : 209 - 218
  • [6] A cluster-based hybrid sampling approach for imbalanced data classification
    Feng, Shou
    Zhao, Chunhui
    Fu, Ping
    [J]. REVIEW OF SCIENTIFIC INSTRUMENTS, 2020, 91 (05):
  • [7] Empirical Study of Sampling Methods for Classification in Imbalanced Clinical Datasets
    Kasem, Asem
    Ghaibeh, A. Ammar
    Moriguchi, Hiroki
    [J]. COMPUTATIONAL INTELLIGENCE IN INFORMATION SYSTEMS, CIIS 2016, 2017, 532 : 152 - 162
  • [8] Balanced Sampling Meets Imbalanced Datasets in SAR Image Classification
    Jahan, Chowdhury Sadman
    Savakis, Andreas
    [J]. GEOSPATIAL INFORMATICS XIII, 2023, 12525
  • [9] A New Sampling Approach for Classification of Imbalanced Data sets with High Density
    Jia Pengfei
    Zhang Chunkai
    He Zhenyu
    [J]. 2014 INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2014, : 217 - 222
  • [10] An Evolutionary Sampling Approach for Classification with Imbalanced Data
    Fernandes, Everlandio R. Q.
    de Carvalho, Andre C. P. L. F.
    Coelho, Andre L. V.
    [J]. 2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2015,