FUZZY AND SMOTE RESAMPLING TECHNIQUE FOR IMBALANCED DATA SETS

Authors
Zorkeflee, Maisarah [1]
Din, Aniza Mohamed [1]
Ku-Mahamud, Ku Ruhana [1]
Affiliations
[1] Univ Utara Malaysia, Kedah, Malaysia
Keywords
imbalanced data; fuzzy logic; fuzzy distance-based undersampling; SMOTE; classification
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Many factors can affect the performance of a classifier. One of them is an imbalanced data set, which can degrade classification accuracy. In binary classification, a classifier often ignores instances of the minority class. Resampling techniques, specifically undersampling and oversampling, are commonly used to overcome the problems related to imbalanced data sets. In this study, an integration of undersampling and oversampling techniques is proposed to improve classification accuracy. The proposed technique integrates Fuzzy Distance-based Undersampling with SMOTE. The findings indicate that the proposed combination is able to produce more balanced data sets, which improves classification accuracy.
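The abstract names the two components but does not give their details, so the following is only a minimal sketch of how an undersampling step and a SMOTE step might be chained. The SMOTE part follows the standard interpolation between a minority instance and one of its k nearest minority neighbours. The undersampling part is an assumption: the membership function (1 at the majority centroid, falling linearly to 0 at the farthest majority instance) and the function names `fuzzy_undersample` and `hybrid_resample` are hypothetical and are not taken from the paper.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def fuzzy_undersample(majority, n_keep):
    """Keep the n_keep majority points with the highest membership.

    ASSUMPTION: the membership form below is illustrative only; the
    paper's actual fuzzy distance-based rule is not given in the abstract.
    """
    c = centroid(majority)
    dists = [euclidean(p, c) for p in majority]
    d_max = max(dists) or 1.0  # avoid division by zero if all points coincide
    memberships = [1.0 - d / d_max for d in dists]
    ranked = sorted(zip(memberships, majority), key=lambda t: t[0], reverse=True)
    return [p for _, p in ranked[:n_keep]]


def smote(minority, n_new, k=3, rng=None):
    """Standard SMOTE interpolation; needs at least two minority points."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen base point
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: euclidean(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def hybrid_resample(majority, minority, target, k=3, seed=0):
    """Shrink the majority class and grow the minority class toward `target`."""
    rng = random.Random(seed)
    kept = fuzzy_undersample(majority, min(target, len(majority)))
    n_new = max(0, target - len(minority))
    return kept, minority + smote(minority, n_new, k, rng)


# Toy data: 40 majority points around (0, 0), 8 minority points around (3, 3).
data_rng = random.Random(42)
majority = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(40)]
minority = [(data_rng.gauss(3, 0.5), data_rng.gauss(3, 0.5)) for _ in range(8)]

kept, boosted = hybrid_resample(majority, minority, target=20)
print(len(kept), len(boosted))  # 20 20
```

Both classes end up at the chosen target size, and every synthetic minority point lies on a segment between two original minority points. Whether the undersampling step should keep central or boundary instances is a design choice that this sketch does not settle.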
Pages: 638-643
Page count: 6