A Modified Borderline Smote with Noise Reduction in Imbalanced Datasets

Cited by: 8
Authors
Revathi, M. [1]
Ramyachitra, D. [1]
Affiliation
[1] Bharathiar Univ, Dept Comp Sci, Coimbatore 641046, Tamil Nadu, India
Keywords
Imbalanced data; Noise reduction; Oversampling; NRBSID; FEATURE-SELECTION; DATA-SETS; CLASSIFICATION; CHALLENGES; ALGORITHM;
DOI
10.1007/s11277-021-08690-y
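Chinese Library Classification number
TN [Electronic technology, communication technology];
Discipline classification code
0809;
Abstract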
In the real world, noisy data poses major challenges for data mining. Traditional classification methods have proven inadequate for assessing the efficacy of data mining methods on noisy and imbalanced data. Therefore, preprocessing the imbalanced data is necessary before classification. However, it is difficult to arrive at an appropriate classifier for the minority class in imbalanced data. This paper proposes a hybrid of two techniques, noise reduction and oversampling, that oversamples, or strengthens, only the borderline minority class. The proposed technique is applied to 49 datasets at several imbalance ratios. Decision Tree, Gaussian Naive Bayes, Logistic Regression, Neural Network, non-linear SVM, Random Forest, and linear-kernel SVM classifiers are used to validate it experimentally. The experimental results show that the proposed oversampling method gives more accurate results on imbalanced data than the random oversampling approach.
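The abstract does not spell out the steps of the proposed method, so the following is only a rough sketch of the general idea it names: filter noisy minority samples, then oversample only the borderline ones. It uses the well-known Borderline-SMOTE heuristic, not the authors' NRBSID procedure. The function name, the parameters `k`, `n_new`, and `seed`, and the noise rule (a minority sample whose `k` nearest neighbours all belong to the majority class) are assumptions made for illustration.

```python
import numpy as np


def noise_filtered_borderline_oversample(X, y, minority=1, k=5, n_new=None, seed=0):
    """Illustrative sketch: kNN noise filter + Borderline-SMOTE-style interpolation."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Pairwise Euclidean distances; a sample is never its own neighbour.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    nn = np.argsort(dist, axis=1)[:, :k]

    is_min = y == minority
    maj_count = (~is_min[nn]).sum(axis=1)  # majority neighbours per sample

    # Assumed rules: all-majority neighbourhood -> noise;
    # at least half (but not all) majority neighbours -> borderline.
    noise = is_min & (maj_count == k)
    danger = is_min & (maj_count >= k / 2) & (maj_count < k)

    keep = ~noise
    kept_min_idx = np.flatnonzero(is_min & keep)
    danger_idx = np.flatnonzero(danger)
    if n_new is None:
        # Default: add enough samples to match the majority class size.
        n_new = int((~is_min).sum()) - kept_min_idx.size

    if n_new <= 0 or danger_idx.size == 0 or kept_min_idx.size < 2:
        return X[keep], y[keep]

    synthetic = np.empty((n_new, X.shape[1]))
    for row in range(n_new):
        i = rng.choice(danger_idx)
        # Nearest retained minority neighbours of the borderline sample.
        order = np.argsort(dist[i, kept_min_idx])[:k]
        neigh = kept_min_idx[order]
        neigh = neigh[np.isfinite(dist[i, neigh])]  # drop the sample itself
        j = rng.choice(neigh)
        # Interpolate at a random position on the segment between i and j.
        synthetic[row] = X[i] + rng.random() * (X[j] - X[i])

    X_res = np.vstack([X[keep], synthetic])
    y_res = np.concatenate([y[keep], np.full(n_new, minority, dtype=y.dtype)])
    return X_res, y_res
```

The returned arrays hold the original samples minus the filtered noise, followed by the synthetic minority samples. The full distance matrix keeps the sketch short but scales quadratically with the number of samples, so a real implementation would likely use a nearest-neighbour index instead.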
Pages: 1659 - 1680
Number of pages: 22
Related papers
50 in total
  • [1] A Modified Borderline Smote with Noise Reduction in Imbalanced Datasets
    M. Revathi
    D. Ramyachitra
    [J]. Wireless Personal Communications, 2021, 121 : 1659 - 1680
  • [2] Analysis of SMOTE: Modified for Diverse Imbalanced Datasets Under the IoT Environment
    Bansal, Ankita
    Saini, Makul
    Singh, Rakshit
    Yadav, Jai Kumar
    [J]. INTERNATIONAL JOURNAL OF INFORMATION RETRIEVAL RESEARCH, 2021, 11 (02) : 15 - 37
  • [3] PF-SMOTE: A novel parameter-free SMOTE for imbalanced datasets
    Chen, Qiong
    Zhang, Zhong-Liang
    Huang, Wen-Po
    Wu, Jian
    Luo, Xing-Gang
    [J]. NEUROCOMPUTING, 2022, 498 : 75 - 88
  • [4] A-SMOTE: A New Preprocessing Approach for Highly Imbalanced Datasets by Improving SMOTE
    Ahmed Saad Hussein
    Tianrui Li
    Chubato Wondaferaw Yohannese
    Kamal Bashir
    [J]. International Journal of Computational Intelligence Systems, 2019, 12 : 1412 - 1422
  • [5] A-SMOTE: A New Preprocessing Approach for Highly Imbalanced Datasets by Improving SMOTE
    Hussein, Ahmed Saad
    Li, Tianrui
    Yohannese, Chubato Wondaferaw
    Bashir, Kamal
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2019, 12 (02) : 1412 - 1422
  • [6] Geometric SMOTE for imbalanced datasets with nominal and continuous features
    Fonseca, Joao
    Bacao, Fernando
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2023, 234
  • [7] Learning imbalanced datasets based on SMOTE and Gaussian distribution
    Pan, Tingting
    Zhao, Junhong
    Wu, Wei
    Yang, Jie
    [J]. INFORMATION SCIENCES, 2020, 512 : 1214 - 1233
  • [8] Kernel-Based SMOTE for SVM Classification of Imbalanced Datasets
    Mathew, Josey
    Luo, Ming
    Pang, Chee Khiang
    Chan, Hian Leng
    [J]. IECON 2015 - 41ST ANNUAL CONFERENCE OF THE IEEE INDUSTRIAL ELECTRONICS SOCIETY, 2015 : 1127 - 1132
  • [9] Applying Threshold SMOTE Algorithm with Attribute Bagging to Imbalanced Datasets
    Wang, Jin
    Yun, Bo
    Huang, Pingli
    Liu, Yu-Ao
    [J]. ROUGH SETS AND KNOWLEDGE TECHNOLOGY: 8TH INTERNATIONAL CONFERENCE, 2013, 8171 : 221 - 228
  • [10] Combination Approach of SMOTE and Biased-SVM for Imbalanced Datasets
    Wang He-Yong
    [J]. 2008 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-8, 2008 : 228 - 231