A Hybrid Sampling Method for Imbalanced Data

Cited by: 0
Authors
Gazzah, Sami [1]
Hechkel, Amina [1]
Ben Amara, Najoua Essoukri [1]
Affiliations
[1] Univ Sousse, Tunisia SAGE, Adv Syst Elect Engn, Natl Engn Sch Sousse, Sousse, Tunisia
Keywords
Imbalanced data sets; Intra-class variations; Data analysis; Principal component analysis; One-against-all SVM; CLASSIFICATION;
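DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract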
As applications diversify and new, challenging tasks emerge in domains such as computer vision, classical machine learning systems often perform poorly when confronted with two common problems: negative training examples that outnumber the positive ones, and large intra-class variations. Both problems degrade system performance. In this work, we propose to improve classification accuracy on imbalanced training data by balancing the training set with a hybrid approach that over-samples the minority class using a "SMOTE star topology" and under-samples the majority class by removing instances considered less relevant. Feature vectors are deleted with respect to intra-class variations, based on a distribution criterion. Experimental results on biometric data show that the proposed approach significantly improves overall performance, measured in terms of true-positive rate.
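The hybrid balancing idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the algorithmic details, so everything below is an assumption: "star topology" is read here as interpolating each minority sample toward its class centroid, and "less relevant" majority instances are taken, purely for illustration, to be those farthest from the majority centroid. The function names `star_oversample`, `distance_undersample`, and `hybrid_balance` are hypothetical and are not taken from the paper.

```python
import numpy as np


def star_oversample(X_min, n_new, rng):
    """Create n_new synthetic points on segments joining minority samples
    to the minority centroid (assumed reading of "star topology")."""
    center = X_min.mean(axis=0)
    idx = rng.integers(0, len(X_min), size=n_new)  # sample to start from
    gaps = rng.random((n_new, 1))                  # position along the segment
    return X_min[idx] + gaps * (center - X_min[idx])


def distance_undersample(X_maj, n_keep):
    """Keep the n_keep majority samples closest to the majority centroid.
    The relevance criterion is an assumption, not the paper's."""
    center = X_maj.mean(axis=0)
    dist = np.linalg.norm(X_maj - center, axis=1)
    keep = np.argsort(dist)[:n_keep]
    return X_maj[keep]


def hybrid_balance(X_min, X_maj, seed=0):
    """Bring both classes to the same size: the midpoint of the two counts."""
    rng = np.random.default_rng(seed)
    target = (len(X_min) + len(X_maj)) // 2
    n_new = max(target - len(X_min), 0)
    X_min_bal = np.vstack([X_min, star_oversample(X_min, n_new, rng)])
    X_maj_bal = distance_undersample(X_maj, min(target, len(X_maj)))
    return X_min_bal, X_maj_bal
```

For example, calling `hybrid_balance` with 10 minority and 90 majority feature vectors would return 50 vectors per class under these assumptions. Meeting at the midpoint is one design choice among several; the paper's actual balancing ratio is not stated in the abstract.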
Pages: 6
Related papers
50 in total
  • [1] Hybrid sampling for imbalanced data
    Seiffert, Chris
    Khoshgoftaar, Taghi M.
    Van Hulse, Jason
    [J]. PROCEEDINGS OF THE 2008 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION, 2008, : 202 - 207
  • [2] Hybrid sampling for imbalanced data
    Seiffert, Chris
    Khoshgoftaar, Taghi M.
    Van Hulse, Jason
    [J]. INTEGRATED COMPUTER-AIDED ENGINEERING, 2009, 16 (03) : 193 - 210
  • [3] Hybrid Sampling Method for Overlap Region of ICS Imbalanced Data
    Gao, Bing
    Gu, Zhaojun
    Zhou, Jingxian
    Sui, He
    [J]. Computer Engineering and Applications, 2023, 59 (19) : 305 - 315
  • [4] HSDP: A Hybrid Sampling Method for Imbalanced Big Data Based on Data Partition
    Chen, Liping
    Jiang, Jiabao
    Zhang, Yong
    [J]. COMPLEXITY, 2021, 2021
  • [5] HSDLM: A Hybrid Sampling With Deep Learning Method for Imbalanced Data Classification
    Hasib, Khan Md
    Towhid, Nurul Akter
    Islam, Md Rafiqul
    [J]. INTERNATIONAL JOURNAL OF CLOUD APPLICATIONS AND COMPUTING, 2021, 11 (04) : 1 - 13
  • [6] A Hybrid Under-Sampling Method (HUSBoost) to Classify Imbalanced Data
    Popel, Mahmudul Hasan
    Hasib, Khan Md
    Habib, Syed Ahsan
    Shah, Faisal Muhammad
    [J]. 2018 21ST INTERNATIONAL CONFERENCE OF COMPUTER AND INFORMATION TECHNOLOGY (ICCIT), 2018,
  • [7] A hybrid sampling method for highly imbalanced and overlapped data classification with complex distribution
    Liu, Yansong
    Zhu, Li
    Ding, Lei
    Sui, He
    Shang, Wenli
    [J]. INFORMATION SCIENCES, 2024, 661
  • [8] A Hybrid Sampling SVM Approach to Imbalanced Data Classification
    Wang, Qiang
    [J]. ABSTRACT AND APPLIED ANALYSIS, 2014,
  • [9] A Constructive Method for Data Reduction and Imbalanced Sampling
    Liu, Fei
    Yan, Yuanting
    [J]. ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2023, PT III, 2024, 14489 : 476 - 489
  • [10] A New Combination Sampling Method for Imbalanced Data
    Li, Hu
    Zou, Peng
    Wang, Xiang
    Xia, Rongze
    [J]. PROCEEDINGS OF 2013 CHINESE INTELLIGENT AUTOMATION CONFERENCE: INTELLIGENT INFORMATION PROCESSING, 2013, 256 : 547 - 554