Binary Imbalanced Data Classification Based on Modified D2GAN Oversampling and Classifier Fusion

Cited by: 3
Authors
Zhai, Junhai [1]
Qi, Jiaxing [1]
Zhang, Sufang [2]
Affiliations
[1] Hebei Univ, Coll Math & Informat Sci, Hebei Key Lab Machine Learning & Computat Intelli, Baoding 071002, Peoples R China
[2] China Meteorol Adm, Hebei Branch China Meteorol Adm Training Ctr, Baoding 071000, Peoples R China
Keywords
Gallium nitride; Generative adversarial networks; Generators; Training; Diversity methods; Data models; Machine learning; Binary class imbalance; diversity oversampling; generative adversarial network; classifier fusion; fuzzy integral; SMOTE; ENSEMBLE; PREDICTION; MACHINE;
DOI
10.1109/ACCESS.2020.3023949
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The binary imbalance problem refers to a classification scenario in which one class contains a large number of samples while the other contains only a few. When traditional classifiers are faced with imbalanced datasets, they are usually biased towards the majority class, resulting in poor classification performance. Oversampling is an effective method to address this problem, yet how to conduct diversity oversampling remains a challenge. In this article, we propose a diversity oversampling method based on a modified D2GAN model and, building on it, a binary imbalanced data classification approach based on classifier fusion by fuzzy integral. Extensive experiments are conducted on 8 data sets to compare the proposed methods with 7 state-of-the-art methods on 5 aspects: MMD-score, Silhouette-score, F-measure, G-means, and AUC-area. The 7 methods include 4 SMOTE-related approaches and 3 GAN-related approaches. The experimental results demonstrate that the proposed methods are more effective and efficient than the compared approaches.
Pages: 169456 - 169469
Number of pages: 14
Related papers
50 in total
  • [41] Clustering-based Binary-class Classification for Imbalanced Data Sets
    Chen, Chao
    Shyu, Mei-Ling
    2011 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IRI), 2011, : 384 - 389
  • [42] MI-MOTE: Multiple imputation-based minority oversampling technique for imbalanced and incomplete data classification
    Shin, Kyoham
    Han, Jongmin
    Kang, Seokho
    INFORMATION SCIENCES, 2021, 575 : 80 - 89
  • [43] A hierarchical heterogeneous ant colony optimization based oversampling algorithm using feature similarity for classification of imbalanced data
    Sreeja, N. K.
    Sreelaja, N. K.
    APPLIED SOFT COMPUTING, 2024, 166
  • [44] A dual encoder DAE neural network for imbalanced binary classification based on NSGA-III and GAN
    Jiantao Qu
    Feng Liu
    Yuxiang Ma
    Pattern Analysis and Applications, 2022, 25 : 17 - 34
  • [45] A dual encoder DAE neural network for imbalanced binary classification based on NSGA-III and GAN
    Qu, Jiantao
    Liu, Feng
    Ma, Yuxiang
    PATTERN ANALYSIS AND APPLICATIONS, 2022, 25 (01) : 17 - 34
  • [46] A Feature Selection Model for Binary Classification of Imbalanced Data Based on Preference for Target Instances
    Tan, Ding-Wen
    Liew, Soung-Yue
    Tan, Teik-Boon
    Yeoh, William
    2012 4TH CONFERENCE ON DATA MINING AND OPTIMIZATION (DMO), 2012, : 35 - 42
  • [47] Adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique algorithm for tackling binary imbalanced datasets in biomedical data classification
    Li, Jinyan
    Fong, Simon
    Sung, Yunsick
    Cho, Kyungeun
    Wong, Raymond
    Wong, Kelvin K. L.
    BIODATA MINING, 2016, 9 : 1 - 15
  • [48] Adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique algorithm for tackling binary imbalanced datasets in biomedical data classification
    Jinyan Li
    Simon Fong
    Yunsick Sung
    Kyungeun Cho
    Raymond Wong
    Kelvin K. L. Wong
    BioData Mining, 9
  • [49] M2GDL: Multi-manifold guided dictionary learning based oversampling and data validation for highly imbalanced classification problems
    Feizi, Tayyebe
    Moattar, Mohammad Hossein
    Tabatabaee, Hamid
    INFORMATION SCIENCES, 2024, 682
  • [50] EMRIL: Ensemble Method based on ReInforcement Learning for binary classification in imbalanced drifting data streams
    Usman, Muhammad
    Chen, Huanhuan
    NEUROCOMPUTING, 2024, 605