DOSS: Dual Over Sampling Strategy for Imbalanced Data Classification

Cited by: 0
Authors
Wang, Qiushi [1]
Lee, Kee Jin [1]
Hong, Jihoon [1]
Affiliations
[1] ASTAR, Mfg Execut & Control Grp, Singapore Inst Mfg Technol SIMTech, Singapore, Singapore
Keywords
imbalanced classification; oversampling; cGAN; SMOTE
DOI
Not available
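Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809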
Abstract
Imbalanced datasets are often encountered in process monitoring, where data reflecting abnormal events such as machine failures are much scarcer than data reflecting normal events. The former is called the minority class and the latter is referred to as the majority class. Classical machine learning algorithms still face challenges in solving this problem. To improve classification accuracy, oversampling techniques rebalance the dataset by supplying the minority class with synthetic samples. However, because the latent sample spaces of both classes are broad, the majority class may be under-represented as well. In this paper, we propose a dual oversampling strategy (DOSS) that generates samples for both classes. For the majority class, synthetic samples are generated according to the data distribution, which is approximated by a conditional Generative Adversarial Network (cGAN). For the minority class, the Synthetic Minority Over-sampling Technique (SMOTE) is applied as the oversampling method. The proposed strategy is compared with strategies that oversample only the minority class and with strategies that oversample both classes using different methods. Recall, G-mean and F-measure are used as the metrics. The experimental results on 12 benchmark datasets show the improved performance of the proposed strategy. DOSS is further applied to detect the faulty stages of an injection moulding machine, where its predictions achieve better accuracy.
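The minority-class step and the evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `smote_sketch` and `imbalance_metrics`, the parameter `k`, and the toy data are hypothetical. The sketch assumes SMOTE-style linear interpolation between a minority sample and one of its nearest minority neighbours, and it assumes the common definition of G-mean as the geometric mean of recall and specificity, which the abstract does not spell out. The cGAN-based majority-class generator is omitted because it requires a trained network.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """SMOTE-style oversampling sketch (assumes at least two minority samples).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def imbalance_metrics(y_true, y_pred, positive=1):
    """Recall, G-mean and F-measure, treating `positive` as the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Assumed definition: G-mean = sqrt(recall * specificity)
    g_mean = math.sqrt(recall * specificity)
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"recall": recall, "g_mean": g_mean, "f_measure": f_measure}


# Toy usage with made-up data: four minority points, five synthetic samples
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(minority, n_new=5)
scores = imbalance_metrics([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already spanned by the minority class. In practice a maintained library implementation of SMOTE would normally be preferred over a hand-written one.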
Pages: 5389-5394
Page count: 6