A Novel Imbalanced Data Classification Method Based on Weakly Supervised Learning for Fault Diagnosis

Cited by: 36
Authors
Liu, Hui [1 ]
Liu, Zhenyu [1 ]
Jia, Weiqiang [1 ,2 ]
Zhang, Donghao [1 ]
Tan, Jianrong [1 ]
Affiliations
[1] Zhejiang Univ, State Key Lab Comp Aided Design & Comp Graph, Hangzhou 310027, Peoples R China
[2] Zhejiang Lab, Hangzhou 311121, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Fault diagnosis; Supervised learning; Support vector machines; Classification algorithms; Informatics; Prognostics and health management; Prediction algorithms; Bidirectional gated recurrent units (BGRU); class imbalance; support vector machine (SVM); weakly supervised learning; SMOTE;
DOI
10.1109/TII.2021.3084132
Chinese Library Classification
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
The class imbalance problem strongly degrades the performance of diagnostic models. When it occurs, classification models tend to ignore the minority samples. In addition, the distribution of class-imbalanced data differs from the actual data distribution, which makes it difficult for classifiers to learn an accurate decision boundary. To tackle these issues, this article proposes a novel imbalanced data classification method based on weakly supervised learning. First, the Bagging algorithm is employed to randomly sample the majority data and generate several relatively balanced subsets, which are then used to train several support vector machine (SVM) classifiers. Next, the trained SVM classifiers predict the labels of the unlabeled data, and samples predicted as the minority class are added to the original dataset to reduce the imbalance ratio. The critical idea of this article is to introduce real-world samples into the imbalanced dataset by virtue of weakly supervised learning. In addition, bidirectional gated recurrent units are used to construct a diagnostic model for fault diagnosis, and a new weighted cross-entropy function is proposed as the loss function; it reduces the impact of noise and increases the model's attention to the original minority samples. Experimental evaluations of the proposed method are conducted on two datasets, the Prognostics and Health Management challenge 2008 and 2010 datasets, and the results demonstrate the effectiveness and superiority of the proposed method.
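The sampling and pseudo-labeling steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes NumPy and scikit-learn, a binary problem, an RBF kernel, and a majority vote across the SVMs to decide which unlabeled samples count as minority (the abstract does not say how the individual predictions are combined). The function name `augment_with_pseudo_minority`, its parameters, and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.svm import SVC


def augment_with_pseudo_minority(X, y, X_unlabeled, n_subsets=5,
                                 vote_threshold=0.5, minority_label=1, seed=0):
    """Add unlabeled samples that an SVM ensemble predicts as minority.

    Hypothetical helper: the voting rule and all defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    minority_idx = np.flatnonzero(y == minority_label)
    majority_idx = np.flatnonzero(y != minority_label)
    votes = np.zeros(len(X_unlabeled))

    for _ in range(n_subsets):
        # Bagging-style draw: sample majority points with replacement so the
        # subset holds as many majority as minority samples.
        sampled = rng.choice(majority_idx, size=len(minority_idx), replace=True)
        subset = np.concatenate([minority_idx, sampled])
        clf = SVC(kernel="rbf", gamma="scale").fit(X[subset], y[subset])
        votes += clf.predict(X_unlabeled) == minority_label

    # Assumption: keep a sample only if more than `vote_threshold` of the
    # classifiers predict the minority class.
    accepted = votes / n_subsets > vote_threshold
    n_new = int(accepted.sum())

    X_aug = np.vstack([X, X_unlabeled[accepted]])
    y_aug = np.concatenate([y, np.full(n_new, minority_label, dtype=y.dtype)])
    # Mask marking which rows carry predicted (possibly noisy) labels.
    is_pseudo = np.concatenate([np.zeros(len(y), dtype=bool),
                                np.ones(n_new, dtype=bool)])
    return X_aug, y_aug, is_pseudo


# Synthetic two-cluster data, for illustration only.
data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.normal(0.0, 1.0, size=(200, 2)),   # majority class 0
               data_rng.normal(4.0, 1.0, size=(10, 2))])   # minority class 1
y = np.array([0] * 200 + [1] * 10)
X_unlabeled = np.vstack([data_rng.normal(0.0, 1.0, size=(50, 2)),
                         data_rng.normal(4.0, 1.0, size=(50, 2))])

X_aug, y_aug, is_pseudo = augment_with_pseudo_minority(X, y, X_unlabeled)
print("minority samples before:", int((y == 1).sum()))
print("minority samples after: ", int((y_aug == 1).sum()))
```

The `is_pseudo` mask is returned because the abstract says the loss function should limit the effect of noisy labels while emphasizing the original minority samples; a per-sample weight derived from such a mask is one plausible way to do that, although the exact weighting used in the paper is not given here.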
Pages: 1583-1593
Page count: 11
Related papers
50 in total
  • [41] A Novel Multimode Fault Classification Method Based on Deep Learning
    Zhou F.
    Gao Y.
    Wen C.
    Gao, Yulin (gaoyulinhn@163.com), 1600, Hindawi Limited, 410 Park Avenue, 15th Floor, 287 pmb, New York, NY 10022, United States (2017):
  • [42] Efficient multi-kernel multi-instance learning using weakly supervised and imbalanced data for diabetic retinopathy diagnosis
    Cao, Peng
    Ren, Fulong
    Wan, Chao
    Yang, Jinzhu
    Zaiane, Osmar
    COMPUTERIZED MEDICAL IMAGING AND GRAPHICS, 2018, 69 : 112 - 124
  • [43] Gaussian Mixture Based Semi Supervised Boosting For Imbalanced Data Classification
    Paul, Mahit Kumar
    Pal, Biprodip
    2016 2ND INTERNATIONAL CONFERENCE ON ELECTRICAL, COMPUTER & TELECOMMUNICATION ENGINEERING (ICECTE), 2016,
  • [44] GAN-Based Semi-supervised For Imbalanced Data Classification
    Zhou, Tingting
    Liu, Wei
    Zhou, Congyu
    Chen, Leiting
    2018 4TH INTERNATIONAL CONFERENCE ON INFORMATION MANAGEMENT (ICIM2018), 2018, : 17 - 21
  • [45] Semi-supervised Classification Based Mixed Sampling for Imbalanced Data
    Zhao, Jianhua
    Liu, Ning
    OPEN PHYSICS, 2019, 17 (01): : 975 - 983
  • [46] ReF-DDPM: A novel DDPM-based data augmentation method for imbalanced rolling bearing fault diagnosis
    Yu, Tian
    Li, Chaoshun
    Huang, Jie
    Xiao, Xiangqu
    Zhang, Xiaoyuan
    Li, Yuhong
    Fu, Bitao
    RELIABILITY ENGINEERING & SYSTEM SAFETY, 2024, 251
  • [47] Comparative Study on Defect Prediction Algorithms of Supervised Learning Software Based on Imbalanced Classification Data Sets
    Ge, Jianxin
    Liu, Jiaomin
    Liu, Wenyuan
    2018 19TH IEEE/ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING (SNPD), 2018, : 399 - 406
  • [48] Supervised kernel-based multi-modal Bhattacharya distance learning for imbalanced data classification
    Mojahed, Atena Jalali
    Moattar, Mohammad Hossein
    Ghaffari, Hamidreza
    KNOWLEDGE AND INFORMATION SYSTEMS, 2024, : 247 - 272
  • [49] Weakly supervised text classification framework for noisy-labeled imbalanced
    Zhang, Wenxin
    Zhou, Yaya
    Liu, Shuhui
    Zhang, Yupei
    Shang, Xuequn
    NEUROCOMPUTING, 2024, 610
  • [50] New supervised class imbalance method for highly imbalanced video data classification
    Apandi, Ziti Fariha Mohd
    Mustapha, Norwati
    Affendey, Lilly Suriani
    International Review on Computers and Software, 2012, 7 (01) : 113 - 121