A Wasserstein Distance-Based Cost-Sensitive Framework for Imbalanced Data Classification

Authors
Feng, Rui [1 ]
Ji, Hongbing [1 ]
Zhu, Zhigang [1 ]
Wang, Lei [1 ]
Affiliations
[1] Xidian Univ, Sch Elect Engn, Xian, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Imbalanced classification; cost-sensitive; structural information; Wasserstein distance; radar emitter signal; SUPPORT VECTOR MACHINE; DECISION TREE; SYSTEMS;
DOI
10.13164/re.2023.0451
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification codes
0808 ; 0809 ;
Abstract
Class imbalance is a prevalent problem in many real-world applications, and an imbalanced data distribution can dramatically skew classifier performance. In general, the higher the imbalance ratio of a dataset, the more difficult it is to classify. However, standard classifiers can still achieve good results on some highly imbalanced datasets. This suggests that class imbalance is only a superficial characteristic of the data, and that the underlying structural information is often the key factor affecting classification performance. As implicit prior knowledge, structural information has been shown to be crucial for designing a good classifier. This paper proposes a Wasserstein-based cost-sensitive support vector machine (CS-WSVM) for class imbalance learning, incorporating prior structural information and a cost-sensitive strategy. The Wasserstein distance is introduced to model the distributions of the majority and minority samples and thereby capture the structural information, which is then used to weight the majority and minority samples. Comprehensive experiments on synthetic and real-world datasets, especially a radar emitter signal dataset, demonstrate that CS-WSVM achieves outstanding performance in imbalanced scenarios.
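The abstract names two ingredients, a Wasserstein distance between the class distributions and per-class costs, without giving the CS-WSVM formulation. The sketch below is therefore not the authors' method. It only illustrates the two ingredients in their simplest form, under these assumptions: one-dimensional features, empirical distributions, and a generic inverse-frequency cost heuristic. The function names wasserstein_1d and class_costs are hypothetical.

```python
from bisect import bisect_right


def wasserstein_1d(u, v):
    """Wasserstein-1 distance between two 1-D empirical distributions.

    Uses the identity W1 = integral of |F_u(x) - F_v(x)| dx, where F_u and
    F_v are the empirical CDFs of the two samples.
    """
    u_sorted, v_sorted = sorted(u), sorted(v)
    grid = sorted(set(u_sorted) | set(v_sorted))
    total = 0.0
    # Both CDFs are constant between consecutive grid points.
    for left, right in zip(grid, grid[1:]):
        cdf_u = bisect_right(u_sorted, left) / len(u_sorted)
        cdf_v = bisect_right(v_sorted, left) / len(v_sorted)
        total += abs(cdf_u - cdf_v) * (right - left)
    return total


def class_costs(labels):
    """Inverse-frequency misclassification cost per class.

    Assumption: a generic heuristic, not the paper's Wasserstein-based
    weighting. Rarer classes receive proportionally larger costs.
    """
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n = len(labels)
    return {y: n / (len(counts) * c) for y, c in counts.items()}


if __name__ == "__main__":
    majority = [0.0, 1.0, 3.0]   # toy majority-class feature values
    minority = [5.0, 6.0, 8.0]   # toy minority-class feature values
    print(wasserstein_1d(majority, minority))
    print(class_costs([0] * 9 + [1]))
```

In a cost-sensitive SVM, per-class costs of this kind typically scale the penalty on each class's slack variables. How CS-WSVM combines them with the Wasserstein-derived structural information is specified in the paper itself, not here.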
Pages: 451-466
Page count: 16