Margin calibration in SVM class-imbalanced learning

Cited by: 52
Authors
Yang, Chan-Yun [1]
Yang, Jr-Syu [2]
Wang, Jian-Jun [3]
Affiliations
[1] Technol & Sci Inst No Taiwan, Dept Mech Engn, Taipei 11202, Taiwan
[2] Tamkang Univ, Dept Mech & Electromech Engn, Tamsui 25137, Taipei County, Taiwan
[3] Southwest Univ, Sch Math & Stat, Chongqing 400715, Peoples R China
Keywords
Margin; Cost-sensitive learning; Class-imbalanced learning; Support vector machines; Classification; Kernel; Consistency
DOI
10.1016/j.neucom.2009.08.006
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Learning from imbalanced datasets is an important practical issue in machine learning, including for support vector machines (SVMs). This study first examines a well-known reference model for the problem, proposed by Veropoulos et al. From the perspective of the loss function, this cost-sensitive reference prototype is identified as a penalty-regularized model. Intuitively, the loss function can change not only the penalty but also the margin to correct the biased decision boundary. This study focuses mainly on the effect of the margin and then extends the model to a more general modification. As in the prototype, the modification first adopts an inversely proportional regularized penalty to re-weight the imbalanced classes. In addition to the penalty regularization, the modification then employs a margin compensation that makes the margin lopsided, which allows the decision boundary to drift. Two regularization factors, the penalty and the margin, are hence suggested for achieving an unbiased classification. The margin compensation, in combination with the penalty regularization, is used to calibrate and refine the biased decision boundary and so further reduce the bias. With the area under the receiver operating characteristic curve (AuROC) as the performance measure, the modification shows relatively higher scores than the reference model, even though the optimal performance is achieved by the reference model. Some useful characteristics found empirically are also reported, which may be convenient for future applications. The theoretical descriptions and experimental validations show the proposed model's potential to compete for highly unbiased accuracy on a complex imbalanced dataset. (C) 2009 Elsevier B.V. All rights reserved.
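The abstract does not give the exact formulation, so the following is only an illustrative sketch of the two regularization factors it names, not the authors' model. It assumes a linear classifier trained by subgradient descent on a hinge loss with a class-dependent penalty C_y (inversely proportional to class size) and a class-dependent margin m_y (the lopsided margin). The function names, the penalty formula n / (2 * n_class), the margin values, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def inverse_frequency_penalties(y):
    """Per-class penalties inversely proportional to class size (assumed form)."""
    n = len(y)
    return {label: n / (2.0 * np.sum(y == label)) for label in (-1, 1)}


def objective(w, b, X, y, penalty, margin, lam):
    """lam/2 * ||w||^2 + mean_i C_{y_i} * max(0, m_{y_i} - y_i * (w . x_i + b))."""
    c = np.where(y == 1, penalty[1], penalty[-1])
    m = np.where(y == 1, margin[1], margin[-1])
    hinge = np.maximum(0.0, m - y * (X @ w + b))
    return 0.5 * lam * (w @ w) + np.mean(c * hinge)


def fit(X, y, penalty, margin, lam=0.01, lr=0.1, epochs=500):
    """Subgradient descent on the objective above (linear kernel only)."""
    c = np.where(y == 1, penalty[1], penalty[-1])
    m = np.where(y == 1, margin[1], margin[-1])
    w, b = np.zeros(X.shape[1]), 0.0
    for t in range(epochs):
        violated = y * (X @ w + b) < m       # points inside their class margin
        coef = c * violated * y              # penalty-weighted subgradient terms
        step = lr / np.sqrt(t + 1.0)         # decaying step size
        w -= step * (lam * w - coef @ X / len(y))
        b -= step * (-np.mean(coef))
    return w, b


def auroc(scores, y):
    """Area under the ROC curve by pairwise comparison (ties count one half)."""
    diff = scores[y == 1][:, None] - scores[y == -1][None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


# Synthetic 9:1 imbalanced data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.5, 1.0, (90, 2)), rng.normal(1.5, 1.0, (10, 2))])
y = np.array([-1] * 90 + [1] * 10)

penalty = inverse_frequency_penalties(y)     # minority class gets the larger penalty
margin = {1: 1.5, -1: 1.0}                   # lopsided margin; values are illustrative
w, b = fit(X, y, penalty, margin)
print("AUROC:", auroc(X @ w + b, y))
```

Setting both margins to 1.0 reduces the sketch to a penalty-only, cost-sensitive hinge loss, which is the role the abstract assigns to the reference prototype. Comparing the two settings on the same data is one way to see how a lopsided margin moves the decision boundary.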
Pages: 397-411
Page count: 15
Related papers
50 in total
  • [21] A Hybrid Framework for Class-Imbalanced Classification
    Chen, Rui
    Luo, Lailong
    Chen, Yingwen
    Xia, Junxu
    Guo, Deke
    WIRELESS ALGORITHMS, SYSTEMS, AND APPLICATIONS, WASA 2021, PT I, 2021, 12937 : 301 - 313
  • [22] Research On Classification Method Of High-Dimensional Class-Imbalanced Data Sets Based On SVM
    Zhang, Chunkai
    Guo, Jianwei
    Lu, Junru
    2017 IEEE SECOND INTERNATIONAL CONFERENCE ON DATA SCIENCE IN CYBERSPACE (DSC), 2017, : 60 - 67
  • [23] Weed recognition using deep learning techniques on class-imbalanced imagery
    Hasan, A. S. M. Mahmudul
    Sohel, Ferdous
    Diepeveen, Dean
    Laga, Hamid
    Jones, Michael G. K.
    CROP & PASTURE SCIENCE, 2023, 74 (06): : 628 - 644
  • [24] Multitask Semi-Supervised Learning for Class-Imbalanced Discourse Classification
    Spangher, Alexander
    May, Jonathan
    Shiang, Sz-rung
    Deng, Lingjia
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 498 - 517
  • [25] A post-processing framework for class-imbalanced learning in a transductive setting
    Jiang, Zhen
    Lu, Yu
    Zhao, Lingyun
    Zhan, Yongzhao
    Mao, Qirong
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249
  • [26] Class-imbalanced complementary-label learning via weighted loss
    Wei, Meng
    Zhou, Yong
    Li, Zhongnian
    Xu, Xinzheng
    NEURAL NETWORKS, 2023, 166 : 555 - 565
  • [27] Large-Scale Learning with Structural Kernels for Class-Imbalanced Datasets
    Severyn, Aliaksei
    Moschitti, Alessandro
    ETERNAL SYSTEMS, 2012, 255 : 34 - 41
  • [28] Dealing with high-dimensional class-imbalanced datasets: Embedded feature selection for SVM classification
    Maldonado, Sebastian
    Lopez, Julio
    APPLIED SOFT COMPUTING, 2018, 67 : 94 - 105
  • [29] Tuning model parameters in class-imbalanced learning with precision-recall curve
    Fu, Guang-Hui
    Yi, Lun-Zhao
    Pan, Jianxin
    BIOMETRICAL JOURNAL, 2019, 61 (03) : 652 - 664
  • [30] Assembly Quality Detection Based on Class-Imbalanced Semi-Supervised Learning
    Lu, Zichen
    Jiang, Jiabin
    Cao, Pin
    Yang, Yongying
    APPLIED SCIENCES-BASEL, 2021, 11 (21):