A Novel Method for Highly Imbalanced Classification with Weighted Support Vector Machine

Cited by: 2
Authors
Qi, Biao [1 ,2 ]
Jiang, Jianguo [1 ,2 ]
Shi, Zhixin [1 ,2 ]
Li, Meimei [1 ,2 ]
Fan, Wei [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Highly imbalanced classification; Undersampling; GWSVM-RU; Information granules; Weighted SVMs;
DOI
10.1007/978-3-030-29551-6_24
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In real life, the problem of imbalanced data classification is unavoidable and difficult to solve. Traditional SVM-based classification algorithms usually cannot classify highly imbalanced data accurately, and sampling strategies are widely used to address the problem. In this paper, we propose a novel undersampling method, granular weighted SVMs-repetitive undersampling (GWSVM-RU), for highly imbalanced classification. It is a weighted-SVM version of the granular SVMs-repetitive undersampling (GSVM-RU) method proposed by Yuchun Tang et al. We perform the undersampling by repeatedly extracting negative information granules, which are obtained with the naive SVM algorithm, and then combining the negative and positive granules to compose new training data sets. In this way we rebalance the original imbalanced data sets and then build new models with weighted SVMs to predict the testing data set. We also examine four other rebalancing heuristics: cost-sensitive learning, undersampling, oversampling and GSVM-RU. Compared with them, our approach achieves higher classification performance as measured by the evaluation metrics G-Mean, F-Measure and AUC-ROC. Theoretical analysis and experiments indicate that our approach outperforms the other methods.
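The three evaluation metrics named in the abstract can be illustrated with a small sketch. This is not code from the paper; it uses the standard textbook definitions (G-Mean as the geometric mean of sensitivity and specificity, F-Measure as the harmonic mean of precision and recall, AUC-ROC as the probability that a random positive is scored above a random negative). The function names and the toy labels are hypothetical.

```python
import math

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn), treating `positive` as the minority class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def g_mean(tp, fp, tn, fn):
    """Geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

def f_measure(tp, fp, tn, fn):
    """Harmonic mean of precision and recall (F1) for the positive class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def auc_roc(y_true, scores, positive=1):
    """AUC via pairwise comparison: ties between a positive and a negative count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): 2 positives, 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10
accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
majority_g_mean = g_mean(*confusion_counts(y_true, always_negative))
```

On the toy data, a classifier that always predicts the majority class reaches an accuracy of 0.8 but a G-Mean of 0, which is why metrics of this kind are preferred over plain accuracy for highly imbalanced problems.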
Pages: 275-286
Number of pages: 12
Related papers
50 in total
  • [1] Weighted L-1-Norm Support Vector Machine for the Classification of Highly Imbalanced Data
    Kim, Eunkyung
    Jhun, Myoungshic
    Bang, Sungwan
    KOREAN JOURNAL OF APPLIED STATISTICS, 2015, 28 (01) : 9 - 21
  • [2] Clustering and Weighted Scoring in Geometric Space Support Vector Machine Ensemble for Highly Imbalanced Data Classification
    Ksieniewicz, Pawel
    Burduk, Robert
    COMPUTATIONAL SCIENCE - ICCS 2020, PT IV, 2020, 12140 : 128 - 140
  • [3] An efficient weighted Lagrangian twin support vector machine for imbalanced data classification
    Shao, Yuan-Hai
    Chen, Wei-Jie
    Zhang, Jing-Jing
    Wang, Zhen
    Deng, Nai-Yang
    PATTERN RECOGNITION, 2014, 47 (09) : 3158 - 3167
  • [4] Weighted support vector machine for extremely imbalanced data
    Mun, Jongmin
    Bang, Sungwan
    Kim, Jaeoh
    Computational Statistics and Data Analysis, 2025, 203
  • [5] Classification of Imbalanced Datasets using Partition Method and Support Vector Machine
    Awasare, Vinod Kumar
    Gupta, Surendra
    PROCEEDINGS OF THE 2017 IEEE SECOND INTERNATIONAL CONFERENCE ON ELECTRICAL, COMPUTER AND COMMUNICATION TECHNOLOGIES (ICECCT), 2017,
  • [6] Weighted support vector machine for classification
    Du, SX
    Chen, ST
    INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS, VOL 1-4, PROCEEDINGS, 2005, : 3866 - 3871
  • [7] A novel twin-support vector machine for binary classification to imbalanced data
    Li, Jingyi
    Chao, Shiwei
    DATA TECHNOLOGIES AND APPLICATIONS, 2023, 57 (03) : 385 - 396
  • [8] Imbalanced classification using support vector machine ensemble
    Jiang Tian
    Hong Gu
    Wenqi Liu
    Neural Computing and Applications, 2011, 20 : 203 - 209
  • [9] Imbalanced classification using support vector machine ensemble
    Tian, Jiang
    Gu, Hong
    Liu, Wenqi
    NEURAL COMPUTING & APPLICATIONS, 2011, 20 (02): : 203 - 209
  • [10] Weighted support vector machine for data classification
    Yang, XL
    Song, Q
    Cao, AZ
    PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), VOLS 1-5, 2005, : 859 - 864