Improving SVM Classification on Imbalanced Datasets by Introducing a New Bias

Cited by: 28
Authors
Nunez, Haydemar [1]
Gonzalez-Abril, Luis [2]
Angulo, Cecilio [3]
Affiliations
[1] Univ Cent Venezuela, Fac Ciencias, Escuela Comp, Paseo Ilustres Caracas 1040, Venezuela
[2] Univ Seville, Seville, Spain
[3] Tech Univ Catalonia, Barcelona, Spain
Keywords
Support Vector Machine; Post-processing; Bias; Cost-sensitive strategy; SMOTE
DOI
10.1007/s00357-017-9242-x
Chinese Library Classification (CLC) number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
Like most learning machines, Support Vector Machines (SVMs) trained on imbalanced datasets can perform poorly on the minority class, because SVMs were designed to induce a model based on the overall error. To improve their performance on this kind of problem, a low-cost post-processing strategy is proposed that calculates a new bias to adjust the function learned by the SVM. The proposed bias takes into account the proportional size of the classes in order to improve performance on the minority class. This solution avoids both introducing and tuning new parameters and modifying the standard optimization problem for SVM training. Experimental results on 34 datasets with different degrees of imbalance, evaluated with standardized error measures based on sensitivity and g-means, show that the proposed method improves classification on imbalanced datasets. Furthermore, its performance is comparable to that of well-known cost-sensitive and Synthetic Minority Over-sampling Technique (SMOTE) schemes, without adding complexity or computational cost.
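The abstract does not state the formula for the new bias, so the following is only a minimal sketch of the general idea of shifting the bias of an already trained classifier and measuring the effect with sensitivity and the g-mean. The shift rule used here (moving the decision threshold to the class-proportion-weighted mean of the two classes' average scores) is a hypothetical stand-in chosen for illustration, not the rule proposed in the paper. The function names and the toy scores are likewise assumptions.

```python
from math import sqrt

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity); label 1 is the minority class, -1 the majority."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == -1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == 1)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec

def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity."""
    sens, spec = sensitivity_specificity(y_true, y_pred)
    return sqrt(sens * spec)

def predict(scores, bias_shift=0.0):
    """Classify by the sign of the decision value after adding a bias shift."""
    return [1 if s + bias_shift > 0 else -1 for s in scores]

def illustrative_bias_shift(scores, y_true):
    """HYPOTHETICAL rule, not the paper's: put the threshold at the
    class-proportion-weighted mean of the per-class average scores."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == -1]
    n = len(scores)
    threshold = (len(pos) / n) * (sum(pos) / len(pos)) \
              + (len(neg) / n) * (sum(neg) / len(neg))
    return -threshold  # adding this to every score moves the threshold to 0

# Toy decision values from some trained SVM (made-up numbers): 3 minority, 9 majority.
y_true = [1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1]
scores = [0.4, -0.2, -0.5, -0.9, -1.1, -1.3, -1.0, -1.2, -0.8, -1.4, -1.6, -0.7]

before = predict(scores)
shift = illustrative_bias_shift(scores, y_true)
after = predict(scores, bias_shift=shift)

sens_before, _ = sensitivity_specificity(y_true, before)
sens_after, _ = sensitivity_specificity(y_true, after)
gm_before = g_mean(y_true, before)
gm_after = g_mean(y_true, after)

print(f"sensitivity: {sens_before:.3f} -> {sens_after:.3f}")
print(f"g-mean:      {gm_before:.3f} -> {gm_after:.3f}")
```

On this toy data the shifted bias trades a few majority-class errors for higher minority-class sensitivity, which is the kind of trade-off the g-mean is meant to capture. Because the shift is computed after training, neither the optimization problem nor its hyperparameters change.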
Pages: 427-443
Number of pages: 17