Improving SVM Classification on Imbalanced Datasets by Introducing a New Bias

Cited by: 28

Authors:

Nunez, Haydemar [1]

Gonzalez-Abril, Luis [2]

Angulo, Cecilio [3]

Affiliations:

[1] Univ Cent Venezuela, Fac Ciencias, Escuela Comp, Paseo Ilustres Caracas 1040, Venezuela

[2] Univ Seville, Seville, Spain

[3] Tech Univ Catalonia, Barcelona, Spain

Source:

JOURNAL OF CLASSIFICATION | 2017 / Vol. 34 / No. 03

Keywords:

Support Vector Machine; Post-processing; Bias; Cost-sensitive strategy; SMOTE

DOI:

10.1007/s00357-017-9242-x

Chinese Library Classification (CLC) number:

O1 [Mathematics];

Subject classification code:

0701 ; 070101 ;

Abstract:

Support Vector Machines (SVMs), like most learning machines, can perform poorly on the minority class when trained on imbalanced datasets, because they are designed to induce a model based on the overall error. To improve their performance on this kind of problem, a low-cost post-processing strategy is proposed that calculates a new bias to adjust the function learned by the SVM. The proposed bias takes into account the proportional size of the classes in order to improve performance on the minority class. This solution avoids both introducing and tuning new parameters and modifying the standard optimization problem for SVM training. Experimental results on 34 datasets with different degrees of imbalance, evaluated with standardized error measures based on sensitivity and g-means, show that the proposed method improves classification on imbalanced datasets. Furthermore, its performance is comparable to well-known cost-sensitive and Synthetic Minority Over-sampling Technique (SMOTE) schemes, without adding complexity or computational cost.
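As a rough illustration of the idea in the abstract, the following minimal sketch shows how shifting the bias of an already-trained decision function can change sensitivity and the g-mean on an imbalanced sample. It is not the authors' method: the abstract does not give the formula for the new bias, so the shift value, the toy scores, and the helper names (`sensitivity_specificity`, `g_mean`, `predict_with_bias`) are all assumptions made for this example. Only the standard definitions of sensitivity, specificity, and g-mean are used.

```python
from math import sqrt


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity), treating `positive` as the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity."""
    sens, spec = sensitivity_specificity(y_true, y_pred, positive)
    return sqrt(sens * spec)


def predict_with_bias(scores, bias_shift=0.0):
    """Label +1 when score + bias_shift >= 0, otherwise -1.

    `scores` stands in for the SVM decision values f(x); `bias_shift`
    is a hypothetical post-processing correction added to the bias.
    """
    return [1 if s + bias_shift >= 0 else -1 for s in scores]


# Hypothetical toy data: 2 minority (+1) and 6 majority (-1) samples.
y_true = [1, 1, -1, -1, -1, -1, -1, -1]
scores = [0.4, -0.3, -0.9, -1.2, -0.8, -1.5, -0.6, -1.1]

baseline = predict_with_bias(scores)                 # original bias
shifted = predict_with_bias(scores, bias_shift=0.5)  # hand-picked shift, not from the paper

print("g-mean, original bias:", g_mean(y_true, baseline))
print("g-mean, shifted bias: ", g_mean(y_true, shifted))
```

In this toy sample the original bias misclassifies one of the two minority points, while the shifted bias recovers it without misclassifying any majority point. The paper derives its shift from the relative class sizes rather than choosing it by hand.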

Pages: 427 - 443

Page count: 17

