Integration of feature vector selection and support vector machine for classification of imbalanced data

Cited by: 28
Authors
Liu, Jie [1]
Zio, Enrico [2, 3, 4]
Affiliations
[1] Beihang Univ, Sch Reliabil & Syst Engn, 37 Xueyuan Rd, Beijing, Peoples R China
[2] Politecn Milan, Energy Dept, Milan, Italy
[3] PSL Univ Paris, MINES ParisTech, Ctr Rech Risques & Crises CRC, Paris, France
[4] Kyung Hee Univ, Dept Nucl Engn, Seoul, South Korea
Keywords
Classification; Feature Vector Selection; Imbalanced data; Support Vector Machine; Separability; CLASSIFIERS; RECOGNITION; MARGIN; MODEL; SVM;
DOI
10.1016/j.asoc.2018.11.045
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Support Vector Machine (SVM) has been widely developed for tackling classification problems. Imbalanced data arise in many practical classification problems, where the minority class is usually the one of interest. Undersampling is a popular remedy for such problems, but it risks discarding useful information in the original data. Tuning the hyperparameters of SVM is also challenging. By analyzing the geometrical meaning of kernel methods, this paper proposes an approach that combines a modified Feature Vector Selection (FVS) method with maximal between-class separability and an easy-to-tune version of SVM, i.e. Feature Vector Regression (FVR), proposed in our previous work. The modified FVS method selects a small number of data points that can linearly represent the whole dataset in the Reproducing Kernel Hilbert Space (RKHS) and that also give maximal separability of the imbalanced data in RKHS. The FVR model is solved analytically, as in least-squares SVM. The decision threshold for classification is optimized to maximize a predefined accuracy metric. Twenty-six imbalanced datasets are considered, and comparisons are carried out with several SVM-based methods for imbalanced data. Statistical tests show the effectiveness of the proposed method. (C) 2018 Elsevier B.V. All rights reserved.
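Two ideas in the abstract, an analytic least-squares solve in kernel space and a decision threshold tuned to a chosen metric, can be illustrated with a minimal sketch. This is not the authors' FVS/FVR method: the solve below is a generic regularized kernel least-squares fit used as a stand-in, the G-mean is only an assumed example of a "predefined accuracy metric", and every function name and parameter (`rbf_kernel`, `fit_kernel_least_squares`, `g_mean`, `best_threshold`, `gamma`, `reg`) is hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_least_squares(X, y, gamma=1.0, reg=1e-2):
    """Closed-form fit: solve (K + reg * I) alpha = y.

    Assumption: a plain regularized kernel least-squares model standing in
    for an analytically solved model; it is NOT the paper's FVR.
    """
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + reg * np.eye(len(X)), y)

def g_mean(y_true, y_pred):
    """Geometric mean of the true-positive and true-negative rates.

    Labels are assumed to be +1 (minority class) and -1 (majority class).
    """
    pos = y_true == 1
    tpr = np.mean(y_pred[pos] == 1)
    tnr = np.mean(y_pred[~pos] == -1)
    return np.sqrt(tpr * tnr)

def best_threshold(scores, y_true, metric=g_mean):
    """Sweep candidate thresholds and return (threshold, metric value).

    Candidates are the midpoints between consecutive distinct scores, plus
    one value below the minimum and one above the maximum score.
    """
    s = np.unique(scores)  # sorted distinct scores
    candidates = np.concatenate(
        ([s[0] - 1.0], (s[:-1] + s[1:]) / 2.0, [s[-1] + 1.0])
    )
    values = [metric(y_true, np.where(scores > t, 1, -1)) for t in candidates]
    i = int(np.argmax(values))
    return candidates[i], values[i]
```

In this sketch the real-valued model output for new points would be `rbf_kernel(X_new, X_train, gamma) @ alpha`, and `best_threshold` would be applied to such outputs on held-out labeled data. With imbalanced classes the outputs tend to be pulled toward the majority label, which is why a tuned threshold can do better than the default cut at zero.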
Pages: 702 - 711
Number of pages: 10