Cost-sensitive Feature Selection for Support Vector Machines

Cited by: 33
Authors
Benitez-Pena, S. [1, 2]
Blanquero, R. [1, 2]
Carrizosa, E. [1, 2]
Ramirez-Cobo, P. [1, 3]
Affiliations
[1] Univ Seville, IMUS, E-41012 Seville, Spain
[2] Univ Seville, Dept Estadist & Invest Operat, E-41012 Seville, Spain
[3] Univ Cadiz, Dept Estadist & Invest Operat, Cadiz 11510, Spain
Keywords
Classification; Data Science; Support Vector Machines; Feature Selection; Integer Programming; Sparsity; OPERATIONS-RESEARCH; CLASSIFICATION; SYNERGIES;
DOI
10.1016/j.cor.2018.03.005
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Feature Selection is a crucial procedure in Data Science tasks such as Classification, since it identifies the relevant variables, thus making the classification procedures more interpretable, cheaper in terms of measurement, and more effective by reducing noise and overfitting. The relevance of features in a classification procedure is linked to the fact that misclassification costs are frequently asymmetric, since false positive and false negative cases may have very different consequences. However, off-the-shelf Feature Selection procedures seldom take such cost-sensitivity of errors into account. In this paper we propose a mathematical-optimization-based Feature Selection procedure embedded in one of the most popular classification procedures, namely Support Vector Machines, accommodating asymmetric misclassification costs. The key idea is to replace the traditional margin maximization by minimizing the number of features selected, while imposing upper bounds on the false positive and false negative rates. The problem is written as an integer linear problem plus a quadratic convex problem for Support Vector Machines with both linear and radial kernels. The reported numerical experience demonstrates the usefulness of the proposed Feature Selection procedure. Indeed, our results on benchmark data sets show that a substantial decrease in the number of features is obtained, whilst the desired trade-off between false positive and false negative rates is achieved. (C) 2018 Elsevier Ltd. All rights reserved.
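The key idea stated in the abstract (minimize the number of selected features subject to upper bounds on the false positive and false negative rates) can be sketched for the linear-kernel case as a mixed-integer linear program. This is only an illustrative sketch, not the formulation from the paper: the symbols $z_k$, $\zeta_i$, $M$, $\alpha$, $\beta$, $I_+$ and $I_-$ are assumed notation introduced here, and the big-M constant $M$ is assumed to be large enough.

```latex
% Illustrative sketch only; all notation is assumed, not taken from the paper.
% Data: (x_i, y_i), i \in I, with y_i \in \{-1,+1\};
%       I_+ = \{i : y_i = +1\},  I_- = \{i : y_i = -1\}.
% z_k = 1     if feature k is selected.
% \zeta_i = 1 if record i is required to be correctly classified.
% \alpha, \beta: user-chosen upper bounds on the false positive / false negative rates.
\[
\begin{aligned}
\min_{w,\,b,\,z,\,\zeta}\quad & \sum_{k=1}^{d} z_k \\
\text{s.t.}\quad
 & y_i \left( w^{\top} x_i + b \right) \ge 1 - M\,(1 - \zeta_i), && i \in I, \\
 & -M z_k \le w_k \le M z_k, && k = 1,\dots,d, \\
 & \sum_{i \in I_-} (1 - \zeta_i) \le \alpha\,|I_-|, \\
 & \sum_{i \in I_+} (1 - \zeta_i) \le \beta\,|I_+|, \\
 & z_k \in \{0,1\},\quad \zeta_i \in \{0,1\}.
\end{aligned}
\]
```

In this sketch, every misclassified record must have $\zeta_i = 0$, so the two counting constraints bound the number of false positives and false negatives from above. A standard SVM (the convex quadratic problem mentioned in the abstract) would then presumably be trained on the features with $z_k = 1$.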
Pages: 169 - 178
Number of pages: 10
Related papers
50 in total
  • [1] Cost-sensitive support vector machines
    Iranmehr, Arya
    Masnadi-Shirazi, Hamed
    Vasconcelos, Nuno
    [J]. NEUROCOMPUTING, 2019, 343 : 50 - 64
  • [2] Cost-sensitive probabilistic predictions for support vector machines
    Benitez-Pena, Sandra
    Blanquero, Rafael
    Carrizosa, Emilio
    Ramirez-Cobo, Pepa
    [J]. EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2024, 314 (01) : 268 - 279
  • [3] Face detection based on cost-sensitive support vector machines
    Ma, Y
    Ding, XQ
    [J]. PATTERN RECOGNITION WITH SUPPORT VECTOR MACHINES, PROCEEDINGS, 2002, 2388 : 260 - 267
  • [4] An evaluation of discrete support vector machines for cost-sensitive learning
    Lessmann, Stefan
    Crone, Sven F.
    Stahlbock, Robert
    Zacher, Nikolaus
    [J]. 2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 347 - +
  • [5] Cost-Sensitive Feature Selection on Heterogeneous Data
    Qian, Wenbin
    Shu, Wenhao
    Yang, Jun
    Wang, Yinglong
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PART II, 2015, 9078 : 397 - 408
  • [6] Feature selection for support vector machines
    Hermes, L
    Buhmann, JM
    [J]. 15TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 2, PROCEEDINGS: PATTERN RECOGNITION AND NEURAL NETWORKS, 2000, : 712 - 715
  • [7] Seizure prediction with spectral power of EEG using cost-sensitive support vector machines
    Park, Yun
    Luo, Lan
    Parhi, Keshab K.
    Netoff, Theoden
    [J]. EPILEPSIA, 2011, 52 (10) : 1761 - 1770
  • [8] Cost sensitive support vector machines
    Zheng, En-Hui
    Li, Ping
    Song, Zhi-Huan
    [J]. Kongzhi yu Juece/Control and Decision, 2006, 21 (04) : 473 - 476
  • [9] Cost-Sensitive Feature Selection for Class Imbalance Problem
    Bach, Malgorzata
    Werner, Aleksandra
    [J]. INFORMATION SYSTEMS ARCHITECTURE AND TECHNOLOGY, PT I, 2018, 655 : 182 - 194
  • [10] Cost-sensitive ensemble of support vector machines for effective detection of microcalcification in breast cancer diagnosis
    Peng, YH
    Huang, Q
    Jiang, P
    Jiang, JM
    [J]. FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, PT 2, PROCEEDINGS, 2005, 3614 : 483 - 493