Cost-sensitive Feature Selection for Support Vector Machines

Cited by: 35
Authors
Benitez-Pena, S. [1 ,2 ]
Blanquero, R. [1 ,2 ]
Carrizosa, E. [1 ,2 ]
Ramirez-Cobo, P. [1 ,3 ]
Affiliations
[1] Univ Seville, IMUS, E-41012 Seville, Spain
[2] Univ Seville, Dept Estadist & Invest Operat, E-41012 Seville, Spain
[3] Univ Cadiz, Dept Estadist & Invest Operat, Cadiz 11510, Spain
Keywords
Classification; Data Science; Support Vector Machines; Feature Selection; Integer Programming; Sparsity; OPERATIONS-RESEARCH; CLASSIFICATION; SYNERGIES;
DOI
10.1016/j.cor.2018.03.005
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Feature Selection is a crucial procedure in Data Science tasks such as Classification, since it identifies the relevant variables, thus making the classification procedures more interpretable, cheaper in terms of measurement, and more effective by reducing noise and overfitting. The relevance of features in a classification procedure is linked to the fact that misclassification costs are frequently asymmetric, since false positive and false negative cases may have very different consequences. However, off-the-shelf Feature Selection procedures seldom take such cost-sensitivity of errors into account. In this paper we propose a mathematical-optimization-based Feature Selection procedure embedded in one of the most popular classification procedures, namely Support Vector Machines, that accommodates asymmetric misclassification costs. The key idea is to replace the traditional margin maximization by minimizing the number of features selected, while imposing upper bounds on the false positive and false negative rates. The problem is written as an integer linear problem plus a quadratic convex problem for Support Vector Machines with both linear and radial kernels. The reported numerical experience demonstrates the usefulness of the proposed Feature Selection procedure. Indeed, our results on benchmark data sets show that a substantial decrease in the number of features is obtained, whilst the desired trade-off between false positive and false negative rates is achieved. (C) 2018 Elsevier Ltd. All rights reserved.
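The key idea in the abstract (select as few features as possible while keeping the false positive and false negative rates under given bounds) can be illustrated with a toy sketch. This is not the paper's formulation: the paper solves an integer linear problem plus a convex quadratic SVM problem, whereas the sketch below assumes a brute-force search over feature subsets and swaps in a nearest-centroid rule for the SVM. The function names, the toy data, and the bounds `max_fpr` and `max_fnr` are all hypothetical, and the rates are measured on the training points themselves.

```python
from itertools import combinations

def error_rates(X, y, features):
    """Return (FPR, FNR) of a nearest-centroid rule using only `features`.

    Stand-in classifier (assumption): the paper uses an SVM, not centroids.
    Rates are computed on the same points used to build the centroids.
    """
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == -1]
    # Class centroids restricted to the chosen features.
    c_pos = [sum(x[k] for x in pos) / len(pos) for k in features]
    c_neg = [sum(x[k] for x in neg) / len(neg) for k in features]

    def predict(x):
        d_pos = sum((x[k] - c) ** 2 for k, c in zip(features, c_pos))
        d_neg = sum((x[k] - c) ** 2 for k, c in zip(features, c_neg))
        return 1 if d_pos <= d_neg else -1

    false_pos = sum(predict(x) == 1 for x in neg)
    false_neg = sum(predict(x) == -1 for x in pos)
    return false_pos / len(neg), false_neg / len(pos)

def smallest_feature_subset(X, y, max_fpr, max_fnr):
    """Try subsets in order of increasing size; return the first one that
    satisfies both rate bounds as (subset, fpr, fnr), or None if none does."""
    n_features = len(X[0])
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            fpr, fnr = error_rates(X, y, subset)
            if fpr <= max_fpr and fnr <= max_fnr:
                return subset, fpr, fnr
    return None

# Hypothetical toy data: feature 1 separates the classes, the others do not.
X = [
    [0.2, 1.0, 5.0],
    [0.9, 1.2, 4.0],
    [0.5, 0.9, 6.0],
    [0.4, 3.0, 5.5],
    [0.8, 3.1, 4.5],
    [0.1, 2.9, 5.2],
]
y = [1, 1, 1, -1, -1, -1]

# Asymmetric bounds: false negatives are tolerated far less than false positives.
strict = smallest_feature_subset(X, y, max_fpr=0.4, max_fnr=0.1)
# Looser bound on false negatives: a noisier single feature becomes acceptable.
loose = smallest_feature_subset(X, y, max_fpr=0.4, max_fnr=0.4)
```

With these toy values, tightening the false negative bound changes which single feature is accepted, which is the cost-sensitive behaviour the abstract describes. Exhaustive enumeration grows exponentially with the number of features, so it only serves to make the objective and constraints concrete.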
Pages: 169-178
Page count: 10
Related papers
50 in total
  • [21] Cost-Sensitive Feature Selection by Optimizing F-Measures
    Liu, Meng
    Xu, Chang
    Luo, Yong
    Xu, Chao
    Wen, Yonggang
    Tao, Dacheng
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (03) : 1323 - 1335
  • [22] Unsupervised Online Feature Selection for Cost-Sensitive Medical Diagnosis
    Verma, Arun
    Hanawal, Manjesh K.
    Hemachandra, Nandyala
    2020 INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS & NETWORKS (COMSNETS), 2020,
  • [23] Cost-sensitive Support Vector Machine Based on Weighted Attribute
    Dai Yuanhong
    Chen Hongchang
    Peng Tao
    2009 INTERNATIONAL FORUM ON INFORMATION TECHNOLOGY AND APPLICATIONS, VOL 1, PROCEEDINGS, 2009, : 690 - 692
  • [24] Seizure Prediction Using Cost-Sensitive Support Vector Machine
    Netoff, Theoden
    Park, Yun
    Parhi, Keshab
    2009 ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-20, 2009, : 3322 - 3325
  • [25] Improving Classification with Cost-Sensitive Approach and Support Vector Machine
    Muntean, Maria
    Ileana, Ioan
    Rotar, Corina
    Valean, Honoriu
    9TH ROEDUNET IEEE INTERNATIONAL CONFERENCE, 2010, : 180 - +
  • [26] Cost-Sensitive Semi-Supervised Support Vector Machine
    Li, Yu-Feng
    Kwok, James T.
    Zhou, Zhi-Hua
    PROCEEDINGS OF THE TWENTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-10), 2010, : 500 - 505
  • [27] Cost-based feature selection for Support Vector Machines: An application in credit scoring
    Maldonado, Sebastian
    Perez, Juan
    Bravo, Cristian
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2017, 261 (02) : 656 - 665
  • [28] Linear penalization support vector machines for feature selection
    Miranda, J
    Montoya, R
    Weber, R
    PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PROCEEDINGS, 2005, 3776 : 188 - 192
  • [29] Comparison of Feature Selection Methods in Support Vector Machines
    Kim, Kwangsu
    Park, Changyi
    KOREAN JOURNAL OF APPLIED STATISTICS, 2013, 26 (01) : 131 - 139
  • [30] Feature Selection using Fuzzy Support Vector Machines
    Hong Xia
    Bao Qing Hu
    Fuzzy Optimization and Decision Making, 2006, 5 (2) : 187 - 192