Structured variable selection in support vector machines

Cited by: 8
Authors
Wu, Seongho [1]
Zou, Hui [1]
Yuan, Ming [2]
Affiliations
[1] Univ Minnesota, Sch Stat, Minneapolis, MN 55455 USA
[2] Georgia Inst Technol, Sch Ind & Syst Engn, Atlanta, GA 30332 USA
Funding
National Science Foundation (USA);
Keywords
Classification; Heredity; Nonparametric estimation; Support vector machine; Variable selection;
DOI
10.1214/07-EJS125
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract
When applying the support vector machine (SVM) to high-dimensional classification problems, we often impose a sparse structure in the SVM to eliminate the influence of irrelevant predictors. The lasso and other variable selection techniques have been successfully used in the SVM to perform automatic variable selection. In some problems, there is a natural hierarchical structure among the variables. Thus, in order to have an interpretable SVM classifier, it is important to respect the heredity principle when enforcing sparsity in the SVM. Many variable selection methods, however, do not respect the heredity principle. In this paper we enforce both sparsity and the heredity principle in the SVM by using the so-called structured variable selection (SVS) framework originally proposed in [20]. We minimize the empirical hinge loss under a set of linear inequality constraints and a lasso-type penalty. The solution always obeys the desired heredity principle and enjoys sparsity. The new SVM classifier can be fitted efficiently, because the optimization problem is a linear program. Another contribution of this work is a nonparametric extension of the SVS framework, which leads to the proposed nonparametric heredity SVMs. Simulated and real data are used to illustrate the merits of the proposed method.
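
The abstract does not spell out the optimization problem, so the following is only an illustrative sketch of how a hinge loss with a lasso-type penalty and heredity constraints can be written as a linear program. It assumes a garrote-style parameterization in which each coefficient is a nonnegative scale factor theta_j times a hypothetical initial estimate beta_init_j, and it assumes heredity is encoded as theta_child <= theta_parent. The function names, the variable layout, and the toy data are all illustrative choices and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def build_heredity_svm_lp(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    beta_init: Sequence[float],
    edges: Sequence[Tuple[int, int]],
    lam: float,
) -> Tuple[List[float], Matrix, List[float]]:
    """Return (c, A, b) for: minimize c.z subject to A z <= b, z >= 0.

    Assumed variable layout (illustrative, not from the paper):
    z = [b_pos, b_neg, theta_0..theta_{p-1}, xi_0..xi_{n-1}],
    where the intercept is b_pos - b_neg and xi_i are hinge-loss slacks.
    """
    n, p = len(X), len(beta_init)
    n_vars = 2 + p + n

    # Objective: mean slack (empirical hinge loss) + lam * sum(theta).
    c = [0.0, 0.0] + [lam] * p + [1.0 / n] * n

    A: Matrix = []
    b: List[float] = []

    # Margin rows: y_i * (intercept + sum_j theta_j * beta_init_j * x_ij) + xi_i >= 1,
    # multiplied by -1 so that every row has the form "<=".
    for i in range(n):
        row = [0.0] * n_vars
        row[0] = -float(y[i])
        row[1] = float(y[i])
        for j in range(p):
            row[2 + j] = -y[i] * beta_init[j] * X[i][j]
        row[2 + p + i] = -1.0
        A.append(row)
        b.append(-1.0)

    # Heredity rows: theta_child - theta_parent <= 0 for each (child, parent) edge,
    # so a child can be nonzero only if its parent is nonzero.
    for child, parent in edges:
        row = [0.0] * n_vars
        row[2 + child] = 1.0
        row[2 + parent] = -1.0
        A.append(row)
        b.append(0.0)

    return c, A, b


def is_feasible(z: Sequence[float], A: Matrix, b: Sequence[float],
                tol: float = 1e-9) -> bool:
    """Check z >= 0 and A z <= b up to a small tolerance."""
    if any(v < -tol for v in z):
        return False
    return all(
        sum(a * v for a, v in zip(row, z)) <= rhs + tol
        for row, rhs in zip(A, b)
    )


def objective(z: Sequence[float], c: Sequence[float]) -> float:
    """Evaluate the linear objective c.z."""
    return sum(ci * zi for ci, zi in zip(c, z))


# Toy data: columns are x1, x2 and the interaction x1*x2; the label equals x1.
X = [[1, 1, 1], [-1, -1, 1], [1, -1, -1], [-1, 1, -1]]
y = [1, -1, 1, -1]
beta_init = [1.0, 0.5, 0.5]      # hypothetical initial coefficients
edges = [(2, 0), (2, 1)]         # the interaction requires both main effects
c, A, b = build_heredity_svm_lp(X, y, beta_init, edges, lam=0.1)

# Keeping only x1 satisfies every constraint with zero slack.
z_main_only = [0, 0, 1, 0, 0, 0, 0, 0, 0]
# Keeping the interaction while dropping x2 violates heredity.
z_breaks_heredity = [0, 0, 1, 0, 0.5, 1, 1, 1, 1]
print(is_feasible(z_main_only, A, b), is_feasible(z_breaks_heredity, A, b))
```

The triple (c, A, b) is in the standard inequality form accepted by general-purpose LP solvers, which is the sense in which a problem of this shape can be fitted efficiently. Under this assumed encoding, any feasible point with theta_parent = 0 forces theta_child = 0.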
Pages: 103-117
Page count: 15