Neighborhood based sample and feature selection for SVM classification learning

Cited by: 51
Authors
He, Qiang [1]
Xie, Zongxia
Hu, Qinghua [1]
Wu, Congxin [1]
Affiliations
[1] Harbin Inst Technol, Dept Math, Harbin 150001, Peoples R China
Keywords
Support vector machine; Rough set; Neighborhood relation; Sample selection; Feature selection; SUPPORT VECTOR MACHINES; ROUGH SETS; SYSTEMS
DOI
10.1016/j.neucom.2011.01.019
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Support vector machines (SVMs) are a popular class of classification algorithms because of their high generalization ability. However, training SVMs on a large set of learning samples is time-consuming, and improving learning efficiency is one of the most important research tasks on SVMs. It is known that, although some learning tasks offer many candidate training samples, only the samples near the decision boundary, which are called support vectors, have an impact on the optimal classification hyperplane. Finding these samples and training SVMs with them alone greatly reduces training time and space complexity. Based on this observation, we introduce a neighborhood-based rough set model to search for boundary samples. Using the model, we first divide the sample space into three subsets: positive region, boundary, and noise. Furthermore, we partition the input features into four subsets: strongly relevant features, weakly relevant and indispensable features, weakly relevant and superfluous features, and irrelevant features. We then train SVMs only with the boundary samples in the relevant and indispensable feature subspaces, so that feature selection and sample selection are conducted simultaneously with the proposed model. Experimental results show that the model selects very few features and samples for training, while classification performance is preserved or even improved. (C) 2011 Elsevier B.V. All rights reserved.
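The abstract does not give the formal definitions of the three sample subsets, so the following Python sketch only illustrates the general idea under stated assumptions. The function name `partition_samples`, the radius parameter `delta`, the Euclidean distance, and the three decision rules are all hypothetical choices made for illustration and are not taken from the paper: a sample is placed in the positive region when every other sample within distance `delta` shares its label, in the noise set when every such neighbor has a different label, and in the boundary set when its neighborhood is mixed.

```python
from math import dist  # Euclidean distance, Python 3.8+


def partition_samples(X, y, delta):
    """Split sample indices into positive-region, boundary, and noise lists.

    Illustrative rules only (assumed, not the paper's exact definitions).
    """
    positive, boundary, noise = [], [], []
    for i, xi in enumerate(X):
        # Labels of the other samples inside the delta-neighborhood of xi.
        neighbor_labels = [
            y[j] for j, xj in enumerate(X)
            if j != i and dist(xi, xj) <= delta
        ]
        if all(label == y[i] for label in neighbor_labels):
            positive.append(i)  # consistent neighborhood (or no neighbors)
        elif all(label != y[i] for label in neighbor_labels):
            noise.append(i)     # every neighbor disagrees with this sample
        else:
            boundary.append(i)  # mixed neighborhood: near the class boundary
    return positive, boundary, noise


# Toy 1-D data: two classes that meet around 1.0, plus one mislabeled
# sample (index 9) sitting inside a region of class 1.
X = [(0.0,), (0.2,), (0.9,), (1.0,), (1.1,), (1.2,),
     (1.9,), (2.1,), (3.0,), (3.1,), (3.2,)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1]

positive, boundary, noise = partition_samples(X, y, delta=0.25)
print(positive)  # [0, 1, 6, 7]
print(boundary)  # [2, 3, 4, 5, 8, 10]
print(noise)     # [9]
```

Under these simplified rules, the samples next to a mislabeled point (indices 8 and 10 here) also land in the boundary set, because their neighborhoods become mixed. In a pipeline like the one the abstract describes, only the boundary indices would then be passed on to SVM training.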
Pages: 1585-1594
Number of pages: 10