Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics

Cited by: 79
Authors
Lin, Xiaohui [1]
Li, Chao [1]
Zhang, Yanhui [1]
Su, Benzhe [1]
Fan, Meng [1]
Wei, Hai [1]
Affiliations
[1] Dalian Univ Technol, Sch Comp Sci & Technol, Dalian 116024, Peoples R China
Source
MOLECULES | 2018 / Vol. 23 / Issue 01
Funding
National Natural Science Foundation of China;
Keywords
SVM-RFE; overlapping degree; feature selection; GENE SELECTION; CANCER CLASSIFICATION; BIOMARKER DISCOVERY; TUMOR; PREDICTION; SYSTEM;
DOI
10.3390/molecules23010052
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Feature selection is an important topic in bioinformatics. Identifying informative features in complex, high-dimensional biological data is critical in disease studies, drug development, and related areas. Support vector machine-recursive feature elimination (SVM-RFE) is an efficient feature selection technique that has shown its power in many applications. It ranks the features according to the order in which they are recursively eliminated by an SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the feature ranking of SVM-RFE. Meanwhile, to measure the feature weights more accurately, we propose a modified SVM-RFE-OA (M-SVM-RFE-OA) algorithm that temporarily screens out the samples lying in a heavily overlapping area in each iteration. Experiments on eight public biological datasets show that the discriminative ability of a feature subset can be measured more accurately by combining the classification accuracy rate with the average overlapping degree of the samples than by using the classification accuracy rate alone. They also show that shielding the samples in the overlapping area makes the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to identify potential biomarkers from big biological data.
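The ranking step that the abstract builds on, plain SVM-RFE, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear SVM is trained with a simple full-batch subgradient method, the hyperparameters (lam, epochs, lr) and the toy data are hypothetical, and one feature is removed per round. The overlapping-ratio criterion of SVM-RFE-OA and the sample screening of M-SVM-RFE-OA are not defined in this record, so they are deliberately left out.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Full-batch subgradient descent on the L2-regularised hinge loss.

    X is a list of feature rows, y holds labels in {-1, +1}.
    The hyperparameter defaults are illustrative assumptions.
    Returns the weight vector w and the bias b.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe_ranking(X, y):
    """Rank feature indices from most to least important.

    Each round retrains the SVM on the surviving features and removes
    the feature with the smallest squared weight; features removed
    later rank higher.
    """
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]


# Hypothetical toy data: feature 0 separates the classes, feature 1 is
# only weakly related to the label, feature 2 is uninformative.
X = [
    [2.0, 0.3, 0.1], [1.5, -0.2, -0.1], [2.5, 0.1, 0.1], [1.8, -0.1, -0.1],
    [-2.0, -0.3, 0.1], [-1.5, 0.2, -0.1], [-2.5, -0.1, 0.1], [-1.8, 0.1, -0.1],
]
y = [1, 1, 1, 1, -1, -1, -1, -1]

ranking = svm_rfe_ranking(X, y)
print(ranking)
```

Given such a ranking, a method like the one described would then evaluate the top-k features for increasing k and pick the subset size with its own criterion; with accuracy alone this is ordinary cross-validated subset selection.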
Pages: 10
Related papers
50 in total
  • [1] Characterization and analysis of HMMER and SVM-RFE - Parallel bioinformatics applications
    Srinivasan, U
    Chen, PS
    Diao, Q
    Lim, CC
    Li, E
    Chen, YJ
    Ju, R
    Zhang, YM
    IISWC - 2005: PROCEEDINGS OF THE 2005 IEEE INTERNATIONAL SYMPOSIUM ON WORKLOAD CHARACTERIZATION, 2005, : 87 - 98
  • [2] A NEW FEATURE SELECTION METHOD BASED ON RELIEF AND SVM-RFE
    Fu Ruigang
    Wang Ping
    Gao Yinghui
    Hua Xiaoqiang
    2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2014, : 1363 - 1366
  • [3] Feature selection for tumor classification based on improved SVM-RFE
    Li, Hangeng
    Duan, Yanhua
    Li, Qingshou
    Ruan, Xiaogang
    PROGRESS IN INTELLIGENCE COMPUTATION AND APPLICATIONS, PROCEEDINGS, 2007, : 422 - 424
  • [4] Feature Selection for SNP Data Based on SVM-RFE and AGA
    Yang, Xutao
    Wu, Yue
    Jia, Min
    Lei, Zhou
    Liu, Zongtian
    2011 AASRI CONFERENCE ON APPLIED INFORMATION TECHNOLOGY (AASRI-AIT 2011), VOL 1, 2011, : 204 - 208
  • [5] Multiclass SVM-RFE for product form feature selection
    Shieh, Meng-Dar
    Yang, Chih-Chieh
    EXPERT SYSTEMS WITH APPLICATIONS, 2008, 35 (1-2) : 531 - 541
  • [6] sEMG feature selection and classification using SVM-RFE
    Tosin, Mauricio C.
    Majolo, Mariano
    Chedid, Raissan
    Cene, Vinicius H.
    Balbinot, Alexandre
    2017 39TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2017, : 390 - 393
  • [7] Mapping of Soil pH Based on SVM-RFE Feature Selection Algorithm
    Guo, Jia
    Wang, Ku
    Jin, Shaofei
    AGRONOMY-BASEL, 2022, 12 (11):
  • [8] SVM-RFE Based Feature Selection and Taguchi Parameters Optimization for Multiclass SVM Classifier
    Huang, Mei-Ling
    Hung, Yung-Hsiang
    Lee, W. M.
    Li, R. K.
    Jiang, Bo-Ru
    SCIENTIFIC WORLD JOURNAL, 2014,
  • [9] Binary biogeography-based optimization based SVM-RFE for feature selection
    Albashish, Dheeb
    Hammouri, Abdelaziz I.
    Braik, Malik
    Atwan, Jaffar
    Sahran, Shahnorbanun
    APPLIED SOFT COMPUTING, 2021, 101
  • [10] A Hybrid Feature Selection Based on Fisher Score and SVM-RFE for Microarray Data
    Hamla H.
    Ghanem K.
    Informatica (Slovenia), 2024, 48 (01): : 57 - 68