Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics

Cited by: 79
Authors
Lin, Xiaohui [1 ]
Li, Chao [1 ]
Zhang, Yanhui [1 ]
Su, Benzhe [1 ]
Fan, Meng [1 ]
Wei, Hai [1 ]
Affiliations
[1] Dalian Univ Technol, Sch Comp Sci & Technol, Dalian 116024, Peoples R China
Source
MOLECULES | 2018 / Vol. 23 / Issue 01
Funding
National Natural Science Foundation of China;
Keywords
SVM-RFE; overlapping degree; feature selection; GENE SELECTION; CANCER CLASSIFICATION; BIOMARKER DISCOVERY; TUMOR; PREDICTION; SYSTEM;
DOI
10.3390/molecules23010052
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Feature selection is an important topic in bioinformatics. Identifying informative features in complex, high-dimensional biological data is critical in disease study, drug development, and related areas. Support vector machine-recursive feature elimination (SVM-RFE) is an efficient feature selection technique that has proven effective in many applications. It ranks features by the order in which they are recursively eliminated based on an SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the SVM-RFE feature ranking. To measure the feature weights more accurately, we also propose a modified SVM-RFE-OA (M-SVM-RFE-OA) algorithm that temporarily screens out the samples lying in a heavily overlapping area in each iteration. Experiments on eight public biological datasets show that combining the classification accuracy rate with the average overlapping degree of the samples measures the discriminative ability of a feature subset more accurately than the classification accuracy rate alone, and that shielding the samples in the overlapping area makes the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to identify potential biomarkers in big biological data.
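The ranking step that the abstract attributes to SVM-RFE can be illustrated with a minimal sketch. It covers only the generic recursive-elimination loop: the overlapping-ratio criterion of SVM-RFE-OA and the sample screening of M-SVM-RFE-OA are not defined in the abstract and are not reproduced here. The linear SVM is trained with a simple Pegasos-style subgradient method chosen for self-containment, and the toy data and all function names (`make_toy_data`, `train_linear_svm`, `svm_rfe_ranking`) are assumptions made for illustration, not the authors' implementation.

```python
import random


def make_toy_data(n_samples=40, n_features=5, informative=2, seed=0):
    """Hypothetical toy data: one informative feature, the rest pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = 1 if i % 2 == 0 else -1
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[informative] = 3.0 * label + rng.gauss(0.0, 0.5)
        X.append(row)
        y.append(label)
    return X, y


def train_linear_svm(X, y, epochs=200, lam=0.01, seed=0):
    """Pegasos-style subgradient descent for a linear SVM without bias.

    Labels must be in {-1, +1}. Returns the weight vector w.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularization shrink
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def svm_rfe_ranking(X, y):
    """Rank features by recursive elimination, most important first.

    Each round retrains the SVM on the surviving features and removes
    the one with the smallest squared weight.
    """
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]  # last eliminated = top ranked


X, y = make_toy_data()
ranking = svm_rfe_ranking(X, y)
print("feature ranking (best first):", ranking)
```

In this sketch the informative feature should survive every elimination round and therefore appear first in the ranking. Deciding how many of the top-ranked features to keep is the question the abstract addresses with the accuracy rate and the average overlapping ratio.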
Pages: 10
Related papers
50 in total
  • [31] Feature reduction using SVM-RFE technique to detect autism spectrum disorder
    Priya Mohan
    Ilango Paramasivam
    Evolutionary Intelligence, 2021, 14 : 989 - 997
  • [32] SVM-RFE-ED: A Novel SVM-RFE based on Energy Distance for Gene Selection and Cancer Diagnosis
    Medjahed, Seyyid Ahmed
    Ouali, Mohammed
    COMPUTACION Y SISTEMAS, 2018, 22 (02): : 675 - 683
  • [33] SVM ensembles for selecting the relevant feature subsets
    Tao, B
    Abe, S
    PROCEEDINGS OF THE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), VOLS 1-5, 2005, : 943 - 948
  • [34] Arabic Named Entity Recognition on Social Media based on feature selection techniques using SVM-RFE
    Ali, Brahim Ait Ben
    Mihi, Soukaina
    Bazi, Ismail El
    Laachfoubi, Nahil
    2020 FOURTH INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING IN DATA SCIENCES (ICDS), 2020,
  • [35] Feature selection and analysis of single lateral damper fault based on SVM-RFE with correlation bias reduction
    Tang Daochao
    Jin Weidong
    Qin Na
    Li Hui
    PROCEEDINGS OF THE 35TH CHINESE CONTROL CONFERENCE 2016, 2016, : 3830 - 3835
  • [36] The Research of Algorithm for Protein Subcellular Localization Prediction Based on SVM-RFE
    Liu, Wenhao
    Zhai, Junjun
    Ding, Hongwei
    He, Xinlong
    2017 10TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, BIOMEDICAL ENGINEERING AND INFORMATICS (CISP-BMEI), 2017,
  • [37] AdaBoost-based multiple SVM-RFE for classification of mammograms in DDSM
    Sejong Yoon
    Saejoon Kim
    BMC Medical Informatics and Decision Making, 9
  • [38] AdaBoost-Based Multiple SVM-RFE for Classification of Mammograms in DDSM
    Yoon, Sejong
    Kim, Saejoon
    2008 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE WORKSHOPS, PROCEEDINGS, 2008, : 75 - 82
  • [39] AdaBoost-based multiple SVM-RFE for classification of mammograms in DDSM
    Yoon, Sejong
    Kim, Saejoon
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2009, 9
  • [40] Improved Automatic Filtering Algorithm for Imbalanced Classification based on SVM-RFE
    Li, Xiaoqiang
    Shao, Qing
    Wang, Jingjing
    2013 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2013,