Improving enzyme regulatory protein classification by means of SVM-RFE feature selection

Cited by: 17
Authors
Fernandez-Lozano, Carlos [1 ]
Fernandez-Blanco, Enrique [1 ]
Dave, Kirtan [2 ]
Pedreira, Nieves [1 ]
Gestal, Marcos [1 ]
Dorado, Julian [1 ]
Munteanu, Cristian R. [1 ]
Affiliations
[1] Univ A Coruna, Dept Informat & Commun Technol, Fac Comp Sci, La Coruna 15071, Spain
[2] Sardar Patel Univ, GH Patel PG Dept Comp Sci & Technol, Vallabh Vidyanagar 388120, Gujarat, India
Keywords
SUPPORT VECTOR MACHINES; COMPUTATIONAL CHEMISTRY; WEB SERVER; B INHIBITORS; MARCH-INSIDE; QSAR MODELS; 3D; RECOGNITION; DISCOVERY; DRUGS;
DOI
10.1039/c3mb70489k
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Enzyme regulatory proteins are very important because they are involved in many biological processes that sustain life. The complexity of these proteins, the impossibility of identifying directly quantifiable molecular properties associated with the regulation of enzymatic activities, and their structural diversity create the need for new theoretical methods that can predict the enzyme regulatory function of new proteins. The current work presents the first classification model that predicts protein enzyme regulators using the Markov mean properties. These protein descriptors encode the topological information of amino acid contact networks based on amino acid distances and physicochemical properties. MInD-Prot software calculated these molecular descriptors for 2415 protein chains (350 enzyme regulators) using five atom physicochemical properties (Mulliken electronegativity, Kang-Jhon polarizability, vdW area, atom contribution to P) and the protein 3D regions. The best classification models to predict enzyme regulators were obtained with machine learning algorithms from Weka using 18 features. K-star proved to be the most accurate algorithm for this protein function classification. Wrapper Subset Evaluator and SVM-RFE approaches were used to perform feature subset selection, with the best results obtained from SVM-RFE. The classification performance achieved with all available features can be reached using only the 8 most relevant features selected by SVM-RFE. Thus, the current work demonstrates that new molecular targets involved in enzyme regulation can be predicted using fast theoretical algorithms.
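The SVM-RFE idea named in the abstract can be illustrated with a minimal sketch. It assumes the standard linear formulation: train a linear SVM, score each feature by its squared weight, drop the lowest-scoring feature, and repeat. This is not the authors' Weka pipeline. The function names (`train_linear_svm`, `svm_rfe`), the hyperparameters, and the toy data are hypothetical and chosen only for illustration; the paper's Markov mean descriptors are not reproduced here.

```python
# Minimal, dependency-free sketch of linear SVM-RFE (illustrative only).
# Assumption: features are ranked by squared weight w_j^2 of a linear SVM.

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear SVM (hinge loss + L2 penalty) by batch subgradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # sample violates the margin: hinge term is active
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe(X, y, n_keep):
    """Recursively eliminate the feature with the smallest squared SVM weight."""
    remaining = list(range(len(X[0])))  # original indices still in play
    eliminated = []                     # original indices, in removal order
    while len(remaining) > n_keep:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        scores = [wj * wj for wj in w]          # ranking criterion
        worst = scores.index(min(scores))       # least relevant feature
        eliminated.append(remaining.pop(worst))
    return remaining, eliminated


# Hypothetical toy data: feature 0 determines the label, feature 1 is
# low-amplitude noise, feature 2 is constant and carries no information.
X_toy = [
    [2.0, 0.1, 0.0],
    [1.5, -0.2, 0.0],
    [1.0, 0.2, 0.0],
    [-1.0, -0.1, 0.0],
    [-1.5, 0.2, 0.0],
    [-2.0, -0.2, 0.0],
]
y_toy = [1, 1, 1, -1, -1, -1]

selected, eliminated = svm_rfe(X_toy, y_toy, n_keep=1)
print("kept feature indices:", selected)
print("eliminated feature indices:", eliminated)
```

On this toy set the informative feature 0 is the one that survives elimination. In practice the number of retained features (8 in the abstract) would be chosen by comparing classifier performance across subset sizes, and a library implementation would normally replace the hand-written trainer.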
Pages: 1063-1071
Page count: 9