Prediction of human disease-associated phosphorylation sites with combined feature selection approach and support vector machine

Cited by: 11
Authors
Xu, Xiaoyi [1]
Li, Ao [1, 2]
Wang, Minghui [1, 2]
Affiliations
[1] Univ Sci & Technol China, Sch Informat Sci & Technol, AH-230027 Hefei, Peoples R China
[2] Univ Sci & Technol China, Ctr Biomed Engn, AH-230027 Hefei, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
proteins; cellular biophysics; diseases; support vector machines; feature selection; filtering theory; medical computing; bioinformatics; forward feature selection process; minimum-redundancy-maximum-relevance filtering process; cellular process; post-translational modification; support vector machine; human disease-associated phosphorylation sites; PROTEIN-PHOSPHORYLATION; PATTERN-RECOGNITION; IDENTIFICATION; SEQUENCE;
DOI
10.1049/iet-syb.2014.0051
Chinese Library Classification (CLC) number
Q2 [Cell biology]
Subject classification codes
071009; 090102
Abstract
Phosphorylation is a crucial post-translational modification that regulates almost all cellular processes. It has long been recognised that protein phosphorylation is closely related to disease, and many studies have therefore been undertaken to predict phosphorylation sites for disease treatment and drug design. However, despite the success of these approaches, no existing method focuses on predicting disease-associated phosphorylation sites. Herein, for the first time, the authors propose a novel approach specifically designed to identify associations between phosphorylation sites and human diseases. To take full advantage of local sequence information, a combined feature selection method-based support vector machine (CFS-SVM) is developed that incorporates a minimum-redundancy-maximum-relevance filtering process and a forward feature selection process. Performance evaluation shows that CFS-SVM is significantly better than widely used classifiers including Bayesian decision theory, k-nearest neighbour and random forest. Even at an extremely high specificity of 99%, CFS-SVM still achieves a high sensitivity. In addition, tests on extra data confirm the effectiveness and general applicability of the CFS-SVM approach across a variety of diseases. Finally, analysis of the selected features and corresponding kinases also helps in understanding the potential mechanisms of disease-phosphorylation relationships and can guide further experimental validation.
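The abstract names two selection stages (a minimum-redundancy-maximum-relevance filter followed by forward feature selection) but gives no implementation details. The following is only a minimal sketch of one common way to chain such stages, not the authors' CFS-SVM. All function names, the toy data and the lookup-table scorer are assumptions made for illustration; in a real pipeline the `evaluate` callback would presumably return a cross-validated SVM score.

```python
import math
from collections import Counter, defaultdict


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_rank(columns, labels, k):
    """Greedy mRMR ranking: at each step pick the feature with the highest
    relevance I(feature; labels) minus mean redundancy with features already picked."""
    relevance = [mutual_information(col, labels) for col in columns]
    selected, remaining = [], list(range(len(columns)))

    def score(j):
        if not selected:
            return relevance[j]
        redundancy = sum(
            mutual_information(columns[j], columns[s]) for s in selected
        ) / len(selected)
        return relevance[j] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def forward_select(ranked, evaluate):
    """Walk the ranked list and keep a feature only if it strictly improves
    evaluate(subset). `evaluate` is a hypothetical scoring callback."""
    chosen, best_score = [], float("-inf")
    for j in ranked:
        candidate = chosen + [j]
        s = evaluate(candidate)
        if s > best_score:
            chosen, best_score = candidate, s
    return chosen, best_score


# Toy data (invented for illustration): feature 1 duplicates feature 0,
# and the label is 1 only when features 0 and 2 are both 1.
columns = [
    [0, 0, 1, 1, 0, 0, 1, 1],  # feature 0
    [0, 0, 1, 1, 0, 0, 1, 1],  # feature 1: exact copy of feature 0
    [0, 1, 0, 1, 0, 1, 0, 1],  # feature 2: independent of feature 0
]
labels = [0, 0, 0, 1, 0, 0, 0, 1]


def evaluate(subset):
    """Stand-in scorer (NOT an SVM): training accuracy of a majority-vote
    lookup table keyed on the selected feature values."""
    groups = defaultdict(Counter)
    for i, y in enumerate(labels):
        groups[tuple(columns[j][i] for j in subset)][y] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in groups.values())
    return correct / len(labels)


ranked = mrmr_rank(columns, labels, k=3)
chosen, best = forward_select(ranked, evaluate)
print("mRMR order:", ranked)
print("kept after forward selection:", chosen, "score:", best)
```

On this toy data the duplicated feature is ranked last by the filter stage, because its redundancy with feature 0 outweighs its relevance, and the forward stage then discards it because it does not improve the score.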
Pages: 155-163
Number of pages: 9