Prediction of Protein-Protein Interaction Sites by Random Forest Algorithm with mRMR and IFS

Cited by: 90
Authors
Li, Bi-Qing [2, 3]
Feng, Kai-Yan [4]
Chen, Lei [5]
Huang, Tao [2, 3, 6]
Cai, Yu-Dong [1]
Affiliations
[1] Shanghai Univ, Inst Syst Biol, Shanghai, Peoples R China
[2] Chinese Acad Sci, Shanghai Inst Biol Sci, Key Lab Syst Biol, Shanghai, Peoples R China
[3] Shanghai Ctr Bioinformat Technol, Shanghai, Peoples R China
[4] Beijing Genom Inst, Shenzhen, Peoples R China
[5] Shanghai Maritime Univ, Coll Informat Engn, Shanghai, Peoples R China
[6] Mt Sinai Sch Med, Dept Genet & Genom Sci, New York, NY USA
Source
PLOS ONE | 2012 / Vol. 7 / Issue 08
Keywords
SECONDARY-STRUCTURE; SEQUENCE PROFILE; HOT-SPOTS; CLASSIFICATION; INTERFACES; PROGRAM; RESIDUE; IDENTIFICATION; INFORMATION; DATABASE;
DOI
10.1371/journal.pone.0043927
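Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract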
Prediction of protein-protein interaction (PPI) sites is one of the most challenging problems in computational biology. Although great progress has been made by employing various machine learning approaches with numerous characteristic features, the problem is still far from being solved. In this study, we developed a novel predictor based on the Random Forest (RF) algorithm with the Minimum Redundancy Maximal Relevance (mRMR) method followed by incremental feature selection (IFS). We incorporated features of physicochemical/biochemical properties, sequence conservation, residual disorder, secondary structure and solvent accessibility. We also included five 3D structural features; the resulting predictor achieved an overall accuracy of 0.672997 and a Matthews correlation coefficient (MCC) of 0.347977. Feature analysis showed that 3D structural features such as Depth Index (DPX) and surface curvature (SC) contributed most to the prediction of PPI sites. Site-specific feature analysis also showed that the features of individual residues from PPI sites contribute most to the determination of those sites. It is anticipated that our prediction method will become a useful tool for identifying PPI sites, and that the feature analysis described in this paper will provide useful insights into the mechanisms of interaction.
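The pipeline named in the abstract (rank features with mRMR, then grow nested feature subsets with IFS and keep the best-scoring one) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes discrete features, a difference-form mRMR score (relevance minus mean redundancy), and a caller-supplied evaluate callback that stands in for training and scoring a Random Forest on each subset. The toy data and every function name below are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def mrmr_rank(features, labels):
    """Greedy mRMR: repeatedly pick the feature with the highest
    relevance to the labels minus mean redundancy with those already picked."""
    remaining = list(features)
    relevance = {f: mutual_information(features[f], labels) for f in remaining}
    ranked = []

    def score(f):
        if not ranked:
            return relevance[f]
        redundancy = sum(mutual_information(features[f], features[g])
                         for g in ranked) / len(ranked)
        return relevance[f] - redundancy

    while remaining:
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def incremental_feature_selection(ranked, evaluate):
    """IFS: score ranked[:1], ranked[:2], ... and keep the first best subset."""
    best_k, best_score = 0, float("-inf")
    for k in range(1, len(ranked) + 1):
        s = evaluate(ranked[:k])
        if s > best_score:
            best_k, best_score = k, s
    return ranked[:best_k], best_score

# Hypothetical toy data: "b" duplicates "a"; the label is a AND c.
toy = {"a": [0, 0, 0, 0, 1, 1, 1, 1],
       "b": [0, 0, 0, 0, 1, 1, 1, 1],
       "c": [0, 0, 1, 1, 0, 0, 1, 1]}
y = [0, 0, 0, 0, 0, 0, 1, 1]

def evaluate(subset):
    """Stand-in for a classifier: predict 1 iff every selected feature is 1.
    In practice a Random Forest would be trained and cross-validated here."""
    preds = [int(all(toy[f][i] for f in subset)) for i in range(len(y))]
    tp = sum(p == 1 and t == 1 for p, t in zip(preds, y))
    tn = sum(p == 0 and t == 0 for p, t in zip(preds, y))
    fp = sum(p == 1 and t == 0 for p, t in zip(preds, y))
    fn = sum(p == 0 and t == 1 for p, t in zip(preds, y))
    return mcc(tp, tn, fp, fn)

ranked = mrmr_rank(toy, y)
selected, best = incremental_feature_selection(ranked, evaluate)
```

On this toy data the duplicated feature "b" is ranked last, because its redundancy with "a" outweighs its relevance, and IFS stops growing the subset once adding a feature no longer improves the MCC. Real residue features are mostly continuous, so a practical version would need a discretization step or a different mutual-information estimator.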
Pages: 10