A boosting approach for prediction of protein-RNA binding residues

Times cited: 34
Authors
Tang, Yongjun [1, 2, 3]
Liu, Diwei [4 ]
Wang, Zixiang [4 ]
Wen, Ting [4 ]
Deng, Lei [4 ]
Affiliations
[1] Cent South Univ, Xiangya Hosp, Dept Clin Pharmacol, 87 Xiangya Rd, Changsha 410008, Hunan, Peoples R China
[2] Cent South Univ, Hunan Key Lab Pharmacogenet, Inst Clin Pharmacol, 87 Xiangya Rd, Changsha 410008, Hunan, Peoples R China
[3] Cent South Univ, Xiangya Hosp, Dept Pediat, 87 Xiangya Rd, Changsha 410008, Hunan, Peoples R China
[4] Cent South Univ, Sch Software, 22 Shaoshan South Rd, Changsha 410075, Hunan, Peoples R China
Source
BMC BIOINFORMATICS | 2017 / Vol. 18
Funding
National Natural Science Foundation of China;
Keywords
RNA-binding residue; Gradient tree boosting; Structural neighborhood features; INTERACTION HOT-SPOTS; SOLVENT ACCESSIBILITY; SITES; RECOGNITION; IDENTIFICATION; NUCLEOTIDES; ANNOTATION; GENERATION; IMPROVES; SVM;
DOI
10.1186/s12859-017-1879-2
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: RNA-binding proteins play important roles in post-transcriptional RNA processing and transcriptional regulation. Distinguishing the RNA-binding residues in proteins is crucial for understanding how protein and RNA recognize each other and function together as a complex. Results: We propose PredRBR, an effective computational approach to predict RNA-binding residues. PredRBR is built with gradient tree boosting and an optimal feature set selected from a large number of sequence and structure characteristics and two categories of structural neighborhood properties. Cross-validation experiments on the RBP170 data set show that PredRBR achieves an overall accuracy of 0.84, a sensitivity of 0.85, an MCC of 0.55, and an AUC of 0.92, which are significantly better than those of other widely used machine learning algorithms such as Support Vector Machine, Random Forest, and Adaboost. We further calculate the feature importance of different feature categories and find that structural neighborhood characteristics are critical in the recognition of RNA-binding residues. Also, PredRBR yields significantly better prediction accuracy on an independent test set (RBP101) in comparison with other state-of-the-art methods. Conclusions: The superior performance over existing RNA-binding residue prediction methods indicates the importance of the gradient tree boosting algorithm combined with the optimally selected features.
Pages: 12
Related papers
50 in total
  • [31] Transfer RNA binding to human serum albumin: A model for protein-RNA interaction
    Malonga, Herman
    Neault, Jean-Francois
    Tajmir-Riahi, Heidar-Ali
    [J]. DNA AND CELL BIOLOGY, 2006, 25 (07) : 393 - 398
  • [32] XPredRBR: Accurate and Fast Prediction of RNA-Binding Residues in Proteins Using eXtreme Gradient Boosting
    Deng, Lei
    Dong, Zuojin
    Liu, Hui
    [J]. BIOINFORMATICS RESEARCH AND APPLICATIONS, ISBRA 2018, 2018, 10847 : 163 - 173
  • [33] Protein-protein and protein-RNA binding domains of ACF are required for complementation of ApoB RNA editing
    Blanc, V
    Henderson, J
    Kennedy, S
    Davidson, N
    [J]. GASTROENTEROLOGY, 2002, 122 (04) : A406 - A406
  • [34] PROTEIN-RNA RECOGNITION
    DRAPER, DE
    [J]. ANNUAL REVIEW OF BIOCHEMISTRY, 1995, 64 : 593 - 620
  • [35] Prediction of protein-RNA interactions using sequence and structure descriptors
    Liu, Zhi-Ping
    Miao, Hongyu
    [J]. NEUROCOMPUTING, 2016, 206 : 28 - 34
  • [36] PiRaNhA: a server for the computational prediction of RNA-binding residues in protein sequences
    Murakami, Yoichi
    Spriggs, Ruth V.
    Nakamura, Haruki
    Jones, Susan
    [J]. NUCLEIC ACIDS RESEARCH, 2010, 38 : W412 - W416
  • [37] Protein-RNA tethering: The role of poly(C) binding protein 2 in poliovirus RNA replication
    Spear, Allyn
    Sharma, Nidhi
    Flanegan, James Bert
    [J]. VIROLOGY, 2008, 374 (02) : 280 - 291
  • [38] Prediction of Protein-RNA Interactions Using Sequence and Structure Descriptors
    Lin, Zhi-Ping
    Miao, Hongyu
    [J]. IFAC PAPERSONLINE, 2015, 48 (28) : 1 - 6
  • [39] Protein-RNA interaction prediction with deep learning: structure matters
    Wei, Junkang
    Chen, Siyuan
    Zong, Licheng
    Gao, Xin
    Li, Yu
    [J]. BRIEFINGS IN BIOINFORMATICS, 2022, 23 (01)
  • [40] Protein-RNA recognition
    De Guzman, RN
    Turner, RB
    Summers, MF
    [J]. BIOPOLYMERS, 1998, 48 (2-3) : 181 - 195