Prediction of RNA-binding proteins from primary sequence by a support vector machine approach

Cited by: 104
Authors
Han, LY
Cai, CZ
Lo, SL
Chung, MCM
Chen, YZ
Affiliations
[1] Natl Univ Singapore, Dept Computat Sci, Singapore 117543, Singapore
[2] Natl Univ Singapore, Dept Biochem, Singapore 117597, Singapore
[3] Chongqing Univ, Dept Appl Phys, Chongqing 400044, Peoples R China
Keywords
RNA-binding proteins; RNA-protein interactions; rRNA; mRNA; tRNA; snRNA; support vector machine;
DOI
10.1261/rna.5890304
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Elucidating how proteins interact with other molecules is important for understanding cellular processes. Computational methods have been developed for the prediction of protein-protein interactions. However, insufficient attention has been paid to the prediction of protein-RNA interactions, which play central roles in regulating gene expression and in certain RNA-mediated enzymatic processes. This work explored the use of a machine learning method, support vector machines (SVM), for the prediction of RNA-binding proteins directly from their primary sequence. Using known RNA-binding and non-RNA-binding proteins, an SVM system was trained to recognize RNA-binding proteins. A total of 4011 RNA-binding and 9781 non-RNA-binding proteins were used to train and test the SVM classification system, and an independent set of 447 RNA-binding and 4881 non-RNA-binding proteins was used to evaluate the classification accuracy. Results on this independent evaluation set show a prediction accuracy of 94.1%, 79.3%, and 94.1% for rRNA-, mRNA-, and tRNA-binding proteins, and 98.7%, 96.5%, and 99.9% for non-rRNA-, non-mRNA-, and non-tRNA-binding proteins, respectively. The SVM classification system was further tested on a small class of snRNA-binding proteins with only 60 available sequences. The prediction accuracy is 40.0% for snRNA-binding and 99.9% for non-snRNA-binding proteins, indicating that a sufficient number of proteins is needed to train an SVM. The SVM classification systems trained in this work were added to our Web-based protein functional classification software SVMProt, at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi. Our study suggests the potential of SVM as a useful tool for facilitating the prediction of protein-RNA interactions.
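The abstract reports accuracy separately for the binding class and the non-binding class, and it describes classification from primary sequence without naming the features used. The sketch below illustrates both ideas under stated assumptions: `composition` is a hypothetical, deliberately simple amino-acid-composition encoding (not necessarily the descriptors the authors used), and `per_class_accuracy` assumes labels are coded 1 for binding and 0 for non-binding. It is meant only to show how a sequence might become a fixed-length vector and how the two reported accuracies could be computed.

```python
from collections import Counter

# The 20 standard amino acids; the ordering here is an arbitrary choice.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Return the fraction of each standard amino acid in a sequence.

    Hypothetical feature encoding for illustration only; non-standard
    residue codes are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def per_class_accuracy(y_true, y_pred):
    """Return (accuracy on positives, accuracy on negatives).

    Assumes labels are 1 (binding) and 0 (non-binding).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    pos_acc = tp / n_pos if n_pos else float("nan")
    neg_acc = tn / n_neg if n_neg else float("nan")
    return pos_acc, neg_acc
```

Reporting the two numbers separately matters when the classes are imbalanced, as in the evaluation set described above: a classifier can score very high on the large non-binding class while performing poorly on a small binding class.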
Pages: 355-368
Page count: 14