Prediction of RNA-binding proteins from primary sequence by a support vector machine approach

Cited by: 104
Authors
Han, LY
Cai, CZ
Lo, SL
Chung, MCM
Chen, YZ
Affiliations
[1] Natl Univ Singapore, Dept Computat Sci, Singapore 117543, Singapore
[2] Natl Univ Singapore, Dept Biochem, Singapore 117597, Singapore
[3] Chongqing Univ, Dept Appl Phys, Chongqing 400044, Peoples R China
Keywords
RNA-binding proteins; RNA-protein interactions; rRNA; mRNA; tRNA; snRNA; support vector machine
DOI
10.1261/rna.5890304
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Subject classification code
071010; 081704
Abstract
Elucidating how proteins interact with other molecules is important for understanding cellular processes. Computational methods have been developed for predicting protein-protein interactions, but the prediction of protein-RNA interactions, which play central roles in regulating gene expression and in certain RNA-mediated enzymatic processes, has received insufficient attention. This work explored the use of a machine learning method, the support vector machine (SVM), for predicting RNA-binding proteins directly from their primary sequence. An SVM system was trained on known RNA-binding and non-RNA-binding proteins to recognize RNA-binding proteins. A total of 4011 RNA-binding and 9781 non-RNA-binding proteins were used to train and test the SVM classification system, and an independent set of 447 RNA-binding and 4881 non-RNA-binding proteins was used to evaluate the classification accuracy. On this independent evaluation set, prediction accuracy was 94.1%, 79.3%, and 94.1% for rRNA-, mRNA-, and tRNA-binding proteins, and 98.7%, 96.5%, and 99.9% for non-rRNA-, non-mRNA-, and non-tRNA-binding proteins, respectively. The SVM classification system was further tested on a small class of snRNA-binding proteins with only 60 available sequences. The prediction accuracy was 40.0% for snRNA-binding and 99.9% for non-snRNA-binding proteins, indicating that a sufficient number of proteins is needed to train an SVM. The SVM classification systems trained in this work were added to our Web-based protein functional classification software SVMProt, at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi. Our study suggests the potential of SVM as a useful tool for facilitating the prediction of protein-RNA interactions.
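The abstract reports accuracy separately for the binding and non-binding classes rather than as a single overall figure. The following is a minimal sketch of that kind of per-class calculation, assuming each figure is the percentage of proteins in that class that were classified correctly (this record does not spell out the definition). The function name `per_class_accuracy`, the 1/0 label encoding, and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def per_class_accuracy(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (positive-class accuracy, negative-class accuracy) in percent.

    Assumed label encoding: 1 = RNA-binding, 0 = non-RNA-binding.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    positive_accuracy = 100.0 * tp / (tp + fn)
    negative_accuracy = 100.0 * tn / (tn + fp)
    return positive_accuracy, negative_accuracy


# Toy data (hypothetical): 5 binding proteins, 10 non-binding proteins.
toy_true = [1] * 5 + [0] * 10
toy_pred = [1, 1, 0, 0, 0] + [0] * 9 + [1]
print(per_class_accuracy(toy_true, toy_pred))
```

Reporting the two classes separately matters when the classes are imbalanced, as in the evaluation set described above (447 binding versus 4881 non-binding proteins): a classifier that rarely predicts the binding class can still score a high overall accuracy.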
Pages: 355-368
Page count: 14