Predicting Gene Ontology functions based on support vector machines and statistical significance estimation

Cited by: 12
Authors
Bi, Ran [1]
Zhou, Yanhong [1]
Lu, Feng [1]
Wang, Weiqiang [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Hubei Bioinformat & Mol Imaging Key Lab, Wuhan 430074, Hubei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
protein function; Gene Ontology; support vector machines; statistical significance;
DOI
10.1016/j.neucom.2006.10.006
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Gene Ontology (GO) is a common language for the functional annotation of gene products. We have developed a computational tool, GOKey, to predict the GO functions of proteins from their sequence features using the support vector machine (SVM) method. Several measures have been adopted to improve the prediction performance of GOKey, including improved handling of the problem caused by unbalanced positive and negative training data, and postprocessing strategies that evaluate the posterior probability and statistical significance of SVM outputs. GOKey has been trained to predict the 36 GO categories of the 'molecular function' branch of GO slims, and could easily be extended to other GO categories. The results of 5-fold cross-validation with 10,603 GO-mapped proteins demonstrate that GOKey performs better than standard SVMs. Comparisons with other computational tools for GO function prediction also show that the performance of GOKey is satisfactory. Further, GOKey has been applied to predict the GO functions of 5381 novel human proteins in the Ensembl database. The results show that 93% of the novel proteins can be assigned one or more GO terms, and some evidence supporting the predictions has been found. GOKey can be accessed at http://infosci.hust.edu.cn. © 2006 Published by Elsevier B.V.
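The abstract names three ingredients without giving formulas: reweighting for unbalanced training data, a posterior probability for each SVM output, and a statistical significance estimate. The following is a minimal sketch of one common way to realize each, not GOKey's actual procedure. The function names, the sigmoid parameters `a` and `b`, and the choice of an empirical p-value are all assumptions made for illustration.

```python
import math


def class_weights(n_pos, n_neg):
    """Balanced misclassification costs, n_total / (2 * n_class).

    Assumption: a standard heuristic for unbalanced data, so the rarer
    class receives the larger penalty.
    """
    total = n_pos + n_neg
    return {"pos": total / (2 * n_pos), "neg": total / (2 * n_neg)}


def sigmoid_posterior(score, a=-1.0, b=0.0):
    """Map a raw SVM decision value to an estimate of P(positive | score).

    Uses the sigmoid form 1 / (1 + exp(a * score + b)). The defaults for
    a and b are hypothetical; in practice they are fit on held-out data.
    """
    return 1.0 / (1.0 + math.exp(a * score + b))


def empirical_p_value(score, negative_scores):
    """Share of known-negative scores at least as large as `score`.

    The +1 terms keep the estimate strictly above zero.
    """
    hits = sum(1 for s in negative_scores if s >= score)
    return (hits + 1) / (len(negative_scores) + 1)


if __name__ == "__main__":
    print(class_weights(10, 90))
    print(sigmoid_posterior(2.0))
    print(empirical_p_value(3.0, [0.1, -1.0, 0.5]))
```

With one such classifier per GO category, a protein could be assigned every category whose posterior is high and whose p-value falls below a chosen threshold; both thresholds would need to be tuned on validation data.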
Pages: 718-725
Page count: 8
Related papers
50 in total
  • [21] Learning of kernel functions in support vector machines
    Yang, Chih-Cheng
    Lee, Wan-Jui
    Lee, Shie-Jue
    2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 1150 - +
  • [22] Autocorrelation kernel functions for support vector machines
    Kong, Rui
    Zhang, Bing
    ICNC 2007: THIRD INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, VOL 1, PROCEEDINGS, 2007, : 512 - +
  • [23] Support vector machines based on convex risk functions and general norms
    Gotoh, Jun-ya
    Uryasev, Stan
    ANNALS OF OPERATIONS RESEARCH, 2017, 249 (1-2) : 301 - 328
  • [24] Support vector machines based on convex risk functions and general norms
    Jun-ya Gotoh
    Stan Uryasev
    Annals of Operations Research, 2017, 249 : 301 - 328
  • [25] Extensional Ontology Matching with Variable Selection for Support Vector Machines
    Todorov, Konstantin
    Geibel, Peter
    Kuehnberger, Kai-Uwe
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMPLEX, INTELLIGENT AND SOFTWARE INTENSIVE SYSTEMS (CISIS 2010), 2010, : 962 - 967
  • [26] Development of Pedotransfer Functions for Estimation of Soil Hydraulic Parameters using Support Vector Machines
    Twarakavi, Navin K. C.
    Simunek, Jirka
    Schaap, M. G.
    SOIL SCIENCE SOCIETY OF AMERICA JOURNAL, 2009, 73 (05) : 1443 - 1452
  • [27] Multiclass Probability Estimation With Support Vector Machines
    Wang, Xin
    Zhang, Hao Helen
    Wu, Yichao
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2019, 28 (03) : 586 - 595
  • [28] Support vector machines for the estimation of aqueous solubility
    Lind, P
    Maltseva, T
    JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2003, 43 (06) : 1855 - 1859
  • [29] Model Selection for Support Vector Machines Based on Kernel Density Estimation
    Jin, Zhu
    Ma, Xiaoping
    2010 CHINESE CONTROL AND DECISION CONFERENCE, VOLS 1-5, 2010, : 1161 - 1165
  • [30] Statistical Properties and Adaptive Tuning of Support Vector Machines
    Yi Lin
    Grace Wahba
    Hao Zhang
    Yoonkyung Lee
    Machine Learning, 2002, 48 : 115 - 136