Gene function prediction based on Gene Ontology Hierarchy Preserving Hashing

Cited by: 29
Authors
Zhao, Yingwen [1]
Fu, Guangyuan [1]
Wang, Jun [1]
Guo, Maozu [2, 3]
Yu, Guoxian [1]
Affiliations
[1] Southwest Univ, Coll Comp & Informat Sci, Chongqing 400715, Peoples R China
[2] Beijing Univ Civil Engn & Architecture, Sch Elect & Informat Engn, Beijing 100044, Peoples R China
[3] Beijing Key Lab Intelligent Proc Bldg Big Data, Beijing 100044, Peoples R China
Keywords
Gene Ontology; Gene function prediction; Hierarchy preserving hashing; Semantic similarity; PROTEIN FUNCTION; SIMILARITY; ANNOTATIONS; NETWORK; ASSOCIATIONS; SEQUENCE; FEATURES;
DOI
10.1016/j.ygeno.2018.02.008
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Gene Ontology (GO) uses structured vocabularies (or terms) to describe the molecular functions, biological roles, and cellular locations of gene products in a hierarchical ontology. GO annotations associate genes with GO terms and indicate that the given gene products carry out the biological functions described by the relevant terms. However, predicting correct GO annotations for genes from the massive set of terms defined by GO is a difficult challenge. To address this challenge, we introduce a Gene Ontology Hierarchy Preserving Hashing (HPHash) based semantic method for gene function prediction. HPHash first measures the taxonomic similarity between GO terms. It then uses a hierarchy preserving hashing technique to keep the hierarchical order between GO terms and to optimize a series of hashing functions that encode the massive set of GO terms as compact binary codes. After that, HPHash uses these hashing functions to project the gene-term association matrix into a low-dimensional one and performs semantic-similarity-based gene function prediction in the low-dimensional space. Experimental results on three model species (Homo sapiens, Mus musculus and Rattus norvegicus) for interspecies gene function prediction show that HPHash performs better than other related approaches and is robust to the number of hash functions. In addition, we also take HPHash as a plugin for BLAST-based gene function prediction. In these experiments, HPHash again significantly improves the prediction performance. The code of HPHash is available at: http://mlda.swu.edu.cn/codes.php?name=HPHash.
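The projection and prediction steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' HPHash implementation: the function names (`random_term_codes`, `project`, `cosine`, `predict_scores`), the toy association matrix, and the neighbor-voting rule are all assumptions made for illustration, and the per-term codes here are random ±1 vectors, whereas HPHash learns hashing functions that preserve the GO hierarchy. The sketch only shows the general shape of the pipeline: encode terms as short codes, project the gene-term matrix, then score terms from similar genes in the low-dimensional space.

```python
import random


def random_term_codes(num_terms, num_bits, seed=0):
    """Hypothetical stand-in: one random +/-1 code per GO term.

    HPHash learns hierarchy-preserving codes; random codes are used
    here only so the sketch is self-contained.
    """
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(num_bits)]
            for _ in range(num_terms)]


def project(assoc, codes):
    """Project a genes x terms 0/1 matrix to genes x bits (assoc times codes)."""
    num_bits = len(codes[0])
    return [[sum(a * codes[t][b] for t, a in enumerate(row))
             for b in range(num_bits)]
            for row in assoc]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sum(x * x for x in u) ** 0.5
    norm_v = sum(y * y for y in v) ** 0.5
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_scores(assoc, low_dim, query, k=2):
    """Score every term for gene `query` from its k most similar genes.

    Similarity is measured in the low-dimensional space; each neighbor
    votes for its own annotations, weighted by non-negative similarity.
    """
    sims = [(cosine(low_dim[query], low_dim[g]), g)
            for g in range(len(assoc)) if g != query]
    neighbors = sorted(sims, reverse=True)[:k]
    total = sum(max(s, 0.0) for s, _ in neighbors)
    num_terms = len(assoc[0])
    if total == 0:
        return [0.0] * num_terms
    return [sum(max(s, 0.0) * assoc[g][t] for s, g in neighbors) / total
            for t in range(num_terms)]


# Toy data (assumed): 4 genes annotated with 6 GO terms.
toy_assoc = [
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1],
    [1, 0, 0, 0, 0, 0],  # sparsely annotated gene to be completed
]
toy_codes = random_term_codes(num_terms=6, num_bits=3)
toy_low = project(toy_assoc, toy_codes)
print(predict_scores(toy_assoc, toy_low, query=3, k=2))
```

The scores are per-term values in [0, 1]; a threshold or top-n cut would turn them into predicted annotations. The number of bits plays the role of the number of hash functions that the abstract reports HPHash to be robust to.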
Pages: 334-342
Page count: 9