deepSimDEF: deep neural embeddings of gene products and gene ontology terms for functional analysis of genes

Cited by: 4
Authors
Pesaranghader, Ahmad [1, 2, 3, 4]
Matwin, Stan [5, 6, 7]
Sokolova, Marina [6, 8, 9]
Grenier, Jean-Christophe [1]
Beiko, Robert G. [5, 6]
Hussin, Julie [1, 2]
Affiliations
[1] Montreal Heart Inst, Montreal, PQ H1T 1C8, Canada
[2] Univ Montreal, Fac Med, Montreal, PQ H3T 1J4, Canada
[3] Mila Quebec Artificial Intelligence Inst, Montreal, PQ H2S 3H1, Canada
[4] Univ Montreal, Dept Comp Sci & Operat Res, Montreal, PQ H3T 1J4, Canada
[5] Dalhousie Univ, Fac Comp Sci, Halifax, NS B3H 4R2, Canada
[6] Dalhousie Univ, Inst Big Data Analyt, Halifax, NS B3H 4R2, Canada
[7] Polish Acad Sci, Inst Comp Sci, Warsaw, Poland
[8] Univ Ottawa, Fac Med, Ottawa, ON K1H 8M5, Canada
[9] Univ Ottawa, Fac Engn, Ottawa, ON K1H 8M5, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
SEMANTIC SIMILARITY; SHARED INFORMATION; NETWORK; PREDICTION; FEATURES
DOI
10.1093/bioinformatics/btac304
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: There is a plethora of measures to evaluate functional similarity (FS) of genes based on their co-expression, protein-protein interactions and sequence similarity. These measures are typically derived from hand-engineered and application-specific metrics to quantify the degree of shared information between two genes using their Gene Ontology (GO) annotations.
Results: We introduce deepSimDEF, a deep learning method to automatically learn FS estimation of gene pairs given a set of genes and their GO annotations. deepSimDEF's key novelty is its ability to learn low-dimensional embedding vector representations of GO terms and gene products and then calculate FS using these learned vectors. We show that deepSimDEF can predict the FS of new genes using their annotations: it outperformed all other FS measures by >5-10% on yeast and human reference datasets on protein-protein interactions, gene co-expression and sequence homology tasks. Thus, deepSimDEF offers a powerful and adaptable deep neural architecture that can benefit a wide range of problems in genomics and proteomics, and its architecture is flexible enough to support its extension to any organism.
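The general idea in the abstract (represent GO terms as low-dimensional vectors, derive a gene-level vector from a gene's annotations, then score a gene pair from those vectors) can be illustrated with a minimal sketch. This is not the deepSimDEF architecture: the random vectors stand in for embeddings that the real method would learn, and the mean pooling, the cosine score, the function names and the gene names are all assumptions made for illustration. The GO identifiers are used only as opaque labels.

```python
import math
import random


def make_embeddings(go_terms, dim=8, seed=0):
    """Give each GO term a random vector (a stand-in for learned embeddings)."""
    rng = random.Random(seed)
    return {term: [rng.gauss(0.0, 1.0) for _ in range(dim)]
            for term in sorted(set(go_terms))}


def gene_vector(annotations, embeddings):
    """Mean-pool the vectors of a gene's GO annotations (illustrative choice)."""
    if not annotations:
        raise ValueError("a gene needs at least one GO annotation")
    vectors = [embeddings[term] for term in annotations]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def functional_similarity(gene_a, gene_b, annotations, embeddings):
    """Score a gene pair from the pooled vectors of their annotations."""
    return cosine(gene_vector(annotations[gene_a], embeddings),
                  gene_vector(annotations[gene_b], embeddings))


# Hypothetical genes and annotation sets, for illustration only.
annotations = {
    "geneA": ["GO:0006412", "GO:0003735"],
    "geneB": ["GO:0003735", "GO:0006412"],
    "geneC": ["GO:0006810"],
}
all_terms = [t for terms in annotations.values() for t in terms]
embeddings = make_embeddings(all_terms)

score_ab = functional_similarity("geneA", "geneB", annotations, embeddings)
score_ac = functional_similarity("geneA", "geneC", annotations, embeddings)
```

In this sketch two genes with identical annotation sets score close to 1 regardless of annotation order, because pooling ignores order. A learned model would instead fit the embeddings and the scoring function to a supervised target such as the interaction, co-expression or sequence homology signals the abstract mentions.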
Pages: 3051-3061
Page count: 11