deepSimDEF: deep neural embeddings of gene products and gene ontology terms for functional analysis of genes

Cited by: 4
Authors
Pesaranghader, Ahmad [1 ,2 ,3 ,4 ]
Matwin, Stan [5 ,6 ,7 ]
Sokolova, Marina [6 ,8 ,9 ]
Grenier, Jean-Christophe [1 ]
Beiko, Robert G. [5 ,6 ]
Hussin, Julie [1 ,2 ]
Affiliations
[1] Montreal Heart Inst, Montreal, PQ H1T 1C8, Canada
[2] Univ Montreal, Fac Med, Montreal, PQ H3T 1J4, Canada
[3] Mila Quebec Artificial Intelligence Inst, Montreal, PQ H2S 3H1, Canada
[4] Univ Montreal, Dept Comp Sci & Operat Res, Montreal, PQ H3T 1J4, Canada
[5] Dalhousie Univ, Fac Comp Sci, Halifax, NS B3H 4R2, Canada
[6] Dalhousie Univ, Inst Big Data Analyt, Halifax, NS B3H 4R2, Canada
[7] Polish Acad Sci, Inst Comp Sci, Warsaw, Poland
[8] Univ Ottawa, Fac Med, Ottawa, ON K1H 8M5, Canada
[9] Univ Ottawa, Fac Engn, Ottawa, ON K1H 8M5, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
SEMANTIC SIMILARITY; SHARED INFORMATION; NETWORK; PREDICTION; FEATURES;
DOI
10.1093/bioinformatics/btac304
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Motivation: There is a plethora of measures to evaluate functional similarity (FS) of genes based on their co-expression, protein-protein interactions and sequence similarity. These measures are typically derived from hand-engineered and application-specific metrics to quantify the degree of shared information between two genes using their Gene Ontology (GO) annotations.

Results: We introduce deepSimDEF, a deep learning method to automatically learn FS estimation of gene pairs given a set of genes and their GO annotations. deepSimDEF's key novelty is its ability to learn low-dimensional embedding vector representations of GO terms and gene products and then calculate FS using these learned vectors. We show that deepSimDEF can predict the FS of new genes using their annotations: it outperformed all other FS measures by >5-10% on yeast and human reference datasets on protein-protein interactions, gene co-expression and sequence homology tasks. Thus, deepSimDEF offers a powerful and adaptable deep neural architecture that can benefit a wide range of problems in genomics and proteomics, and its architecture is flexible enough to support its extension to any organism.
Pages: 3051 - 3061
Number of pages: 11
Related papers
50 in total
  • [31] PROTEOMIC PROFILE AND FUNCTIONAL ENRICHMENT OF GENE ONTOLOGY TERMS IN MEN WITH TESTICULAR CANCER
    Tibaldi, D. S.
    Sposito, C.
    Del Giudice, P. T.
    Fariello, R. M.
    Spaine, D.
    Fraietta, R.
    FERTILITY AND STERILITY, 2011, 96 (03) : S206 - S206
  • [32] Linking molecular function and biological process terms in the gene ontology for gene expression data analysis
    DeJongh, M
    Van Dort, P
    Ramsay, B
    PROCEEDINGS OF THE 26TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-7, 2004, 26 : 2984 - 2986
  • [33] GO::TermFinder - open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes
    Boyle, EI
    Weng, SA
    Gollub, J
    Jin, H
    Botstein, D
    Cherry, JM
    Sherlock, G
    BIOINFORMATICS, 2004, 20 (18) : 3710 - 3715
  • [34] Functional module analysis of Alzheimer disease related genes and microRNAs based on Gene ontology annotation
    Zhang, Jie
    Li, Li
    Li, Xia
    Wang, Haiyun
    Li, Xia
    2009 3RD INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICAL ENGINEERING, VOLS 1-11, 2009, : 138 - +
  • [35] A memetic clustering algorithm for the functional partition of genes based on the gene ontology
    Speer, N
    Spieth, C
    Zell, A
    PROCEEDINGS OF THE 2004 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2004, : 252 - 259
  • [36] FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes
    Al-Shahrour, F
    Díaz-Uriarte, R
    Dopazo, J
    BIOINFORMATICS, 2004, 20 (04) : 578 - 580
  • [37] The use of Gene Ontology terms and KEGG pathways for analysis and prediction of oncogenes
    Xing, Zhihao
    Chu, Chen
    Chen, Lei
    Kong, Xiangyin
    BIOCHIMICA ET BIOPHYSICA ACTA-GENERAL SUBJECTS, 2016, 1860 (11) : 2725 - 2734
  • [38] Statistical absolute evaluation of gene ontology terms with gene expression data
    Gupta, Pramod K.
    Yoshida, Ryo
    Imoto, Seiya
    Yamaguchi, Rui
    Miyano, Satoru
    BIOINFORMATICS RESEARCH AND APPLICATIONS, PROCEEDINGS, 2007, 4463 : 146 - +
  • [39] The Neural/Immune Gene Ontology: clipping the Gene Ontology for neurological and immunological systems
    Nophar Geifman
    Alon Monsonego
    Eitan Rubin
    BMC Bioinformatics, 11
  • [40] The Neural/Immune Gene Ontology: clipping the Gene Ontology for neurological and immunological systems
    Geifman, Nophar
    Monsonego, Alon
    Rubin, Eitan
    BMC BIOINFORMATICS, 2010, 11