Novel symmetry-based gene-gene dissimilarity measures utilizing Gene Ontology: Application in gene clustering

Cited by: 5
Authors
Acharya, Sudipta [1]
Saha, Sriparna [1]
Pradhan, Prasanna [2]
Affiliations
[1] IIT Patna, Dept Comp Sci & Engn, Dealpur Daulat, Bihar, India
[2] Sikkim Manipal Inst Technol, Dept Comp Applicat, Majitar, Sikkim, India
Keywords
Gene Ontology (GO); Dissimilarity measure; Symmetry-based distance; Gene clustering; Gene-GO term annotation matrix; Multi-objective clustering; Semantic similarity; Multiobjective optimization; Functional analysis; Classification; Expression; Algorithm; Cancer
DOI
10.1016/j.gene.2018.08.062
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
In recent years, DNA microarray technology, which generates high-volume biological data, has gained significant attention. Clustering is a powerful tool for analyzing such high-volume gene-expression data, and the effectiveness of any clustering algorithm depends largely on its underlying similarity/dissimilarity measure. When analyzing such data, there is often a need to explore the similarity of genes not only with respect to their expression values but also with respect to their functional annotations, which can be obtained from Gene Ontology (GO) databases. In the existing literature, several clustering and bi-clustering approaches have been proposed to identify co-regulated genes from gene-expression datasets. However, identifying co-regulated genes from expression data alone misses important biological information about gene function, which is necessary to identify semantically related genes. In this paper, we propose sixteen semantic gene-gene dissimilarity measures that use biological information about genes retrieved from a global biological database, namely Gene Ontology (GO). Four proximity measures (Euclidean, cosine, point symmetry, and line symmetry) are combined with four different representations of gene-GO-term annotation vectors to yield the sixteen gene-gene dissimilarity measures. To illustrate the usefulness of the developed dissimilarity measures, several multi-objective and single-objective clustering algorithms are applied with the proposed measures to identify functionally similar genes in Mouse genome and Yeast datasets. Furthermore, we compare the performance of the sixteen proposed dissimilarity measures with three existing state-of-the-art semantic similarity and distance measures.
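As a rough illustration of the idea in the abstract, the sketch below computes two of the four named proximity measures (Euclidean and cosine) on a toy binary gene-GO-term annotation matrix. It is not the authors' implementation: the gene names, the annotation values, and the function names are hypothetical, the plain 0/1 encoding is only an assumed stand-in for one of the four vector representations (which the abstract does not specify), and the point-symmetry and line-symmetry distances are omitted because the abstract does not define them.

```python
import math

# Hypothetical toy data: each gene maps to a 0/1 vector over the same
# ordered list of GO terms (1 = gene is annotated with that term).
# The annotation values are invented for illustration only.
GO_TERMS = ["GO:0008150", "GO:0003674", "GO:0005575", "GO:0006412"]
ANNOTATIONS = {
    "geneA": [1, 1, 0, 1],
    "geneB": [1, 1, 0, 0],
    "geneC": [0, 0, 1, 0],
}


def euclidean_dissimilarity(u, v):
    """Euclidean distance between two annotation vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_dissimilarity(u, v):
    """One minus the cosine similarity of two annotation vectors.

    A gene with no annotations (all-zero vector) is treated as
    maximally dissimilar; this convention is an assumption.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def pairwise(measure, annotations):
    """Dissimilarity for every unordered pair of genes."""
    genes = sorted(annotations)
    return {
        (g, h): measure(annotations[g], annotations[h])
        for i, g in enumerate(genes)
        for h in genes[i + 1:]
    }


if __name__ == "__main__":
    for name, measure in [("euclidean", euclidean_dissimilarity),
                          ("cosine", cosine_dissimilarity)]:
        for pair, d in pairwise(measure, ANNOTATIONS).items():
            print(name, pair, round(d, 4))
```

A pairwise dissimilarity table of this kind is what a clustering algorithm would consume; swapping in a different measure or vector representation changes only the `measure` argument and the contents of `ANNOTATIONS`.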
Pages: 341-351
Page count: 11
Related papers
50 in total
  • [1] Multi-Factored Gene-Gene Proximity Measures Exploiting Biological Knowledge Extracted from Gene Ontology: Application in Gene Clustering
    Acharya, Sudipta
    Saha, Sriparna
    Pradhan, Prasanna
    [J]. IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2020, 17 (01) : 207 - 219
  • [2] Neighborhood-based clustering of gene-gene interactions
    Diaz-Diaz, Norberto
    Rodriguez-Baena, Domingo S.
    Nepomuceno, Isabel
    Aguilar-Ruiz, Jesus S.
    [J]. INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2006, PROCEEDINGS, 2006, 4224 : 1111 - 1120
  • [3] Gene-Ontology-based clustering of gene expression data
    Adryan, B
    Schuh, R
    [J]. BIOINFORMATICS, 2004, 20 (16) : 2851 - 2852
  • [4] A Critical Look at Entropy-Based Gene-Gene Interaction Measures
    Lee, Woojoo
    Sjolander, Arvid
    Pawitan, Yudi
    [J]. GENETIC EPIDEMIOLOGY, 2016, 40 (05) : 416 - 424
  • [5] On a Gene-based Test for Gene-Gene Interaction Using Similarity Measures Between Individuals
    Mukhopadhyay, Indranil
    [J]. GENETIC EPIDEMIOLOGY, 2012, 36 (07) : 753 - 754
  • [6] Principal interactions analysis for repeated measures data: application to gene-gene and gene-environment interactions
    Mukherjee, Bhramar
    Ko, Yi-An
    VanderWeele, Tyler
    Roy, Anindya
    Park, Sung Kyun
    Chen, Jinbo
    [J]. STATISTICS IN MEDICINE, 2012, 31 (22) : 2531 - 2551
  • [7] Wilks' Λ Dissimilarity Measures for Gene Clustering: An Approach Based on the Identification of Transcription Modules
    Roverato, Alberto
    Di Lascio, F. Marta L.
    [J]. BIOMETRICS, 2011, 67 (04) : 1236 - 1248
  • [8] GEIRA: gene-environment and gene-gene interaction research application
    Ding, Bo
    Kallberg, Henrik
    Klareskog, Lars
    Padyukov, Leonid
    Alfredsson, Lars
    [J]. EUROPEAN JOURNAL OF EPIDEMIOLOGY, 2011, 26 (07) : 557 - 561
  • [9] Analyzing Gene-Based Gene-Gene Interactions with R
    Emily, M.
    Sounac, N.
    Kroell, F.
    [J]. HUMAN HEREDITY, 2015, 80 (03) : 109 - 109
  • [10] Incorporating gene ontology in clustering gene expression data
    Kustra, Rafal
    Zagdanski, Adam
    [J]. 19TH IEEE INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS, PROCEEDINGS, 2006, : 555 - +