Multi-Factored Gene-Gene Proximity Measures Exploiting Biological Knowledge Extracted from Gene Ontology: Application in Gene Clustering

Cited by: 4
Authors
Acharya, Sudipta [1]
Saha, Sriparna [1]
Pradhan, Prasanna [2]
Affiliations
[1] Indian Inst Technol Patna, Dept Comp Sci & Engn, Patna 801103, Bihar, India
[2] Sikkim Manipal Inst Technol, Dept Comp Applicat, Rangpo 737132, Sikkim, India
Keywords
Semantics; Integrated circuits; Bioinformatics; Ontologies; Tools; Genomics; Current measurement; Gene ontology (GO); gene clustering; semantic similarity; distance measure; gene-gene similarity matrix; multi-objective clustering; SEMANTIC SIMILARITY; CLASSIFICATION; EXPRESSION; ALGORITHM; CANCER; TOOL
DOI
10.1109/TCBB.2018.2849362
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Gene Ontology (GO) is a dynamic vocabulary for describing the cellular functions of genes and proteins. It comprises three sub-ontologies: Biological Process, Cellular Component, and Molecular Function. GO has several applications in bioinformatics, such as annotating and measuring gene-gene or protein-protein semantic similarity and identifying genes/proteins by their GO annotations for disease gene and target discovery. Several measures of semantic similarity between genes have been proposed in the literature; they rely on the information content of GO terms, the GO tree structure, or a combination of both. However, most existing semantic similarity measures do not consider the different topological and information-theoretic aspects of GO terms collectively. Motivated by this, in this article we first propose three novel semantic similarity/distance measures for genes that cover different aspects of the GO tree. These measures are then embedded in the frameworks of well-known multi-objective and single-objective clustering algorithms to identify functionally similar genes. For comparative analysis, 10 popular existing GO-based semantic similarity/distance measures and tools are also considered. Experimental results on Mouse genome, Yeast, and Human genome datasets demonstrate the superiority of multi-objective clustering algorithms used in association with the proposed multi-factored similarity/distance measures. Clustering outcomes are further validated by biological and statistical significance tests. Supplementary information is available at https://www.iitp.ac.in/sriparna/journals.html.
Pages: 207-219
Number of pages: 13