Explaining protein-protein interactions with knowledge graph-based semantic similarity

Cited by: 3
Authors
Sousa, Rita T. [1]
Silva, Sara [1]
Pesquita, Catia [1]
Affiliations
[1] Univ Lisbon, LASIGE, Fac Ciencias, Lisbon, Portugal
Keywords
Machine learning; Explainable artificial intelligence; Knowledge graph; Semantic similarity; Protein-protein interaction prediction;
DOI
10.1016/j.compbiomed.2024.108076
Chinese Library Classification (CLC) code
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
The use of artificial intelligence and machine learning methods in biomedical applications, such as protein-protein interaction prediction, has gained significant traction in recent decades. However, explainability is a key aspect of using machine learning as a tool for scientific discovery. Explainable artificial intelligence approaches help clarify algorithmic mechanisms and identify potential bias in the data. Given the complexity of the biomedical domain, explanations should be grounded in domain knowledge, which can be achieved by using ontologies and knowledge graphs. These knowledge graphs express knowledge about a domain by capturing different perspectives of the representation of real-world entities. However, the most popular way to explore knowledge graphs with machine learning is through embeddings, which are not explainable. As an alternative, knowledge graph-based semantic similarity offers the advantage of being explainable. Additionally, similarity can be computed to capture different semantic aspects within the knowledge graph, increasing the explainability of predictive approaches. We propose a novel method to generate explainable vector representations, KGsim2vec, that uses aspect-oriented semantic similarity features to represent pairs of entities in a knowledge graph. Our approach employs a set of machine learning models, including decision trees, genetic programming, random forest and eXtreme gradient boosting, to predict relations between entities. The experiments reveal that considering multiple semantic aspects when representing the similarity between two entities improves explainability and predictive performance. KGsim2vec performs better than black-box methods based on knowledge graph embeddings or graph neural networks. Moreover, KGsim2vec produces global models that can capture biological phenomena and elucidate data biases.
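The idea of representing a pair of entities by one similarity value per semantic aspect can be illustrated with a minimal sketch. This is not the authors' implementation: the aspect names, the toy annotations, and the helper names `jaccard` and `pair_features` are all hypothetical, and plain Jaccard overlap stands in for whatever knowledge graph-based semantic similarity measure the paper actually uses.

```python
from typing import Dict, List, Set

# Hypothetical aspects; the abstract does not name the ones used in the paper.
ASPECTS = ("biological_process", "molecular_function", "cellular_component")

# entity -> aspect -> set of annotation terms (toy data, invented for illustration)
Annotations = Dict[str, Dict[str, Set[str]]]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two term sets; a stand-in similarity measure."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def pair_features(e1: str, e2: str, annotations: Annotations) -> List[float]:
    """One similarity value per aspect, in the fixed order given by ASPECTS."""
    return [
        jaccard(
            annotations[e1].get(aspect, set()),
            annotations[e2].get(aspect, set()),
        )
        for aspect in ASPECTS
    ]


annotations: Annotations = {
    "P1": {
        "biological_process": {"bp:a", "bp:b"},
        "molecular_function": {"mf:x"},
        "cellular_component": {"cc:n"},
    },
    "P2": {
        "biological_process": {"bp:a", "bp:b"},
        "molecular_function": {"mf:x", "mf:y"},
        "cellular_component": {"cc:n"},
    },
    "P3": {
        "biological_process": {"bp:c"},
        "molecular_function": {"mf:z"},
        "cellular_component": {"cc:n"},
    },
}

features_p1_p2 = pair_features("P1", "P2", annotations)
features_p1_p3 = pair_features("P1", "P3", annotations)
print(dict(zip(ASPECTS, features_p1_p2)))
print(dict(zip(ASPECTS, features_p1_p3)))
```

Because every position in the vector corresponds to a named aspect, a tree-based model trained on such vectors could in principle be read as rules of the form "similarity in aspect X above some threshold", which is the kind of explainability the abstract contrasts with opaque embeddings.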
Pages: 14
Related papers
50 in total
  • [21] A GRAPH-BASED SEMANTIC SIMILARITY MEASURE FOR THE GENE ONTOLOGY
    Alvarez, Marco A.
    Yan, Changhui
    JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2011, 9 (06) : 681 - 695
  • [22] Graph-Based Spatial Proximity of Super-Resolved Protein-Protein Interactions Predicts Cancer Drug Responses in Single Cells
    Zhang, Nicholas
    Cai, Shuangyi
    Wang, Mingshuang
    Hu, Thomas
    Schneider, Frank
    Sun, Shi-Yong
    Coskun, Ahmet F.
    CELLULAR AND MOLECULAR BIOENGINEERING, 2024, 17 (05) : 467 - 490
  • [23] An iterative knowledge-based scoring function for protein-protein interactions
    Huang, Shengyou
    Zou, Xiaoqin
    BIOPHYSICAL JOURNAL, 2007, : 221A - 221A
  • [24] An End-to-End Knowledge Graph Fused Graph Neural Network for Accurate Protein-Protein Interactions Prediction
    Yang, Jie
    Li, Yapeng
    Wang, Guoyin
    Chen, Zhong
    Wu, Di
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2024, 21 (06) : 2518 - 2530
  • [25] Protein-protein interaction based on pairwise similarity
    Nazar Zaki
    Sanja Lazarova-Molnar
    Wassim El-Hajj
    Piers Campbell
    BMC Bioinformatics, 10
  • [26] Protein-protein interaction based on pairwise similarity
    Zaki, Nazar
    Lazarova-Molnar, Sanja
    El-Hajj, Wassim
    Campbell, Piers
    BMC BIOINFORMATICS, 2009, 10
  • [27] GO semantic similarity based analysis for huaman protein interactions
    Chen, Gang
    Wang, Jianxin
    Li, Min
    2009 INTERNATIONAL JOINT CONFERENCE ON BIOINFORMATICS, SYSTEMS BIOLOGY AND INTELLIGENT COMPUTING, PROCEEDINGS, 2009, : 207 - 210
  • [28] BI-GRAPPIN: Bipartite GRAph based protein-protein interaction networks similarity search
    Fionda, Valeria
    Palopoli, Luigi
    Panni, Simona
    Rombo, Simona E.
    2007 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, PROCEEDINGS, 2007, : 355 - +
  • [29] Graph Based Automatic Protein Function Annotation Improved by Semantic Similarity
    Sarker, Bishnu
    Khare, Navya
    Devignes, Marie-Dominique
    Aridhi, Sabeur
    BIOINFORMATICS AND BIOMEDICAL ENGINEERING (IWBBIO 2020), 2020, 12108 : 261 - 272
  • [30] Struct2Graph: a graph attention network for structure based predictions of protein-protein interactions
    Baranwal, Mayank
    Magner, Abram
    Saldinger, Jacob
    Turali-Emre, Emine S.
    Elvati, Paolo
    Kozarekar, Shivani
    VanEpps, J. Scott
    Kotov, Nicholas A.
    Violi, Angela
    Hero, Alfred O.
    BMC BIOINFORMATICS, 2022, 23 (01)