Explaining protein-protein interactions with knowledge graph-based semantic similarity

Cited by: 3
Authors
Sousa, Rita T. [1]
Silva, Sara [1]
Pesquita, Catia [1]
Affiliations
[1] Univ Lisbon, LASIGE, Fac Ciencias, Lisbon, Portugal
Keywords
Machine learning; Explainable artificial intelligence; Knowledge graph; Semantic similarity; Protein-protein interaction prediction
DOI
10.1016/j.compbiomed.2024.108076
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
The use of artificial intelligence and machine learning methods in biomedical applications, such as protein-protein interaction prediction, has gained significant traction in recent decades. However, explainability is a key aspect of using machine learning as a tool for scientific discovery. Explainable artificial intelligence approaches help clarify algorithmic mechanisms and identify potential bias in the data. Given the complexity of the biomedical domain, explanations should be grounded in domain knowledge, which can be achieved by using ontologies and knowledge graphs. These knowledge graphs express knowledge about a domain by capturing different perspectives of the representation of real-world entities. However, the most popular way to explore knowledge graphs with machine learning is through embeddings, which are not explainable. As an alternative, knowledge graph-based semantic similarity offers the advantage of being explainable. Additionally, similarity can be computed to capture different semantic aspects within the knowledge graph, increasing the explainability of predictive approaches. We propose a novel method to generate explainable vector representations, KGsim2vec, that uses aspect-oriented semantic similarity features to represent pairs of entities in a knowledge graph. Our approach employs a set of machine learning models, including decision trees, genetic programming, random forest and eXtreme gradient boosting, to predict relations between entities. The experiments reveal that considering multiple semantic aspects when representing the similarity between two entities improves explainability and predictive performance. KGsim2vec performs better than black-box methods based on knowledge graph embeddings or graph neural networks. Moreover, KGsim2vec produces global models that can capture biological phenomena and elucidate data biases.
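The idea of representing an entity pair by one similarity score per semantic aspect, then fitting an interpretable model on those scores, can be illustrated with a small sketch. This is not the authors' implementation: the annotations, aspect names, and labels are hypothetical toy data, Jaccard overlap stands in for whatever semantic similarity measures KGsim2vec actually uses, and a single-threshold decision stump stands in for the decision trees and other models named in the abstract.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy annotations: protein -> semantic aspect -> ontology terms.
ANNOTATIONS: Dict[str, Dict[str, Set[str]]] = {
    "P1": {"process": {"t1", "t2"}, "function": {"f1"}, "component": {"c1", "c2"}},
    "P2": {"process": {"t1", "t2", "t3"}, "function": {"f1"}, "component": {"c1"}},
    "P3": {"process": {"t4"}, "function": {"f2"}, "component": {"c3"}},
    "P4": {"process": {"t4", "t5"}, "function": {"f2", "f3"}, "component": {"c1"}},
}
ASPECTS: List[str] = ["process", "function", "component"]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set overlap, used here as a stand-in similarity measure (assumption)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(p: str, q: str) -> List[float]:
    """One similarity value per aspect, so each feature keeps a readable meaning."""
    return [jaccard(ANNOTATIONS[p][s], ANNOTATIONS[q][s]) for s in ASPECTS]


def fit_stump(X: List[List[float]], y: List[int]) -> Tuple[int, float, float]:
    """Return the (aspect index, threshold, training accuracy) of the best
    rule of the form 'predict 1 if feature >= threshold'."""
    best = (0, 0.0, -1.0)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            hits = sum((row[j] >= t) == bool(label) for row, label in zip(X, y))
            acc = hits / len(y)
            if acc > best[2]:
                best = (j, t, acc)
    return best


# Hypothetical labelled pairs: 1 = interacting, 0 = not interacting.
pairs = [("P1", "P2"), ("P3", "P4"), ("P1", "P3"), ("P2", "P4")]
labels = [1, 1, 0, 0]
X = [pair_features(p, q) for p, q in pairs]

aspect, threshold, accuracy = fit_stump(X, labels)
print(f"interact if {ASPECTS[aspect]} similarity >= {threshold:.2f} "
      f"(training accuracy {accuracy:.2f})")
```

Because every feature is tied to a named aspect, the learned rule can be read directly as a statement about which kind of similarity drives the prediction, which is the property the abstract contrasts with embedding-based representations.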
Pages: 14