PEDL: extracting protein-protein associations using deep language models and distant supervision

Cited by: 6
Authors
Weber, Leon [1,2]
Thobe, Kirsten [2]
Lozano, Oscar Arturo Migueles [2]
Wolf, Jana [2]
Leser, Ulf [1]
Affiliations
[1] Humboldt Univ, Comp Sci Dept, D-10099 Berlin, Germany
[2] Max Delbruck Ctr Mol Med, Grp Math Modelling Cellular Proc, Helmholtz Assoc, D-13125 Berlin, Germany
Keywords
CYCLOOXYGENASE-2; EXPRESSION; NETWORK; COMPLEX; CELLS
DOI
10.1093/bioinformatics/btaa430
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: A significant portion of molecular biology investigates signalling pathways and thus depends on an up-to-date and complete resource of functional protein-protein associations (PPAs) that constitute such pathways. Despite extensive curation efforts, major pathway databases are still notoriously incomplete. Relation extraction can help to gather such pathway information from biomedical publications. Current methods for extracting PPAs typically rely exclusively on rare manually labelled data, which severely limits their performance.
Results: We propose PPA Extraction with Deep Language (PEDL), a method for predicting PPAs from text that combines deep language models and distant supervision. Due to the reliance on distant supervision, PEDL has access to an order of magnitude more training data than methods solely relying on manually labelled annotations. We introduce three different datasets for PPA prediction and evaluate PEDL for the two subtasks of predicting PPAs between two proteins, as well as identifying the text spans stating the PPA. We compared PEDL with a recently published state-of-the-art model and found that on average PEDL performs better in both tasks on all three datasets. An expert evaluation demonstrates that PEDL can be used to predict PPAs that are missing from major pathway databases and that it correctly identifies the text spans supporting the PPA.
Pages: 490-498
Number of pages: 9