Large-scale entity representation learning for biomedical relationship extraction

Cited by: 7
Authors
Saenger, Mario [1]
Leser, Ulf [1]
Affiliations
[1] Humboldt Univ, Comp Sci Dept, Knowledge Management Bioinformat, D-10099 Berlin, Germany
Keywords
TOOL;
DOI
10.1093/bioinformatics/btaa674
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Motivation: The automatic extraction of published relationships between molecular entities has important applications in many biomedical fields, ranging from Systems Biology to Personalized Medicine. Existing works focused on extracting relationships described in single articles or in single sentences. However, a single record is rarely sufficient to judge the biological correctness of a relation, as experimental evidence might be weak or only valid in a certain context. Furthermore, statements may be more speculative than confirmative, and different articles often contradict each other. Experts therefore always take the complete literature into account to reach a reliable decision about a relationship. It is an open research question how to do this effectively in an automatic manner.
Results: We propose two novel relation extraction approaches which use recent representation learning techniques to create comprehensive models of biomedical entities or entity pairs, respectively. These representations are learned by considering all publications from PubMed mentioning an entity or a pair. They are used as input for a neural network for classifying relations globally, i.e. the derived predictions are corpus-based, not sentence- or article-based as in prior art. Experiments on the extraction of mutation-disease, drug-disease and drug-drug relationships show that the learned embeddings indeed capture semantic information of the entities under study and outperform traditional methods by 4-29% regarding F1 score.
Pages: 236-242
Page count: 7