Pattern-based bootstrapping framework for biomedical relation extraction

Cited by: 11
Authors
Deepika, S. S. [1]
Geetha, T. V. [1]
Affiliations
[1] Anna Univ, Dept Comp Sci, CEG, Chennai, Tamil Nadu, India
Keywords
Biomedical relation extraction; Pattern-based bootstrapping; Semi-supervised learning; Dependency parsing; Drug-target-disease;
DOI
10.1016/j.engappai.2020.104130
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Advances in '-omics' technologies have led to a tremendous increase in the volume of published biomedical research. Extracting information from this huge mass of unstructured text requires automation through text mining methods. Biomedical relation extraction is one such vital automated process, which extracts the biomedical relations hidden in scientific literature. In the recent past, several supervised machine learning methods have been used to identify biomedical relations. However, given the variation in textual expression, the huge corpus size, and the small amount of task-specific training data, semi-supervised techniques appear to perform better. To this end, we propose a system that uses a semi-supervised bootstrapping algorithm to extract biomedical relations from text. The unlabelled corpus consists of sentences containing biomedical entities, represented as patterns with dependency tree features. Bootstrapping starts with a seed set and iteratively learns new patterns from the unlabelled corpus. We have designed a three-level masking technique to generate new patterns and incorporated three types of scoring to help select appropriate patterns. The pattern-based bootstrapping method performs well with a minimal seed set. The system extracts 37,450 patterns, representing different biomedical relations, from the unlabelled corpus. These patterns, in turn, identify 460,886 relation pairs with 1,327 single and 1,012 coupled trigger words that convey the semantics of the biomedical relation. More than 64% of the identified relations have evidence in the CTD database.
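
The bootstrapping loop described in the abstract (start from a seed set, iteratively learn new patterns from an unlabelled corpus) can be illustrated with a minimal sketch. This is not the authors' implementation: the function `bootstrap`, the toy corpus, the placeholder entity names, and the single overlap-based score are all assumptions made for illustration. The paper's three-level masking and its three scoring types are not reproduced here, and plain strings stand in for dependency-tree patterns.

```python
from collections import defaultdict

def bootstrap(corpus, seed_pairs, min_score=0.5, max_iters=10):
    """Generic pattern-based bootstrapping (illustrative sketch only).

    corpus: iterable of (entity_a, pattern, entity_b) triples.
    seed_pairs: set of (entity_a, entity_b) pairs known to be related.
    Returns the accepted patterns and all relation pairs found.
    """
    # Index the unlabelled corpus by pattern.
    pairs_by_pattern = defaultdict(set)
    for a, pattern, b in corpus:
        pairs_by_pattern[pattern].add((a, b))

    known_pairs = set(seed_pairs)
    accepted = set()
    for _ in range(max_iters):
        newly_accepted = set()
        for pattern, pairs in pairs_by_pattern.items():
            if pattern in accepted:
                continue
            # Assumed score: share of the pattern's pairs already known.
            score = len(pairs & known_pairs) / len(pairs)
            if score >= min_score:
                newly_accepted.add(pattern)
        if not newly_accepted:
            break  # no new patterns learned, so stop iterating
        accepted |= newly_accepted
        # Pairs matched by accepted patterns feed the next iteration.
        for pattern in newly_accepted:
            known_pairs |= pairs_by_pattern[pattern]
    return accepted, known_pairs

# Hypothetical toy data; entity names and patterns are placeholders.
CORPUS = [
    ("drugA", "inhibits", "geneP"),
    ("drugB", "inhibits", "geneQ"),
    ("drugB", "blocks", "geneQ"),
    ("drugC", "blocks", "geneR"),
    ("drugD", "co-occurs", "geneS"),
]
SEED = {("drugA", "geneP")}

patterns, pairs = bootstrap(CORPUS, SEED)
```

With this toy input, a single seed pair is enough to accept the first pattern, and the pairs it matches then support a second pattern in the next iteration, while a pattern that shares no pairs with the known set is never accepted. A real system would need stricter scoring to limit the semantic drift that such a permissive threshold allows.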
Pages: 11