Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach

Cited by: 10
Authors
Ratkovic, Zorana [1, 2]
Golik, Wiktoria [1]
Warnier, Pierre [1, 3]
Affiliations
[1] MIG INRA UR1077 Domaine Vilvert, F-78352 Jouy En Josas, France
[2] Univ Paris 03, CNRS, UMR 8094, LaTTiCe, F-92120 Montrouge, France
[3] Univ Grenoble 1, LIG, F-38400 St Martin Dheres, France
Source
BMC BIOINFORMATICS | 2012 / Volume 13
Keywords
TEXT;
DOI
10.1186/1471-2105-13-S11-S8
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Bacteria biotopes cover a wide range of diverse habitats including animal and plant hosts, natural, medical and industrial environments. The high volume of publications in the microbiology domain provides a rich source of up-to-date information on bacteria biotopes. This information, as found in scientific articles, is expressed in natural language and is rarely available in a structured format, such as a database. This information is of great importance for fundamental research and microbiology applications (e.g., medicine, agronomy, food, bioenergy). The automatic extraction of this information from texts will provide a great benefit to the field.
Methods: We present a new method for extracting relationships between bacteria and their locations using the Alvis framework. Recognition of bacteria and their locations was achieved using a pattern-based approach and domain lexical resources. For the detection of environment locations, we propose a new approach that combines lexical information and the syntactic-semantic analysis of corpus terms to overcome the incompleteness of lexical resources. Bacteria location relations extend over sentence borders, and we developed domain-specific rules for dealing with bacteria anaphors.
Results: We participated in the BioNLP 2011 Bacteria Biotope (BB) task with the Alvis system. Official evaluation results show that it achieves the best performance of participating systems. New developments since then have increased the F-score by 4.1 points.
Conclusions: We have shown that the combination of semantic analysis and domain-adapted resources is both effective and efficient for event information extraction in the bacteria biotope domain. We plan to adapt the method to deal with a larger set of location types and a large-scale scientific article corpus to enable microbiologists to integrate and use the extracted knowledge in combination with experimental data.
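
The Methods summary above mentions lexicon-driven recognition of bacteria and locations, plus rules for anaphors that carry a bacterium across sentence borders. The following is only a minimal sketch of that general idea, not the Alvis system: the lexicons, the anaphor list, and the function names (`find_terms`, `extract_localizations`) are all hypothetical, and plain substring matching stands in for the pattern-based and syntactic-semantic analysis the abstract describes.

```python
import re

# Hypothetical toy lexicons; real domain resources would be far larger.
BACTERIA = {"Bifidobacterium longum", "Escherichia coli"}
LOCATIONS = {"human gut", "soil", "milk"}
# Hypothetical anaphoric expressions that may refer back to a bacterium.
ANAPHORS = {"this bacterium", "the bacterium", "the organism"}


def find_terms(sentence, lexicon):
    """Return lexicon entries found in the sentence (case-insensitive substring match)."""
    lowered = sentence.lower()
    return [term for term in sorted(lexicon) if term.lower() in lowered]


def extract_localizations(text):
    """Pair locations with a bacterium named in, or carried into, each sentence."""
    relations = []
    last_bacterium = None
    # Naive sentence split on whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        bacteria = find_terms(sentence, BACTERIA)
        if bacteria:
            # Remember one bacterium from this sentence for later anaphors.
            last_bacterium = bacteria[-1]
        elif last_bacterium and find_terms(sentence, ANAPHORS):
            # Simplified anaphor rule: reuse the most recently seen bacterium.
            bacteria = [last_bacterium]
        for bacterium in bacteria:
            for location in find_terms(sentence, LOCATIONS):
                relations.append((bacterium, location))
    return relations


if __name__ == "__main__":
    sample = (
        "Bifidobacterium longum is a commensal species. "
        "This bacterium colonizes the human gut."
    )
    print(extract_localizations(sample))
```

In this sketch the second sentence names no bacterium, so the anaphor rule links "This bacterium" back to the bacterium from the first sentence. Substring matching would also produce false positives on longer words that contain a lexicon entry, which is one reason a real system would rely on tokenization and term analysis instead.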
Pages: 11