Moara: a Java library for extracting and normalizing gene and protein mentions

Cited by: 16
Authors
Neves, Mariana L. [1]
Carazo, Jose-Maria [1]
Pascual-Montano, Alberto [1, 2]
Affiliations
[1] CSIC, CNB, BioComp Unit, Natl Biotechnol Ctr, Madrid, Spain
[2] CSIC, IMMPA, Madrid, Spain
Source
BMC Bioinformatics | 2010 / Vol. 11
Keywords
TEXT; NOMENCLATURE; LISTS;
DOI
10.1186/1471-2105-11-157
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Background: Gene/protein recognition and normalization are important preliminary steps for many biological text mining tasks, such as information retrieval, extraction of protein-protein interactions, and extraction of semantic information, among others. Although considerable effort has been devoted to these problems and effective solutions have been reported, easily integrated tools to perform these tasks are not readily available. Results: This study proposes a versatile and trainable Java library that implements gene/protein tagging and normalization steps based on machine learning approaches. The system has been trained for several model organisms and corpora but can be extended to support new organisms and documents. Conclusions: Moara is a flexible, trainable and open-source system that is not specifically oriented to any organism and therefore does not require specific tuning of the algorithms or dictionaries used. Moara can be used as a stand-alone application or can be incorporated into the workflow of a more general text mining system.
Pages: 13