A text-mining system for extracting metabolic reactions from full-text articles

Cited by: 21
Authors
Czarnecki, Jan [1, 2]
Nobeli, Irene [1, 2]
Smith, Adrian M. [3]
Shepherd, Adrian J. [1, 2]
Affiliations
[1] Univ London, Dept Biol Sci, London WC1E 7HX, England
[2] Univ London, Inst Mol & Struct Biol, London WC1E 7HX, England
[3] Unilever R&D, Sharnbrook MK44 1LG, Beds, England
Source
BMC BIOINFORMATICS | 2012 / Vol. 13
Funding
Biotechnology and Biological Sciences Research Council (UK);
Keywords
PROTEIN-PROTEIN INTERACTIONS; INFORMATION EXTRACTION; MANUAL CURATION; NETWORKS; CORPUS; IDENTIFICATION; DATABASE; PARSE;
DOI
10.1186/1471-2105-13-172
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Background: Increasingly biological text mining research is focusing on the extraction of complex relationships relevant to the construction and curation of biological networks and pathways. However, one important category of pathway - metabolic pathways - has been largely neglected. Here we present a relatively simple method for extracting metabolic reaction information from free text that scores different permutations of assigned entities (enzymes and metabolites) within a given sentence based on the presence and location of stemmed keywords. This method extends an approach that has proved effective in the context of the extraction of protein-protein interactions.

Results: When evaluated on a set of manually-curated metabolic pathways using standard performance criteria, our method performs surprisingly well. Precision and recall rates are comparable to those previously achieved for the well-known protein-protein interaction extraction task.

Conclusions: We conclude that automated metabolic pathway construction is more tractable than has often been assumed, and that (as in the case of protein-protein interaction extraction) relatively simple text-mining approaches can prove surprisingly effective. It is hoped that these results will provide an impetus to further research and act as a useful benchmark for judging the performance of more sophisticated methods that are yet to be developed.
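
The abstract describes the scoring idea only in outline, so the following is a rough sketch of what "scoring permutations of assigned entities by the presence and location of stemmed keywords" might look like, not the authors' implementation. The keyword list, the weights, the toy stemmer, the ordering bonus, and the function names (`stem`, `score_candidate`, `rank_candidates`) are all assumptions made for illustration.

```python
from itertools import permutations

# Hypothetical keyword stems and weights (not taken from the paper).
KEYWORD_STEMS = {"convert": 1.0, "catalys": 1.0, "produc": 0.8}


def stem(token):
    """Crude suffix stripper; a stand-in for a real stemmer such as Porter's."""
    token = token.lower()
    for suffix in ("ing", "es", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            return token[: -len(suffix)]
    return token


def score_candidate(tokens, enzyme_idx, substrate_idx, product_idx):
    """Score one role assignment from keyword presence and location."""
    lo = min(enzyme_idx, substrate_idx, product_idx)
    hi = max(enzyme_idx, substrate_idx, product_idx)
    score = 0.0
    for pos, tok in enumerate(tokens):
        weight = KEYWORD_STEMS.get(stem(tok))
        if weight is None:
            continue
        # Assumed rule: a keyword inside the span covered by the candidate's
        # entities counts fully; one outside the span counts a quarter.
        score += weight if lo <= pos <= hi else 0.25 * weight
    # Assumed rule: small bonus when the substrate is mentioned first.
    if substrate_idx < product_idx:
        score += 0.5
    return score


def rank_candidates(tokens, entities):
    """entities maps token index -> 'enzyme' or 'metabolite' (pre-assigned)."""
    enzymes = [i for i, kind in entities.items() if kind == "enzyme"]
    metabolites = [i for i, kind in entities.items() if kind == "metabolite"]
    ranked = []
    for enz in enzymes:
        # Each ordered pair of metabolites is one (substrate, product) permutation.
        for sub, prod in permutations(metabolites, 2):
            score = score_candidate(tokens, enz, sub, prod)
            ranked.append((score, tokens[enz], tokens[sub], tokens[prod]))
    return sorted(ranked, key=lambda item: item[0], reverse=True)


tokens = "Hexokinase converts glucose into glucose-6-phosphate".split()
entities = {0: "enzyme", 2: "metabolite", 4: "metabolite"}
ranked = rank_candidates(tokens, entities)
for score, enzyme, substrate, product in ranked:
    print(f"{score:.2f}  {enzyme}: {substrate} -> {product}")
```

In this toy sentence the two permutations differ only in which metabolite is treated as the substrate, so the ordering bonus decides the ranking; a real system would presumably rely on a richer keyword set and on entity recognition performed upstream.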
Pages: 14