Information extraction from full text scientific articles: Where are the keywords?

Cited by: 116
Authors
Shah, PK
Perez-Iratxeta, C
Bork, P [1]
Andrade, MA
Affiliations
[1] European Mol Biol Lab, Heidelberg, Germany
[2] Max Delbruck Ctr Mol Med, Dept Bioinformat, Berlin, Germany
DOI
10.1186/1471-2105-4-20
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: To date, many of the methods for information extraction of biological information from scientific articles are restricted to the abstract of the article. However, full text articles in electronic version, which offer larger sources of data, are currently available. Several questions arise as to whether the effort of scanning full text articles is worthy, or whether the information that can be extracted from the different sections of an article can be relevant. Results: In this work we addressed those questions showing that the keyword content of the different sections of a standard scientific article (abstract, introduction, methods, results, and discussion) is very heterogeneous. Conclusions: Although the abstract contains the best ratio of keywords per total of words, other sections of the article may be a better source of biologically relevant data.
Pages: 9