Challenges and Advances in Information Extraction from Scientific Literature: a Review

Cited by: 21
Authors
Hong, Zhi [1]
Ward, Logan [2]
Chard, Kyle [1,2]
Blaiszik, Ben [1,2]
Foster, Ian [1,2]
Affiliations
[1] Univ Chicago, Chicago, IL 60637 USA
[2] Argonne Natl Lab, Lemont, IL USA
Keywords
Information extraction; Text mining; Scientific data; PROPERTY DATA; RECOGNITION; GENERATION; RECAPTCHA; STANDARD; SYSTEM; WEB;
DOI
10.1007/s11837-021-04902-9
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Scientific articles have long been the primary means of disseminating scientific discoveries. Over the centuries, valuable data and potentially groundbreaking insights have been collected and buried deep in the mountain of publications. In materials engineering, such data are spread across technical handbooks, specification sheets, journal articles, and laboratory notebooks in myriad formats. Extracting information from papers on a large scale has been a tedious and time-consuming job to which few researchers have wanted to devote their limited time and effort, yet it is an activity that is essential for modern data-driven design practices. In recent years, the computer science community has made significant progress on techniques for automated information extraction from free text. Yet transformative application of these techniques to scientific literature remains elusive, due not to a lack of interest or effort but to technical and logistical challenges. Using the challenges in the materials science literature as a driving motivation, we review the gaps between state-of-the-art information extraction methods and the practical application of such methods to scientific texts, and offer a comprehensive overview of work that can be undertaken to close these gaps.
Pages: 3383-3400
Number of pages: 18
Related papers
50 in total
  • [21] Extraction of biological interaction networks from scientific literature
    Skusa, A
    Rüegg, A
    Köhler, A
    [J]. BRIEFINGS IN BIOINFORMATICS, 2005, 6 (03) : 263 - 276
  • [22] Automated Chemical Reaction Extraction from Scientific Literature
    Guo, Jiang
    Ibanez-Lopez, A. Santiago
    Gao, Hanyu
    Quach, Victor
    Coley, Connor W.
    Jensen, Klavs F.
    Barzilay, Regina
    [J]. JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2022, 62 (09) : 2035 - 2045
  • [23] HyperPIE: Hyperparameter Information Extraction from Scientific Publications
    Saier, Tarek
    Ohta, Mayumi
    Asakura, Takuto
    Faerber, Michael
    [J]. ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT II, 2024, 14609 : 254 - 269
  • [24] Advances and Prospects of Information Extraction from Point Clouds
    [J]. Lin, Xiangguo (linxiangguo@casm.ac.cn), 1600, SinoMaps Press (46):
  • [25] The extraction of useful information from the biomedical literature
    Kostoff, R
    [J]. ACADEMIC MEDICINE, 2001, 76 (12) : 1265 - 1270
  • [26] Policy challenges to community energy in the EU: A systematic review of the scientific literature
    Busch, Henner
    Ruggiero, Salvatore
    Isakovic, Aljosa
    Hansen, Teis
    [J]. RENEWABLE & SUSTAINABLE ENERGY REVIEWS, 2021, 151
  • [27] Review of Knowledge Elements Extraction in Scientific Literature Based on Deep Learning
    Li, Guangjian
    Yuan, Yue
    [J]. Data Analysis and Knowledge Discovery, 2023, 7 (07) : 1 - 17
  • [28] Investigations on Scientific Literature Meta Information Extraction Using Large Language Models
    Guo, Menghao
    Wu, Fan
    Jiang, Jinling
    Yan, Xiaoran
    Chen, Guangyong
    Li, Wenhui
    Zhao, Yunhong
    Sun, Zeyi
    [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON KNOWLEDGE GRAPH, ICKG, 2023, : 249 - 254
  • [29] ADVANCES IN INFORMATION EXTRACTION TECHNIQUES
    NAGY, G
    [J]. REMOTE SENSING OF ENVIRONMENT, 1984, 15 (02) : 167 - 175
  • [30] Challenges of vaccination information system implementation: A systematic literature review
    Rahmadhan, Muhamad Adhytia Wana Putra
    Handayani, Putu Wuri
    [J]. HUMAN VACCINES & IMMUNOTHERAPEUTICS, 2023, 19 (02)