Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques

Cited by: 4
Authors
Qinjun Qiu
Zhong Xie
Liang Wu
Liufeng Tao
Affiliations
[1] China University of Geosciences, School of Geography and Information Engineering
[2] National Engineering Research Center of Geographic Information System
Source
Earth Science Informatics | 2020 / Vol. 13
Keywords
Geoscience document; Knowledge graph; Geological text mining; Natural language processing
DOI
Not available
Abstract
A large amount of georeferenced quantitative data about rock and geoscience surveys is buried in geological documents and remains unused. Data analytics and information extraction offer opportunities to use this data to improve our understanding of ore-forming processes. Extracting spatiotemporal and semantic information from a set of geological documents enables us to develop a rich representation of the geoscience knowledge recorded in unstructured text written in Chinese. This paper presents a workflow for spatiotemporal and semantic information extraction, a geological document analysis approach that uses automated techniques for browsing and searching relevant geological content. The workflow applies spatial and temporal gazetteer matching, pattern-based rules, and spatiotemporal relationship extraction to identify and label terms in geological text documents. It represents contextual information in knowledge graph form, extracts a set of relevant tables and figures, and retrieves a list of relevant documents by using geological topic information. Text mining techniques are used to facilitate the analysis of geological knowledge and to show the effectiveness of text analysis for the rapid assessment of a massive number of documents. Furthermore, autogenerated keyword suggestions derived from extracted keyword associations are used to reduce document search effort. This research illustrates the usefulness and effectiveness of the developed information extraction workflow and demonstrates the potential of incorporating text mining and NLP techniques into geoscience.
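The gazetteer matching, pattern-based rule, and relationship extraction steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the gazetteer entries, the age pattern, the `dated_to` relation name, and the same-sentence linking heuristic are all assumptions made for illustration, and English toy text stands in for the Chinese reports the paper targets.

```python
import re

# Hypothetical toy gazetteers; the paper's actual gazetteers are not given here.
SPATIAL_GAZETTEER = {"Tarim Basin", "Qinling", "Yangtze Block"}
TEMPORAL_GAZETTEER = {"Permian", "Triassic", "Late Jurassic"}

# Assumed pattern-based rule for numeric ages such as "252 Ma".
AGE_PATTERN = re.compile(r"\b\d+(?:\.\d+)?\s?(?:Ma|Ga|ka)\b")


def match_gazetteer(text, gazetteer, label):
    """Return (start, end, surface, label) spans, preferring longer terms."""
    spans = []
    for term in sorted(gazetteer, key=len, reverse=True):
        for m in re.finditer(re.escape(term), text):
            # Skip a match that overlaps a span already accepted.
            if not any(m.start() < e and s < m.end() for s, e, _, _ in spans):
                spans.append((m.start(), m.end(), m.group(), label))
    return spans


def extract(text):
    """Label spatial terms, temporal terms, and numeric ages in text."""
    spans = match_gazetteer(text, SPATIAL_GAZETTEER, "LOCATION")
    spans += match_gazetteer(text, TEMPORAL_GAZETTEER, "TIME")
    spans += [(m.start(), m.end(), m.group(), "AGE")
              for m in AGE_PATTERN.finditer(text)]
    return sorted(spans)


def cooccurrence_triples(text):
    """Link each location to each time or age found in the same sentence."""
    triples = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        entities = extract(sentence)
        locations = [e[2] for e in entities if e[3] == "LOCATION"]
        times = [e[2] for e in entities if e[3] in ("TIME", "AGE")]
        triples += [(loc, "dated_to", t) for loc in locations for t in times]
    return triples


if __name__ == "__main__":
    sample = ("Granite in the Qinling belt formed at 252 Ma. "
              "The Tarim Basin subsided during the Permian.")
    for triple in cooccurrence_triples(sample):
        print(triple)
```

Triples of this kind could serve as edges in a knowledge graph, with locations and times as nodes. Same-sentence co-occurrence is a deliberately crude stand-in for the spatiotemporal relationship extraction the abstract describes.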
Pages: 1393 - 1410
Page count: 17
Related papers
50 in total
  • [31] Semantic Information in Medical Information Systems: Utilization of Text Mining Techniques to Analyze Medical Diagnoses
    Holzinger, Andreas
    Geierhofer, Regina
    Moedritscher, Felix
    Tatzl, Roland
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2008, 14 (22) : 3781 - 3795
  • [32] Optimal dynamic treatment regime estimation using information extraction from unstructured clinical text
    Zhou, Nina
    Brook, Robert D.
    Dinov, Ivo D.
    Wang, Lu
    BIOMETRICAL JOURNAL, 2022, 64 (04) : 805 - 817
  • [33] Using text mining techniques to extract phenotypic information from the PhenoCHF corpus
    Noha Alnazzawi
    Paul Thompson
    Riza Batista-Navarro
    Sophia Ananiadou
    BMC Medical Informatics and Decision Making, 15
  • [34] Using text mining techniques to extract phenotypic information from the PhenoCHF corpus
    Alnazzawi, Noha
    Thompson, Paul
    Batista-Navarro, Riza
    Ananiadou, Sophia
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2015, 15
  • [35] Vulcan: Automatic extraction and analysis of cyber threat intelligence from unstructured text
    Jo, Hyeonseong
    Lee, Yongjae
    Shin, Seungwon
    COMPUTERS & SECURITY, 2022, 120
  • [36] Evaluation of information retrieval and text mining tools on automatic named entity extraction
    Kumar, Nishant
    De Beer, Jan
    Vanthienen, Jan
    Moens, Marie-Francine
    INTELLIGENCE AND SECURITY INFORMATICS, PROCEEDINGS, 2006, 3975 : 666 - 667
  • [37] Automatic topics extraction from crowdsourced cyclists near-miss and collision reports using text mining and Artificial Neural Networks
    Kwayu, Keneth Morgan
    Kwigizile, Valerian
    Lee, Kevin
    Oh, Jun-Seok
    Nelson, Trisalyn
    INTERNATIONAL JOURNAL OF TRANSPORTATION SCIENCE AND TECHNOLOGY, 2022, 11 (04) : 767 - 779
  • [38] Attention information extraction of the foreign visitors using text mining
    Tsujii, Koichi
    Fujita, Yoshikatsu
    Tsuda, Kazuhiko
    International Journal of Intelligent Systems Technologies and Applications, 2013, 12 (3-4) : 194 - 206
  • [39] Utilizing Hubel Wiesel Models for Semantic Associations and Topics Extraction from Unstructured Text
    Tiwari, Sandeep
    Ramanathan, Kiruthika
    2011 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2011, : 892 - 898
  • [40] Automatic Extraction of Semantic Relations by Using Web Statistical Information
    Borzi, Valeria
    Faro, Simone
    Pavone, Arianna
    GRAPH-BASED REPRESENTATION AND REASONING, 2014, 8577 : 174 - 187