Automatic spatiotemporal and semantic information extraction from unstructured geoscience reports using text mining techniques

Cited by: 4
Authors
Qinjun Qiu
Zhong Xie
Liang Wu
Liufeng Tao
Affiliations
[1] China University of Geosciences, School of Geography and Information Engineering
[2] National Engineering Research Center of Geographic Information System
Source
Earth Science Informatics | 2020 / Vol. 13
Keywords
Geoscience document; Knowledge graph; Geological text mining; Natural language processing
DOI
Not available
Abstract
A large amount of georeferenced quantitative data about rock and geoscience surveys is buried in geological documents and remains unused. Data analytics and information extraction offer opportunities to use this data for improved understanding of ore-forming processes and to enhance our knowledge. Extracting spatiotemporal and semantic information from a set of geological documents enables us to develop a rich representation of the geoscience knowledge recorded in unstructured text written in Chinese. This paper presents the workflow for spatiotemporal and semantic information extraction, which is a geological document analysis approach that uses automated techniques for browsing and searching relevant geological content. The developed workflow applies spatial and temporal gazetteer matching, pattern-based rules and spatiotemporal relationship extraction to identify and label terms in geological text documents. It offers a representation of contextual information in knowledge graph form, extracts a set of relevant tables and figures, and queries a list of relevant documents by using geological topic information. Here, text mining techniques are used to facilitate the analysis of geological knowledge and to show the effectiveness of text analysis for improving the rapid assessment of a massive number of documents. Furthermore, autogenerated keyword suggestions derived from extracted keyword associations are used to reduce document search efforts. This research illustrates the usefulness and effectiveness of the developed information extraction workflow and demonstrates the potential of incorporating text mining and NLP techniques for geoscience.
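The gazetteer matching and pattern-based rules mentioned in the abstract could be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the gazetteer entries, the age pattern, and the `label_terms` function are all hypothetical, and the sample sentence is English for readability even though the paper targets Chinese text.

```python
import re

# Hypothetical gazetteers; the paper's actual dictionaries are not given in this record.
SPATIAL_GAZETTEER = {"Hubei", "Wuhan", "Yangtze River"}
TEMPORAL_GAZETTEER = {"Jurassic", "Triassic", "Cretaceous"}

# Hypothetical pattern rule: numeric ages such as "150 Ma" or "2.5 Ga".
AGE_PATTERN = re.compile(r"\b\d+(?:\.\d+)?\s?(?:Ma|Ga)\b")


def label_terms(text):
    """Return (start, end, label, surface) spans sorted by position in text."""
    spans = []
    # Gazetteer matching: look up every dictionary term in the text.
    for label, gazetteer in (("SPATIAL", SPATIAL_GAZETTEER),
                             ("TEMPORAL", TEMPORAL_GAZETTEER)):
        for term in gazetteer:
            for match in re.finditer(re.escape(term), text):
                spans.append((match.start(), match.end(), label, match.group()))
    # Pattern-based rule: catch temporal expressions a gazetteer cannot enumerate.
    for match in AGE_PATTERN.finditer(text):
        spans.append((match.start(), match.end(), "TEMPORAL", match.group()))
    return sorted(spans)


if __name__ == "__main__":
    sample = "Jurassic granite near Wuhan, Hubei, was dated at 150 Ma."
    for span in label_terms(sample):
        print(span)
```

Labeled spans of this kind could then feed a later relationship-extraction or knowledge-graph step; how the paper actually links them is not described in this record.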
Pages: 1393-1410
Page count: 17