GeoNER: Geological Named Entity Recognition with Enriched Domain Pre-Training Model and Adversarial Training

Authors
MA Kai [1, 2]
HU Xinxin [1, 2]
TIAN Miao [3]
TAN Yongjian [1, 2]
ZHENG Shuai [1, 2]
TAO Liufeng [3, 4, 5]
QIU Qinjun [3, 4, 5]
Affiliations
[1] Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering, China Three Gorges University
[2] College of Computer and Information Technology, China Three Gorges University
[3] Key Laboratory of Geological Survey and Evaluation of Ministry of Education, China University of Geosciences
[4] School of Computer Science, China University of Geosciences
[5] Key Laboratory of Quantitative Resource Evaluation and Information Engineering, Ministry of Natural Resources, China University of
Abstract
As important geological data, geological reports contain rich expert and geological knowledge. The central challenge for current research on geological knowledge extraction and mining is how to interpret geological reports accurately under the guidance of domain knowledge. Generic named entity recognition models and tools can be applied to geoscience reports and documents, but their lack of domain-specific knowledge leads to a pronounced decline in recognition accuracy. This study summarizes six typical types of geological entities with reference to the ontological system of the geological domain, and builds a high-quality corpus for the task of geological named entity recognition (GNER). In addition, GeoWoBERT-advBGP (Geological Word-base BERT, adversarial training, Bi-directional Long Short-Term Memory, Global Pointer) is proposed to address the ambiguity, diversity and nesting of geological entities. The model first uses the fine-tuned, word-granularity pre-training model GeoWoBERT (Geological Word-base BERT) and combines it with text features extracted by a BiLSTM (Bi-directional Long Short-Term Memory) network. An adversarial training algorithm then improves the robustness of the model and its resistance to interference, and decoding is finally performed with a global association pointer algorithm. The experimental results show that the proposed model achieves high performance on the constructed dataset and is capable of mining rich geological information.
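The abstract names a global pointer decoder as the final step and cites nested entities as one motivation. As a rough illustration only, the sketch below shows the general idea behind span-based decoding of this kind: every (start, end) token pair gets a score per entity type, and every pair above a threshold is emitted, so overlapping spans can coexist. The function names, the labels "river" and "region", the example tokens and the scores are all hypothetical and are not taken from the paper; the paper's actual model and entity types may differ.

```python
from typing import Dict, List, Tuple

# (label, start index, end index inclusive, surface text)
Span = Tuple[str, int, int, str]


def empty_scores(n: int, fill: float = -1.0) -> List[List[float]]:
    """Build an n x n score matrix filled with a below-threshold value."""
    return [[fill] * n for _ in range(n)]


def decode_spans(
    tokens: List[str],
    scores: Dict[str, List[List[float]]],
    threshold: float = 0.0,
) -> List[Span]:
    """Return every span whose score exceeds the threshold.

    scores[label][i][j] is the (assumed) model score that tokens[i..j],
    inclusive, form an entity of type `label`. Only pairs with i <= j
    are considered, and nothing prevents two accepted spans from
    overlapping, which is what allows nested entities.
    """
    spans: List[Span] = []
    for label, matrix in scores.items():
        for start in range(len(tokens)):
            for end in range(start, len(tokens)):
                if matrix[start][end] > threshold:
                    text = " ".join(tokens[start:end + 1])
                    spans.append((label, start, end, text))
    # Sort for a deterministic order: by start, then end, then label.
    return sorted(spans, key=lambda s: (s[1], s[2], s[0]))


# Hypothetical example: a shorter entity nested inside a longer one.
tokens = ["Yangtze", "River", "basin"]
scores = {
    "river": empty_scores(len(tokens)),
    "region": empty_scores(len(tokens)),
}
scores["river"][0][1] = 2.3   # "Yangtze River"
scores["region"][0][2] = 1.7  # "Yangtze River basin"

for span in decode_spans(tokens, scores):
    print(span)
```

Because each span is scored independently, the shorter "river" span and the longer "region" span that contains it are both returned, which a single tag-per-token sequence labeler cannot represent directly.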
Pages: 14