A New Chinese Named Entity Recognition Method for Pig Disease Domain Based on Lexicon-Enhanced BERT and Contrastive Learning

Cited by: 0
Authors
Peng, Cheng [1,2,3]
Wang, Xiajun [1,4]
Li, Qifeng [1,2,3]
Yu, Qinyang [1,2,3]
Jiang, Ruixiang [1,2,3]
Ma, Weihong [1,2,3]
Wu, Wenbiao [1,2,3]
Meng, Rui [1,2,3]
Li, Haiyan [1,2,3]
Huai, Heju [1,2,3]
Wang, Shuyan [1,2,3]
He, Longjuan [5]
Affiliations
[1] Beijing Acad Agr & Forestry Sci, Informat Technol Res Ctr, Beijing 100097, Peoples R China
[2] Natl Innovat Ctr Digital Technol Anim Husb, Beijing 100097, Peoples R China
[3] Natl Engn Res Ctr Informat Technol Agr, Beijing 100097, Peoples R China
[4] Hubei Univ, Fac Resources & Environm Sci, Wuhan 430061, Peoples R China
[5] Chinese Acad Agr Sci, Inst Agr Econ & Dev, Beijing 100081, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2024 / Volume 14 / Issue 16
Keywords
pig disease; Chinese named entity recognition; lexicon-enhanced BERT; contrastive learning; small sample
DOI
10.3390/app14166944
Chinese Library Classification number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Featured Application: Our work provides reliable technical support for the information extraction of pig diseases in Chinese. It can be applied to other domain-specific fields, thereby facilitating seamless adaptation for named entity identification across diverse contexts.

Abstract: Named Entity Recognition (NER) is a fundamental and pivotal stage in the development of various knowledge-based support systems, including knowledge retrieval and question-answering systems. In the domain of pig diseases, Chinese NER models encounter several challenges, such as the scarcity of annotated data, domain-specific vocabulary, diverse entity categories, and ambiguous entity boundaries. To address these challenges, we propose PDCNER, a Pig Disease Chinese Named Entity Recognition method leveraging lexicon-enhanced BERT and contrastive learning. First, we construct a domain-specific lexicon and pre-train word embeddings in the pig disease domain. Second, we integrate lexicon information of pig diseases into the lower layers of BERT using a Lexicon Adapter layer, which employs char-word pair sequences. Third, to enhance feature representation, we propose a lexicon-enhanced contrastive loss layer on top of BERT. Finally, a Conditional Random Field (CRF) layer is employed as the model's decoder. Experimental results show that our proposed model demonstrates superior performance over several mainstream models, achieving a precision of 87.76%, a recall of 86.97%, and an F1-score of 87.36%. The proposed model outperforms BERT-BiLSTM-CRF and LEBERT by 14.05% and 6.8%, respectively, with only 10% of the samples available, showcasing its robustness in data scarcity scenarios. Furthermore, the model exhibits generalizability across publicly available datasets. Our work provides reliable technical support for the information extraction of pig diseases in Chinese and can be easily extended to other domains, thereby facilitating seamless adaptation for named entity identification across diverse contexts.
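The abstract mentions that the Lexicon Adapter layer consumes char-word pair sequences. As a rough illustration only, and not the authors' implementation, the sketch below pairs each character of a sentence with every lexicon word that covers it. The function name `build_char_word_pairs`, the `max_word_len` cutoff, and the tiny example lexicon are all assumptions made for this example; the paper's actual lexicon and matching procedure are not given in this record.

```python
from typing import List, Set, Tuple


def build_char_word_pairs(
    sentence: str, lexicon: Set[str], max_word_len: int = 4
) -> List[Tuple[str, List[str]]]:
    """Pair each character with the lexicon words whose span covers it.

    Hypothetical helper: brute-force span matching, assumed here for clarity.
    """
    matches: List[List[str]] = [[] for _ in sentence]
    for start in range(len(sentence)):
        # Try every candidate span of length 2..max_word_len starting here.
        last_end = min(start + max_word_len, len(sentence))
        for end in range(start + 2, last_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                # Attach the matched word to every character it covers.
                for pos in range(start, end):
                    matches[pos].append(word)
    return list(zip(sentence, matches))


# Hypothetical mini-lexicon (not taken from the paper).
example_lexicon = {"非洲", "猪瘟", "非洲猪瘟", "病毒"}
pairs = build_char_word_pairs("非洲猪瘟病毒", example_lexicon)
for char, words in pairs:
    print(char, words)
```

In a lexicon-enhanced model, each character embedding would then be fused with the embeddings of its matched words; that fusion step is omitted here because the record does not describe it in detail.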
Pages: 16