Self-supervised Bidirectional Prompt Tuning for Entity-enhanced Pre-trained Language Model

Cited by: 0
|
Authors
Zou, Jiaxin [1 ]
Xu, Xianghong [1 ]
Hou, Jiawei [2 ]
Yang, Qiang [2 ]
Zheng, Hai-Tao [1 ,3 ]
Affiliations
[1] Tsinghua Univ, Shenzhen Int Grad Sch, Shenzhen 518055, Peoples R China
[2] Weixin Grp, Dept Search & Applicat, Tencent, Peoples R China
[3] Pengcheng Lab, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
DOI
10.1109/IJCNN54540.2023.10192045
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the promotion of the pre-training paradigm, researchers are increasingly focusing on injecting external knowledge, such as entities and triplets from knowledge graphs, into pre-trained language models (PTMs) to improve their understanding and logical reasoning abilities. This results in significant improvements in natural language understanding and generation tasks and some level of interpretability. In this paper, we propose a novel two-stage entity knowledge enhancement pipeline for Chinese pre-trained models based on "bidirectional" prompt tuning. The pipeline consists of a "forward" stage, in which we construct fine-grained entity type prompt templates to boost PTMs injected with entity knowledge, and a "backward" stage, where the trained templates are used to generate type-constrained context-dependent negative samples for contrastive learning. Experiments on six classification tasks in the Chinese Language Understanding Evaluation (CLUE) benchmark demonstrate that our approach significantly improves upon the baseline results in most datasets, particularly those that have a strong reliance on diverse and extensive knowledge.
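The abstract describes a "forward" stage (fine-grained entity-type prompt templates tuned on an entity-enhanced PTM) and a "backward" stage (the trained templates generate type-constrained, context-dependent negatives for contrastive learning). The record does not include the authors' code; the following is a minimal, hypothetical PyTorch sketch of that two-stage idea only. A toy encoder stands in for the entity-enhanced Chinese PTM, and all names (ToyEncoder, fill_template, contrastive_loss) and the InfoNCE-style loss are assumptions, not the paper's implementation.

```python
# Hedged sketch (not the authors' code): illustrates the two-stage
# "bidirectional" prompt-tuning idea from the abstract with a toy encoder
# standing in for the entity-enhanced pre-trained language model.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyEncoder(nn.Module):
    """Stand-in for an entity-enhanced PTM; returns one vector per sequence."""

    def __init__(self, vocab_size=5000, dim=128):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab_size, dim)  # mean-pooled bag of tokens

    def forward(self, token_ids):
        return self.emb(token_ids)


def fill_template(sentence_ids, entity_ids, type_prompt_ids):
    """'Forward' stage: append a fine-grained entity-type prompt and an entity
    mention to the sentence, forming a prompt-template input."""
    return torch.cat([sentence_ids, type_prompt_ids, entity_ids], dim=-1)


def contrastive_loss(anchor, positive, negative, temperature=0.1):
    """'Backward' stage: InfoNCE-style loss over a type-consistent positive and
    a type-violating negative built from the trained templates (assumed form)."""
    anchor, positive, negative = (F.normalize(x, dim=-1)
                                  for x in (anchor, positive, negative))
    pos_sim = (anchor * positive).sum(-1) / temperature
    neg_sim = (anchor * negative).sum(-1) / temperature
    logits = torch.stack([pos_sim, neg_sim], dim=-1)
    labels = torch.zeros(anchor.size(0), dtype=torch.long)  # positive at index 0
    return F.cross_entropy(logits, labels)


if __name__ == "__main__":
    enc = ToyEncoder()
    sent = torch.randint(0, 5000, (4, 16))        # batch of sentences
    entity = torch.randint(0, 5000, (4, 2))       # correct-type entity mentions
    wrong_entity = torch.randint(0, 5000, (4, 2)) # type-constrained negatives
    prompt = torch.randint(0, 5000, (4, 3))       # learned type-prompt tokens

    anchor = enc(sent)
    positive = enc(fill_template(sent, entity, prompt))
    negative = enc(fill_template(sent, wrong_entity, prompt))
    loss = contrastive_loss(anchor, positive, negative)
    loss.backward()
    print(float(loss))
```

In this sketch the negatives differ from the positives only in the entity's type, which mirrors the abstract's claim that the backward stage supplies "type-constrained context-dependent" negatives rather than random ones.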
Pages: 8
Related Papers
50 records in total
  • [41] Pre-trained Language Model Representations for Language Generation
    Edunov, Sergey
    Baevski, Alexei
    Auli, Michael
    [J]. 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 4052 - 4059
  • [42] Satellite and instrument entity recognition using a pre-trained language model with distant supervision
    Lin, Ming
    Jin, Meng
    Liu, Yufu
    Bai, Yuqi
    [J]. INTERNATIONAL JOURNAL OF DIGITAL EARTH, 2022, 15 (01) : 1290 - 1304
  • [43] BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition
    Naseem, Usman
    Khushi, Matloob
    Reddy, Vinay
    Rajendran, Sakthivel
    Razzak, Imran
    Kim, Jinman
    [J]. 2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [44] Pruning Pre-trained Language Models Without Fine-Tuning
    Jiang, Ting
    Wang, Deqing
    Zhuang, Fuzhen
    Xie, Ruobing
    Xia, Feng
    [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 594 - 605
  • [45] Personalised soft prompt tuning in pre-trained language models: Bridging multitask transfer learning and crowdsourcing learning
    Tian, Zeshu
    Zhang, Hongli
    Wang, Yan
    [J]. KNOWLEDGE-BASED SYSTEMS, 2024, 305
  • [46] Named-Entity Recognition for a Low-resource Language using Pre-Trained Language Model
    Yohannes, Hailemariam Mehari
    Amagasa, Toshiyuki
    [J]. 37TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, 2022, : 837 - 844
  • [47] Adder Encoder for Pre-trained Language Model
    Ding, Jianbang
    Zhang, Suiyun
    Li, Linlin
    [J]. CHINESE COMPUTATIONAL LINGUISTICS, CCL 2023, 2023, 14232 : 339 - 347
  • [48] Span Fine-tuning for Pre-trained Language Models
    Bao, Rongzhou
    Zhang, Zhuosheng
    Zhao, Hai
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1970 - 1979
  • [49] Interpretability of Speech Emotion Recognition modelled using Self-Supervised Speech and Text Pre-Trained Embeddings
    Girish, K. V. Vijay
    Konjeti, Srikanth
    Vepa, Jithendra
    [J]. INTERSPEECH 2022, 2022, : 4496 - 4500
  • [50] A Survey of Knowledge Enhanced Pre-Trained Language Models
    Hu, Linmei
    Liu, Zeyi
    Zhao, Ziwang
    Hou, Lei
    Nie, Liqiang
    Li, Juanzi
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (04) : 1413 - 1430