Virtual prompt pre-training for prototype-based few-shot relation extraction

Cited by: 26
Authors
He, Kai [1 ,2 ]
Huang, Yucheng [1 ,2 ]
Mao, Rui [3 ]
Gong, Tieliang [1 ,2 ]
Li, Chen [1 ,2 ]
Cambria, Erik [3 ,4 ]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Comp Sci & Technol, Xian, Peoples R China
[2] Shanxi Prov Key Lab Satellite & Terr Network Techn, Xian, Peoples R China
[3] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
[4] Nanyang Technol Univ, Sch Comp Sci & Engn, 50 Nanyang Ave,Block N4 02a, Singapore 639798, Singapore
Funding
National Natural Science Foundation of China;
Keywords
Few-shot learning; Information extraction; Prompt tuning; Pre-trained Language Model; SENTIMENT ANALYSIS; LANGUAGE MODELS;
DOI
10.1016/j.eswa.2022.118927
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Prompt tuning with pre-trained language models (PLMs) has exhibited outstanding performance by reducing the gap between pre-training tasks and various downstream applications, but it requires additional manual effort in label word mapping and prompt template engineering. In a label-intensive research domain such as few-shot relation extraction (RE), manually defining label word mappings is particularly challenging, because the number of relation classes, often with complex relation names, can be extremely large. Moreover, manual prompt development in natural language is subjective and varies across individuals. To tackle these issues, we propose a virtual prompt pre-training method that projects the virtual prompt into a latent space and then fuses it with the PLM parameters. The pre-training is entity- and relation-aware for RE, comprising the tasks of masked entity prediction, entity typing, distantly supervised RE, and contrastive prompt pre-training. The proposed pre-training method provides a robust initialization for prompt encoding while maintaining the interaction with the PLM. Furthermore, the virtual prompt avoids the manual effort and subjectivity involved in label word mapping and prompt template engineering. Our prompt-based prototype network delivers a novel learning paradigm that models entities and relations via the probability distributions and Euclidean distances between the predictions of query instances and the prototypes. The results indicate that our model yields an average accuracy gain of 4.21% over strong RE baselines on two few-shot datasets. Built on the proposed framework, our pre-trained model outperforms the strongest RE-related PLM by 6.52%.
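The prototype-based scoring summarized in the abstract can be sketched as follows: class prototypes are the mean support embeddings per relation, and a query is classified by a softmax over its negative squared Euclidean distances to those prototypes. The snippet below is a generic prototypical-network sketch under assumed episode sizes, shapes, and function names, not the authors' released implementation; a random tensor stands in for the PLM-derived embeddings.

```python
# Minimal sketch of prototype-based few-shot classification with Euclidean
# distance (generic prototypical-network style); NOT the paper's implementation.
# Episode sizes, embedding dimension, and function names are assumptions.
import torch
import torch.nn.functional as F


def prototype_logits(support_emb, support_labels, query_emb, n_classes):
    """Average the support embeddings per class into prototypes, then score
    each query by its negative squared Euclidean distance to every prototype."""
    prototypes = torch.stack(
        [support_emb[support_labels == c].mean(dim=0) for c in range(n_classes)]
    )  # (n_classes, dim)
    dists = torch.cdist(query_emb, prototypes)  # (n_query, n_classes), L2 distance
    return -dists.pow(2)  # larger logit = closer prototype


if __name__ == "__main__":
    torch.manual_seed(0)
    n_way, k_shot, n_query, dim = 5, 1, 3, 768  # a 5-way 1-shot episode (assumed sizes)
    support = torch.randn(n_way * k_shot, dim)  # stand-in for PLM-derived relation embeddings
    labels = torch.arange(n_way).repeat_interleave(k_shot)
    queries = torch.randn(n_query, dim)

    logits = prototype_logits(support, labels, queries, n_way)
    probs = F.softmax(logits, dim=-1)  # probability distribution over the N relations
    print(probs.argmax(dim=-1))  # predicted relation index for each query
```

In the paper's setting, the support and query embeddings would come from the virtual-prompt-augmented PLM rather than random tensors.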
Pages: 11
Related papers
50 records in total
  • [1] Synergistic Anchored Contrastive Pre-training for Few-Shot Relation Extraction
    Luo, Da
    Gan, Yanglei
    Hou, Rui
    Lin, Run
    Liu, Qiao
    Cai, Yuxiang
    Gao, Wannian
    [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 17, 2024, : 18742 - 18750
  • [2] Multitask Pre-training of Modular Prompt for Chinese Few-Shot Learning
    Sun, Tianxiang
    He, Zhengfu
    Zhu, Qin
    Qiu, Xipeng
    Huang, Xuanjing
    [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 11156 - 11172
  • [3] A lightweight approach based on prompt for few-shot relation extraction
    Zhang, Ying
    Huang, Wencheng
    Dang, Depeng
    [J]. COMPUTER SPEECH AND LANGUAGE, 2024, 84
  • [4] Effectiveness of Pre-training for Few-shot Intent Classification
    Zhang, Haode
    Zhang, Yuwei
    Zhan, Li-Ming
    Chen, Jiaxin
    Shi, Guangyuan
    Wu, Xiao-Ming
    Lam, Albert Y. S.
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1114 - 1120
  • [5] Unified Multi-modal Pre-training for Few-shot Sentiment Analysis with Prompt-based Learning
    Yu, Yang
    Zhang, Dong
    Li, Shoushan
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022,
  • [6] Continual Few-Shot Relation Extraction with Prompt-Based Contrastive Learning
    Wu, Fei
    Zhang, Chong
    Tan, Zhen
    Xu, Hao
    Ge, Bin
    [J]. WEB AND BIG DATA, PT IV, APWEB-WAIM 2023, 2024, 14334 : 312 - 327
  • [7] Few-Shot Dataset Distillation via Translative Pre-Training
    Liu, Songhua
    Wang, Xinchao
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 18608 - 18618
  • [8] Consistent Prototype Learning for Few-Shot Continual Relation Extraction
    Chen, Xiudi
    Wu, Hui
    Shi, Xiaodong
    [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 7409 - 7422
  • [9] Label Semantic Aware Pre-training for Few-shot Text Classification
    Mueller, Aaron
    Krone, Jason
    Romeo, Salvatore
    Mansour, Saab
    Mansimov, Elman
    Zhang, Yi
    Roth, Dan
    [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 8318 - 8334
  • [10] A Prototype Network Enhanced Relation Semantic Representation for Few-shot Relation Extraction
    He, Haitao
    Niu, Haoran
    Feng, Jianzhou
    Wang, Qian
    Wei, Qikai
    [J]. Human-Centric Intelligent Systems, 2023, 3 (1): : 1 - 12