Integrating Task Specific Information into Pretrained Language Models for Low Resource Fine Tuning

Cited by: 0
Authors
Wang, Rui [1 ]
Si, Shijing [1 ]
Wang, Guoyin [1 ,2 ]
Zhang, Lei [3 ]
Carin, Lawrence [1 ]
Henao, Ricardo [1 ]
Affiliations
[1] Duke Univ, Durham, NC 27706 USA
[2] Amazon Alexa AI, Cambridge, MA USA
[3] Fidelity Investments, Raleigh, NC USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Pretrained Language Models (PLMs) have improved the performance of natural language understanding in recent years. Such models are pretrained on large corpora, which encode general prior knowledge of natural language but are agnostic to information characteristic of downstream tasks. This often results in overfitting when fine-tuning on low-resource datasets where task-specific information is limited. In this paper, we integrate label information as a task-specific prior into the self-attention component of pretrained BERT models. Experiments on several benchmarks and real-world datasets suggest that the proposed approach can substantially improve the performance of pretrained models when fine-tuning on small datasets. The code repository is released at https://github.com/RayWangWR/BERT_label_embedding.
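For illustration only, the sketch below shows one simplified way that label embeddings could serve as a task-specific prior over BERT-style token representations. It is not the authors' released implementation (see the repository linked above), and it applies label attention on top of the encoder output rather than modifying BERT's internal self-attention layers as the paper does; names such as LabelAttentionPooler and num_labels are illustrative assumptions.

```python
# Hypothetical sketch (not the authors' released code): label embeddings act as
# attention queries over BERT-style hidden states, yielding label-aware pooled
# representations and per-label logits.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelAttentionPooler(nn.Module):
    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        # One learnable embedding per class label, used as an attention query.
        self.label_embeddings = nn.Embedding(num_labels, hidden_size)
        self.scale = hidden_size ** 0.5

    def forward(self, hidden_states: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden); attention_mask: (batch, seq_len), 1 = real token.
        queries = self.label_embeddings.weight                      # (num_labels, hidden)
        scores = torch.einsum("bsh,lh->bls", hidden_states, queries) / self.scale
        scores = scores.masked_fill(attention_mask[:, None, :] == 0, float("-inf"))
        weights = F.softmax(scores, dim=-1)                         # per-label attention over tokens
        label_reprs = torch.einsum("bls,bsh->blh", weights, hidden_states)
        # Score each label against its own attended sentence representation.
        logits = (label_reprs * queries.unsqueeze(0)).sum(dim=-1)   # (batch, num_labels)
        return logits

# Example usage with any encoder that returns BERT-like hidden states:
# pooler = LabelAttentionPooler(hidden_size=768, num_labels=3)
# logits = pooler(encoder_hidden_states, attention_mask)
```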
Pages: 6
Related Papers
50 items in total
  • [41] Phased Instruction Fine-Tuning for Large Language Models
    Pang, Wei
    Zhou, Chuan
    Zhou, Xiao-Hua
    Wang, Xiaojie
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 5735 - 5748
  • [42] Improve Performance of Fine-tuning Language Models with Prompting
    Yang, Zijian Gyozo
    Ligeti-Nagy, Noemi
    INFOCOMMUNICATIONS JOURNAL, 2023, 15 : 62 - 68
  • [43] HackMentor: Fine-Tuning Large Language Models for Cybersecurity
    Zhang, Jie
    Wen, Hui
    Deng, Liting
    Xin, Mingfeng
    Li, Zhi
    Li, Lun
    Zhu, Hongsong
    Sun, Limin
    2023 IEEE 22ND INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS, TRUSTCOM, BIGDATASE, CSE, EUC, ISCI 2023, 2024, : 452 - 461
  • [44] Fine-tuning language models to recognize semantic relations
    Roussinov, Dmitri
    Sharoff, Serge
    Puchnina, Nadezhda
    LANGUAGE RESOURCES AND EVALUATION, 2023, 57 (04) : 1463 - 1486
  • [46] Fine-Tuning Language Models with Just Forward Passes
    Malladi, Sadhika
    Gao, Tianyu
    Nichani, Eshaan
    Damian, Alex
    Lee, Jason D.
    Chen, Danqi
    Arora, Sanjeev
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [47] AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages
    Ebrahimi, Abteen
    Mager, Manuel
    Oncevay, Arturo
    Chaudhary, Vishrav
    Chiruzzo, Luis
    Fan, Angela
    Ortega, John E.
    Ramos, Ricardo
    Rios, Annette
    Meza-Ruiz, Ivan
    Gimenez-Lugo, Gustavo A.
    Mager, Elisabeth
    Neubig, Graham
    Palmer, Alexis
    Coto-Solano, Rolando
    Vu, Ngoc Thang
    Kann, Katharina
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 6279 - 6299
  • [48] COMBINING CONTRASTIVE AND NON-CONTRASTIVE LOSSES FOR FINE-TUNING PRETRAINED MODELS IN SPEECH ANALYSIS
    Lux, Florian
    Chen, Ching-Yi
    Vu, Ngoc Thang
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 876 - 883
  • [49] IIITT at CASE 2021 Task 1: Leveraging Pretrained Language Models for Multilingual Protest Detection
    Jada, Pawan Kalyan
    Reddy, Duddukunta Sashidhar
    Hande, Adeep
    Priyadharshini, Ruba
    Sakuntharaj, Ratnasingam
    Chakravarthi, Bharathi Raja
    CASE 2021: THE 4TH WORKSHOP ON CHALLENGES AND APPLICATIONS OF AUTOMATED EXTRACTION OF SOCIO-POLITICAL EVENTS FROM TEXT (CASE), 2021, : 98 - 104
  • [50] Clinical information extraction for lower-resource languages and domains with few-shot learning using pretrained language models and prompting
    Richter-Pechanski, Phillip
    Wiesenbach, Philipp
    Schwab, Dominic Mathias
    Kiriakou, Christina
    Geis, Nicolas
    Dieterich, Christoph
    Frank, Anette
    NATURAL LANGUAGE PROCESSING, 2024,