Task-guided Disentangled Tuning for Pretrained Language Models

Citations: 0
Authors
Zeng, Jiali [1 ]
Jiang, Yufan [1 ]
Wu, Shuangzhi [1 ]
Yin, Yongjing [2 ]
Li, Mu [1 ]
Affiliations
[1] Tencent Cloud Xiaowei, Beijing, Peoples R China
[2] Zhejiang Univ, Westlake Univ, Hangzhou, Zhejiang, Peoples R China
DOI: Not available
Chinese Library Classification: TP18 (Artificial Intelligence Theory)
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Pretrained language models (PLMs) trained on large-scale unlabeled corpora are typically fine-tuned on task-specific downstream datasets, which has produced state-of-the-art results on various NLP tasks. However, the discrepancy between pretraining and downstream data in domain and scale prevents fine-tuning from efficiently capturing task-specific patterns, especially in the low-data regime. To address this issue, we propose Task-guided Disentangled Tuning (TDT) for PLMs, which enhances the generalization of representations by disentangling task-relevant signals from the entangled representations. For a given task, we introduce a learnable confidence model to detect indicative guidance from context, and further propose a disentangled regularization to mitigate the over-reliance problem. Experimental results on the GLUE and CLUE benchmarks show that TDT consistently outperforms standard fine-tuning across different PLMs, and extensive analysis demonstrates the effectiveness and robustness of our method. Code is available at https://github.com/lemon0830/TDT.
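The two ingredients named in the abstract — a learnable confidence model that scores how task-indicative each context token is, and a regularizer that discourages over-reliance on a handful of tokens — might be sketched as below. This is an illustrative reconstruction only, not the authors' implementation (see the linked repository for that): the linear scorer, the softmax weighting, and the KL-to-uniform penalty are all assumptions made here for concreteness.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def confidence_weights(token_reprs, w):
    """Score each context token's task relevance with a (hypothetical)
    linear confidence model, then normalize to a distribution.

    token_reprs: (seq_len, hidden) array of token representations.
    w:           (hidden,) learnable scoring vector.
    """
    scores = token_reprs @ w          # (seq_len,) relevance logits
    return softmax(scores)            # confidence distribution over tokens

def disentangled_reg(weights, alpha=0.1):
    """Penalize over-reliance on a few tokens by pulling the confidence
    distribution toward uniform (a KL-divergence penalty, assumed here)."""
    seq_len = weights.shape[0]
    uniform = np.full(seq_len, 1.0 / seq_len)
    kl = np.sum(weights * np.log(weights / uniform + 1e-12))
    return alpha * kl
```

In training, the confidence distribution would reweight the context representation fed to the task head, and the regularizer would be added to the task loss so the model cannot collapse onto a few spurious cues.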
Pages: 3126-3137 (12 pages)