Task-guided Disentangled Tuning for Pretrained Language Models

Cited by: 0
Authors
Zeng, Jiali [1 ]
Jiang, Yufan [1 ]
Wu, Shuangzhi [1 ]
Yin, Yongjing [2 ]
Li, Mu [1 ]
Affiliations
[1] Tencent Cloud Xiaowei, Beijing, People's Republic of China
[2] Zhejiang University; Westlake University, Hangzhou, Zhejiang, People's Republic of China
Keywords: (none listed)
DOI: not available
CLC classification: TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
Pretrained language models (PLMs) trained on large-scale unlabeled corpora are typically fine-tuned on task-specific downstream datasets, which has produced state-of-the-art results on various NLP tasks. However, the discrepancy between pretraining and downstream data in domain and scale makes fine-tuning inefficient at capturing task-specific patterns, especially in the low-data regime. To address this issue, we propose Task-guided Disentangled Tuning (TDT) for PLMs, which improves the generalization of representations by disentangling task-relevant signals from the entangled representations. For a given task, we introduce a learnable confidence model to detect indicative guidance from the context, and further propose a disentangled regularization to mitigate the over-reliance problem. Experimental results on the GLUE and CLUE benchmarks show that TDT consistently outperforms vanilla fine-tuning across different PLMs, and extensive analysis demonstrates the effectiveness and robustness of our method. Code is available at https://github.com/lemon0830/TDT.
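The abstract only outlines the mechanism, so the following is a minimal PyTorch sketch of one plausible reading: a learnable token-level confidence module picks out indicative context, and a consistency-style regularizer discourages over-reliance on a few high-confidence tokens. The module names (TokenConfidence, DisentangledTuningHead), the pooling choice, and the KL-based regularization term are illustrative assumptions, not the authors' actual formulation; see the linked repository for the official implementation.

```python
# Illustrative sketch only; NOT the authors' TDT implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TokenConfidence(nn.Module):
    """Learnable confidence model: scores how indicative each token is for the task."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, 1),
        )

    def forward(self, hidden_states, attention_mask):
        # hidden_states: (batch, seq_len, hidden); attention_mask: (batch, seq_len)
        logits = self.scorer(hidden_states).squeeze(-1)         # (batch, seq_len)
        logits = logits.masked_fill(attention_mask == 0, -1e4)  # ignore padding
        return torch.softmax(logits, dim=-1)                    # confidence distribution over tokens

class DisentangledTuningHead(nn.Module):
    """Classifier fed both the standard [CLS] vector and a confidence-weighted sentence vector."""
    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.confidence = TokenConfidence(hidden_size)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden_states, attention_mask, labels=None, reg_weight=0.1):
        conf = self.confidence(hidden_states, attention_mask)            # (batch, seq_len)
        guided = torch.bmm(conf.unsqueeze(1), hidden_states).squeeze(1)  # task-guided vector
        pooled = hidden_states[:, 0]                                     # standard [CLS] vector

        logits_full = self.classifier(pooled)
        logits_guided = self.classifier(guided)

        loss = None
        if labels is not None:
            ce = F.cross_entropy(logits_full, labels) + F.cross_entropy(logits_guided, labels)
            # Assumed "disentangled" regularization: keep the two views consistent so the
            # model does not over-rely on a handful of high-confidence tokens.
            reg = F.kl_div(F.log_softmax(logits_guided, dim=-1),
                           F.softmax(logits_full, dim=-1), reduction="batchmean")
            loss = ce + reg_weight * reg
        return logits_full, loss
```

In this reading, the PLM encoder's hidden states would be passed to DisentangledTuningHead in place of the usual classification head, and the learned confidence distribution can be inspected to see which context tokens the model treats as task-indicative.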
Pages: 3126-3137
Page count: 12