Prefix-Propagation: Parameter-Efficient Tuning for Long Sequences

Cited by: 0
Authors
Li, Jonathan [2 ]
Aitken, Will [1 ,2 ]
Bhambhoria, Rohan [1 ,2 ]
Zhu, Xiaodan [1 ,2 ]
Affiliations
[1] Queens Univ, Dept Elect & Comp Engn, Kingston, ON, Canada
[2] Queens Univ, Ingenu Labs Res Inst, Kingston, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Parameter-efficient tuning aims to mitigate the large memory requirements of adapting pretrained language models for downstream tasks. For example, one popular method, prefix-tuning (Li and Liang, 2021; Liu et al., 2022), prepends trainable tokens to sequences while freezing the rest of the model's parameters. Although such methods attain performance comparable to fine-tuning on sequences of short to moderate length, we show that they perform markedly worse when modelling long sequences. To bridge this gap, we propose prefix-propagation, a simple but effective approach that conditions prefixes on previous hidden states. We empirically demonstrate that prefix-propagation outperforms prefix-tuning across long-document tasks while using ~50% fewer parameters. To further investigate the proposed architecture, we also show its advantage in calibration and study its relationship with kernel attention. To the best of our knowledge, this work is the first to focus on parameter-efficient learning for long-sequence language tasks.
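The abstract contrasts prefix-tuning (prepending trainable tokens while the backbone stays frozen) with prefix-propagation (conditioning prefixes on previous hidden states). Below is a minimal PyTorch sketch of the propagation idea only, not the authors' released code; ToyLayer, prefix_len, and all other names are illustrative assumptions, and the toy attention block stands in for a frozen pretrained layer.

import torch
import torch.nn as nn

class ToyLayer(nn.Module):
    """A single self-attention block, a stand-in for one frozen transformer layer."""
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.attn(x, x, x)
        return self.norm(x + out)

class PrefixPropagation(nn.Module):
    """Prepend trainable prefixes at the first layer, then let their hidden
    states propagate through attention; later layers only add a trainable
    correction to those propagated prefix states. One prefix matrix per layer
    (instead of separate key and value prefixes, as in prefix-tuning) is a
    plausible reading of the paper's ~50% parameter reduction."""
    def __init__(self, n_layers: int, d_model: int, prefix_len: int):
        super().__init__()
        self.layers = nn.ModuleList(ToyLayer(d_model) for _ in range(n_layers))
        for p in self.layers.parameters():  # freeze the backbone
            p.requires_grad = False
        self.prefixes = nn.ParameterList(
            nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)
            for _ in range(n_layers)
        )
        self.prefix_len = prefix_len

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b = x.size(0)
        for i, layer in enumerate(self.layers):
            p = self.prefixes[i].unsqueeze(0).expand(b, -1, -1)
            if i == 0:
                x = torch.cat([p, x], dim=1)  # prepend the prefix once
            else:
                # condition the propagated prefix states on this layer's prefix
                x = torch.cat([x[:, :self.prefix_len] + p,
                               x[:, self.prefix_len:]], dim=1)
            x = layer(x)
        return x[:, self.prefix_len:]  # strip prefix positions before output

model = PrefixPropagation(n_layers=2, d_model=32, prefix_len=4)
h = model(torch.randn(1, 10, 32))  # output keeps the original sequence length

In this sketch only self.prefixes receives gradients, so the trainable footprint scales with n_layers * prefix_len * d_model regardless of backbone size.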
Pages: 1408-1419
Page count: 12
Related Papers
(50 records in total)
  • [1] Towards Adaptive Prefix Tuning for Parameter-Efficient Language Model Fine-tuning
    Zhang, Zhen-Ru
    Tan, Chuanqi
    Xu, Haiyang
    Wang, Chengyu
    Huang, Jun
    Huang, Songfang
61ST CONFERENCE OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 2, 2023, : 1239 - 1248
  • [2] The Power of Scale for Parameter-Efficient Prompt Tuning
    Lester, Brian
    Al-Rfou, Rami
    Constant, Noah
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3045 - 3059
  • [3] Parameter-Efficient Tuning with Special Token Adaptation
    Yang, Xiaocong
    Huang, James Y.
    Zhou, Wenxuan
    Chen, Muhao
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 865 - 872
  • [4] On the Effectiveness of Parameter-Efficient Fine-Tuning
    Fu, Zihao
    Yang, Haoran
    So, Anthony Man-Cho
    Lam, Wai
    Bing, Lidong
    Collier, Nigel
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 11, 2023, : 12799 - 12807
  • [5] Stochastic Bridges as Effective Regularizers for Parameter-Efficient Tuning
    Chen, Weize
    Han, Xu
    Lin, Yankai
    Liu, Zhiyuan
    Sun, Maosong
    Zhou, Jie
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 10400 - 10420
  • [6] Exploring the Impact of Model Scaling on Parameter-efficient Tuning
    Su, Yusheng
    Chan, Chi-Min
    Chen, Jiali
    Qin, Yujia
    Lin, Yankai
    Hu, Shengding
    Yang, Zonghan
    Ding, Ning
    Sun, Xingzhi
    Xu, Guotong
    Liu, Zhiyuan
    Sun, Maosong
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 15062 - 15078
  • [7] Prompt tuning for parameter-efficient medical image segmentation
    Fischer, Marc
    Bartler, Alexander
    Yang, Bin
    MEDICAL IMAGE ANALYSIS, 2024, 91
  • [8] A Gradient Control Method for Backdoor Attacks on Parameter-Efficient Tuning
    Gu, Naibin
    Fu, Peng
    Liu, Xiyu
    Liu, Zhengxiao
    Lin, Zheng
    Wang, Weiping
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 3508 - 3520
  • [9] A Unified Continual Learning Framework with General Parameter-Efficient Tuning
    Gao, Qiankun
    Zhao, Chen
    Sun, Yifan
    Xi, Teng
    Zhang, Gang
    Ghanem, Bernard
    Zhang, Jian
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 11449 - 11459
  • [10] Frozen Weights as Prior for Parameter-Efficient Fine-Tuning
    Ma, Xiaolong
    Liu, Peishun
    Gao, Haojie
    Yan, Zikang
    Ma, Ningning
    Liu, Wenqiang
    Wang, Xuefang
    Tang, Ruichun
    IEEE ACCESS, 2025, 13 : 24411 - 24425