Prefix-Propagation: Parameter-Efficient Tuning for Long Sequences

Cited by: 0
Authors
Li, Jonathan [2 ]
Aitken, Will [1 ,2 ]
Bhambhoria, Rohan [1 ,2 ]
Zhu, Xiaodan [1 ,2 ]
Affiliations
[1] Queens Univ, Dept Elect & Comp Engn, Kingston, ON, Canada
[2] Queens Univ, Ingenu Labs Res Inst, Kingston, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords: none listed
DOI: not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Parameter-efficient tuning aims to mitigate the large memory requirements of adapting pretrained language models for downstream tasks. For example, one popular method, prefix-tuning (Li and Liang, 2021; Liu et al., 2022), prepends trainable tokens to sequences while freezing the rest of the model's parameters. Although such models attain performance comparable to fine-tuning on sequences of short to moderate length, we show that they perform worse when modelling long sequences. To bridge this gap, we propose prefix-propagation, a simple but effective approach that conditions prefixes on previous hidden states. We empirically demonstrate that prefix-propagation outperforms prefix-tuning across long-document tasks while using ~50% fewer parameters. To further investigate the proposed architecture, we also show its advantage in calibration and perform an additional study on its relationship with kernel attention. To the best of our knowledge, this work is the first to focus on parameter-efficient learning for long-sequence language tasks.
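The contrast drawn in the abstract can be made concrete with a toy sketch. The following single-head PyTorch snippet is not the authors' released implementation; the tensor shapes, the `attn` helper, and the simplified one-layer propagation rule are illustrative assumptions based only on the abstract's description. Prefix-tuning prepends static trainable key/value prefixes to attention, while prefix-propagation prepends one trainable prefix to the hidden states themselves, so attention updates it, conditioning it on the sequence and roughly halving the per-layer prefix parameters.

```python
# Minimal sketch (illustrative assumptions, not the authors' code) contrasting
# prefix-tuning with prefix-propagation on a single self-attention step.
import torch
import torch.nn.functional as F

d, p, seq = 64, 8, 128            # toy hidden size, prefix length, sequence length
h = torch.randn(1, seq, d)        # hidden states from the frozen backbone

def attn(q, k, v):
    # Plain scaled dot-product attention; real models use multi-head projections.
    w = F.softmax(q @ k.transpose(-2, -1) / d ** 0.5, dim=-1)
    return w @ v

# --- Prefix-tuning: two static trainable prefixes per layer (2 * p * d params).
# They enter attention as extra keys/values but are never updated by it.
pk = torch.nn.Parameter(torch.randn(1, p, d))   # trainable key prefix
pv = torch.nn.Parameter(torch.randn(1, p, d))   # trainable value prefix
out_pt = attn(h, torch.cat([pk, h], dim=1), torch.cat([pv, h], dim=1))
assert out_pt.shape == (1, seq, d)              # prefix positions emit no output

# --- Prefix-propagation: one trainable prefix per layer (p * d params, ~50% of
# prefix-tuning). Concatenated to the hidden states, it is itself attended over
# and rewritten, so the propagated prefix is conditioned on previous hidden
# states and carried forward to the next layer.
prefix = torch.nn.Parameter(torch.randn(1, p, d))
x = torch.cat([prefix, h], dim=1)               # prefix rides along the sequence
out_pp = attn(x, x, x)                          # first p rows: updated prefix states
assert out_pp.shape == (1, p + seq, d)
```

The point of the sketch is only the source of the parameter saving and the conditioning: prefix-tuning's prefixes are constants fed into attention, whereas prefix-propagation's prefix states are attention outputs. The paper's full per-layer formulation and training setup (frozen backbone, gradients only on prefix parameters) are not reproduced here.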
Pages: 1408-1419
Page count: 12