Prefix-Propagation: Parameter-Efficient Tuning for Long Sequences

Cited by: 0
Authors
Li, Jonathan [2 ]
Aitken, Will [1 ,2 ]
Bhambhoria, Rohan [1 ,2 ]
Zhu, Xiaodan [1 ,2 ]
Affiliations
[1] Queens Univ, Dept Elect & Comp Engn, Kingston, ON, Canada
[2] Queens Univ, Ingenu Labs Res Inst, Kingston, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords: none listed
DOI: not available
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Parameter-efficient tuning aims to mitigate the large memory requirements of adapting pretrained language models for downstream tasks. For example, one popular method, prefix-tuning (Li and Liang, 2021; Liu et al., 2022), prepends trainable tokens to sequences while freezing the rest of the model's parameters. Although such models attain performance comparable to fine-tuning on sequences of short to moderate length, we show that they perform worse when modelling long sequences. To bridge this gap, we propose prefix-propagation, a simple but effective approach that conditions prefixes on previous hidden states. We empirically demonstrate that prefix-propagation outperforms prefix-tuning across long-document tasks while using ~50% fewer parameters. To further investigate the proposed architecture, we also show its advantage in calibration and perform an additional study on its relationship with kernel attention. To the best of our knowledge, this work is the first to focus on parameter-efficient learning for long-sequence language tasks.
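The contrast drawn in the abstract can be made concrete with a toy sketch. The following single-head PyTorch snippet is not the authors' released implementation; the tensor shapes, the `attn` helper, and the simplified one-layer propagation rule are illustrative assumptions based only on the abstract's description. Prefix-tuning prepends static trainable key/value prefixes to attention, while prefix-propagation prepends one trainable prefix to the hidden states themselves, so attention updates it, conditioning it on the sequence and roughly halving the per-layer prefix parameters.

```python
# Minimal sketch (illustrative assumptions, not the authors' code) contrasting
# prefix-tuning with prefix-propagation on a single self-attention step.
import torch
import torch.nn.functional as F

d, p, seq = 64, 8, 128            # toy hidden size, prefix length, sequence length
h = torch.randn(1, seq, d)        # hidden states from the frozen backbone

def attn(q, k, v):
    # Plain scaled dot-product attention; real models use multi-head projections.
    w = F.softmax(q @ k.transpose(-2, -1) / d ** 0.5, dim=-1)
    return w @ v

# --- Prefix-tuning: two static trainable prefixes per layer (2 * p * d params).
# They enter attention as extra keys/values but are never updated by it.
pk = torch.nn.Parameter(torch.randn(1, p, d))   # trainable key prefix
pv = torch.nn.Parameter(torch.randn(1, p, d))   # trainable value prefix
out_pt = attn(h, torch.cat([pk, h], dim=1), torch.cat([pv, h], dim=1))
assert out_pt.shape == (1, seq, d)              # prefix positions emit no output

# --- Prefix-propagation: one trainable prefix per layer (p * d params, ~50% of
# prefix-tuning). Concatenated to the hidden states, it is itself attended over
# and rewritten, so the propagated prefix is conditioned on previous hidden
# states and carried forward to the next layer.
prefix = torch.nn.Parameter(torch.randn(1, p, d))
x = torch.cat([prefix, h], dim=1)               # prefix rides along the sequence
out_pp = attn(x, x, x)                          # first p rows: updated prefix states
assert out_pp.shape == (1, p + seq, d)
```

The point of the sketch is only the source of the parameter saving and the conditioning: prefix-tuning's prefixes are constants fed into attention, whereas prefix-propagation's prefix states are attention outputs. The paper's full per-layer formulation and training setup (frozen backbone, gradients only on prefix parameters) are not reproduced here.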
Pages: 1408-1419
Page count: 12