On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation

Cited by: 0
Authors
He, Ruidan [1 ]
Liu, Linlin [1 ,2 ]
Ye, Hai [3 ]
Tan, Qingyu [1 ,3 ]
Ding, Bosheng [1 ,2 ]
Cheng, Liying [1 ,4 ]
Low, Jia-Wei [1 ,2 ]
Bing, Lidong [1 ]
Si, Luo [1 ]
Affiliations
[1] Alibaba Group, DAMO Academy, Hangzhou, People's Republic of China
[2] Nanyang Technological University, Singapore
[3] National University of Singapore, Singapore
[4] Singapore University of Technology and Design, Singapore
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Adapter-based tuning has recently arisen as an alternative to fine-tuning. It works by adding lightweight adapter modules to a pretrained language model (PrLM) and updating only the parameters of the adapter modules when learning a downstream task. As such, it adds only a few trainable parameters per new task, allowing a high degree of parameter sharing. Prior studies have shown that adapter-based tuning often achieves results comparable to fine-tuning. However, existing work focuses only on the parameter-efficiency of adapter-based tuning and lacks further investigation of its effectiveness. In this paper, we study the latter. We first show that adapter-based tuning mitigates forgetting better than fine-tuning, since it yields representations that deviate less from those generated by the initial PrLM. We then empirically compare the two tuning methods on several downstream NLP tasks and settings, and demonstrate that 1) adapter-based tuning outperforms fine-tuning on low-resource and cross-lingual tasks, and 2) it is more robust to overfitting and less sensitive to changes in learning rates.
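To make the mechanism described in the abstract concrete, the sketch below shows a minimal bottleneck adapter (down-projection, non-linearity, up-projection, residual connection) and a helper that freezes every parameter except the adapters. This is an illustrative PyTorch sketch, not the paper's exact configuration: the hidden/bottleneck sizes, the name-based parameter filter, and the helper name freeze_non_adapter_params are assumptions.

```python
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project,
    plus a residual connection so the module starts near identity.
    Sizes are illustrative assumptions."""

    def __init__(self, hidden_size: int = 768, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.up = nn.Linear(bottleneck_size, hidden_size)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection: output = input + adapter transformation.
        return x + self.up(self.act(self.down(x)))


def freeze_non_adapter_params(model: nn.Module) -> None:
    """Hypothetical helper: keep only adapter parameters trainable,
    assuming adapter modules carry 'adapter' in their parameter names."""
    for name, param in model.named_parameters():
        param.requires_grad = "adapter" in name.lower()
```

In adapter-based tuning, modules like the one above are inserted into each Transformer layer of the PrLM, and only they (together with the task head) are updated on the downstream task, which is what keeps the number of trainable parameters per task small.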
Pages: 2208-2222
Number of pages: 15
Related Papers
50 records in total
  • [1] Adapter-Based Contextualized Meta Embeddings
    O'Neill, James
    Dutta, Sourav
    GENERALIZING FROM LIMITED RESOURCES IN THE OPEN WORLD, GLOW-IJCAI 2024, 2024, 2160 : 82 - 90
  • [2] Utilization of pre-trained language models for adapter-based knowledge transfer in software engineering
    Saberi, Iman
    Fard, Fatemeh
    Chen, Fuxiang
    EMPIRICAL SOFTWARE ENGINEERING, 2024, 29 (04)
  • [3] Adapter-Based Incremental Learning for Face Forgery Detection
    Gao, Caili
    Xu, Qisheng
    Qiao, Peng
    Xu, Kele
    Qian, Xifu
    Dou, Yong
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 4690 - 4694
  • [4] Geographic Adaptation of Pretrained Language Models
    Hofmann, Valentin
    Glavas, Goran
    Ljubesic, Nikola
    Pierrehumbert, Janet B.
    Schuetze, Hinrich
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2024, 12 : 411 - 431
  • [5] Adapter-based fine-tuning of pre-trained multilingual language models for code-mixed and code-switched text classification
    Rathnayake, Himashi
    Sumanapala, Janani
    Rukshani, Raveesha
    Ranathunga, Surangika
    KNOWLEDGE AND INFORMATION SYSTEMS, 2022, 64 (07) : 1937 - 1966
  • [6] Adapter-Based Extension of Multi-Speaker Text-To-Speech Model for New Speakers
    Hsieh, Cheng-Ping
    Ghosh, Subhankar
    Ginsburg, Boris
    INTERSPEECH 2023, 2023, : 3028 - 3032
  • [7] Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model
    Li, Juntao
    He, Ruidan
    Ye, Hai
    Ng, Hwee Tou
    Bing, Lidong
    Yan, Rui
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 3672 - 3678
  • [8] Multitask Fine Tuning on Pretrained Language Model for Retrieval-Based Question Answering in Automotive Domain
    Luo, Zhiyi
    Yan, Sirui
    Luo, Shuyun
    MATHEMATICS, 2023, 11 (12)
  • [9] An Adapter-Based Approach to Co-evolve Generated SQL in Model-to-Text Transformations
    Garcia, Jokin
    Diaz, Oscar
    Cabot, Jordi
    ADVANCED INFORMATION SYSTEMS ENGINEERING (CAISE 2014), 2014, 8484 : 518 - 532