On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation

Cited by: 0
Authors
He, Ruidan [1]
Liu, Linlin [1,2]
Ye, Hai [3]
Tan, Qingyu [1,3]
Ding, Bosheng [1,2]
Cheng, Liying [1,4]
Low, Jia-Wei [1,2]
Bing, Lidong [1]
Si, Luo [1]
Affiliations
[1] DAMO Academy, Alibaba Group, Hangzhou, China
[2] Nanyang Technological University, Singapore
[3] National University of Singapore, Singapore
[4] Singapore University of Technology and Design, Singapore
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Adapter-based tuning has recently arisen as an alternative to fine-tuning. It works by adding light-weight adapter modules to a pretrained language model (PrLM) and only updating the parameters of adapter modules when learning on a downstream task. As such, it adds only a few trainable parameters per new task, allowing a high degree of parameter sharing. Prior studies have shown that adapter-based tuning often achieves comparable results to fine-tuning. However, existing work only focuses on the parameter-efficient aspect of adapter-based tuning while lacking further investigation on its effectiveness. In this paper, we study the latter. We first show that adapter-based tuning better mitigates forgetting issues than fine-tuning since it yields representations with less deviation from those generated by the initial PrLM. We then empirically compare the two tuning methods on several downstream NLP tasks and settings. We demonstrate that 1) adapter-based tuning outperforms fine-tuning on low-resource and cross-lingual tasks; 2) it is more robust to overfitting and less sensitive to changes in learning rates.
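As a concrete illustration of the mechanism described in the abstract, below is a minimal PyTorch-style sketch of a bottleneck adapter and of freezing everything except the adapters; the module structure, dimensions, and the name-based freezing rule are illustrative assumptions, not the authors' exact implementation.

    import torch
    import torch.nn as nn

    class Adapter(nn.Module):
        """Bottleneck adapter: down-project, non-linearity, up-project,
        plus a residual connection back to the input representation."""
        def __init__(self, hidden_size: int = 768, bottleneck_size: int = 64):
            super().__init__()
            self.down = nn.Linear(hidden_size, bottleneck_size)  # few extra parameters
            self.up = nn.Linear(bottleneck_size, hidden_size)
            self.act = nn.GELU()

        def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
            # The residual connection keeps the adapter close to an identity map,
            # so representations stay near those of the initial PrLM.
            return hidden_states + self.up(self.act(self.down(hidden_states)))

    def freeze_all_but_adapters(model: nn.Module) -> None:
        # Adapter-based tuning: PrLM weights stay frozen; only parameters whose
        # names mark them as adapters (a naming convention assumed here) are trained.
        for name, param in model.named_parameters():
            param.requires_grad = "adapter" in name.lower()

With a bottleneck size of 64 and a hidden size of 768, each adapter adds roughly 2 x 768 x 64 ≈ 0.1M parameters per layer, which is the "few trainable parameters per new task" the abstract refers to.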
Pages: 2208 - 2222
Page count: 15
Related Papers
50 records in total
  • [21] Adapter-Based Selective Knowledge Distillation for Federated Multi-Domain Meeting Summarization
    Feng, Xiachong
    Feng, Xiaocheng
    Du, Xiyuan
    Kan, Min-Yen
    Qin, Bing
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 3694 - 3708
  • [22] An Unsupervised Clinical Acronym Disambiguation Method Based on Pretrained Language Model
    Wei, Siwen
    Yuan, Chi
    Li, Zixuan
    Wang, Huaiyu
    HEALTH INFORMATION PROCESSING, CHIP 2023, 2023, 1993 : 270 - 284
  • [23] Towards a Universal CDAR Device: A High-Performance Adapter-Based Inline Media Encryptor
    Nahill, Benjamin
    Mills, Aaron
    Kiernicki, Martin
    Wilson, David A.
    Vai, Michael
    Khazan, Roger
    Sherer, John
    MILCOM 2017 - 2017 IEEE MILITARY COMMUNICATIONS CONFERENCE (MILCOM), 2017, : 689 - 694
  • [24] Exploring Adapter-based Transfer Learning for Recommender Systems: Empirical Studies and Practical Insights
    Fu, Junchen
    Yuan, Fajie
    Song, Yu
    Yuan, Zheng
    Cheng, Mingyue
    Cheng, Shenghui
    Zhang, Jiaqi
    Wang, Jie
    Pan, Yunzhu
    PROCEEDINGS OF THE 17TH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, WSDM 2024, 2024, : 208 - 217
  • [25] Distilling a Pretrained Language Model to a Multilingual ASR Model
    Choi, Kwanghee
    Park, Hyung-Min
    INTERSPEECH 2022, 2022, : 2203 - 2207
  • [26] Chinese Prosodic Structure Prediction Based on a Pretrained Language Representation Model
    Zhang P.
    Lu C.
    Wang R.
Tianjin University, 53: 265 - 271
  • [27] HHU at SemEval-2023 Task 3: An Adapter-based Approach for News Genre Classification
    Billert, Fabian
    Conrad, Stefan
    17TH INTERNATIONAL WORKSHOP ON SEMANTIC EVALUATION, SEMEVAL-2023, 2023, : 1166 - 1171
  • [28] Adapt-cMolGPT: A Conditional Generative Pre-Trained Transformer with Adapter-Based Fine-Tuning for Target-Specific Molecular Generation
    Yoo, Soyoung
    Kim, Junghyun
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2024, 25 (12)
  • [29] Pretrained Language Model Embryology: The Birth of ALBERT
    Chiang, Cheng-Han
    Huang, Sung-Feng
    Lee, Hung-Yi
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 6813 - 6828
  • [30] SCIBERT: A Pretrained Language Model for Scientific Text
    Beltagy, Iz
    Lo, Kyle
    Cohan, Arman
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 3615 - 3620