Investigating Transferability in Pretrained Language Models

Cited by: 0

Authors
Tamkin, Alex [1 ]
Singh, Trisha [1 ]
Giovanardi, Davide [1 ]
Goodman, Noah [1 ]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
Keywords: (none listed)
DOI: not available
Chinese Library Classification (CLC): TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
How does language model pretraining help transfer learning? We consider a simple ablation technique for determining the impact of each pretrained layer on transfer task performance. This method, partial reinitialization, involves replacing different layers of a pretrained model with random weights, then finetuning the entire model on the transfer task and observing the change in performance. This technique reveals that in BERT, layers with high probing performance on downstream GLUE tasks are neither necessary nor sufficient for high accuracy on those tasks. Furthermore, the benefit of using pretrained parameters for a layer varies dramatically with finetuning dataset size: parameters that provide tremendous performance improvement when data is plentiful may provide negligible benefits in data-scarce settings. These results reveal the complexity of the transfer learning process, highlighting the limitations of methods that operate on frozen models or single data samples.
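To make the ablation concrete, here is a minimal sketch of partial reinitialization under some assumptions: it uses the HuggingFace Transformers library, the bert-base-uncased checkpoint, and a simple "top-k blocks" selection scheme; the helper reinit_top_layers is illustrative and is not the authors' released code.

```python
# Minimal sketch of partial reinitialization as described in the abstract.
# Assumptions: HuggingFace Transformers, bert-base-uncased, and a top-k
# block-selection scheme; illustrative only, not the authors' code.
from transformers import BertForSequenceClassification


def reinit_top_layers(model: BertForSequenceClassification, k: int):
    """Replace the weights of the top k encoder blocks with random values,
    leaving the remaining pretrained parameters untouched."""
    for layer in model.bert.encoder.layer[-k:]:
        # apply() visits every submodule of the block; _init_weights is
        # BERT's own random initializer (normal init for Linear/Embedding,
        # reset for LayerNorm).
        layer.apply(model._init_weights)
    return model


if __name__ == "__main__":
    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2
    )
    # Wipe the pretrained weights of the top 3 of 12 transformer blocks.
    # The model would then be finetuned end-to-end on the transfer task
    # (e.g. a GLUE task); the change in accuracy relative to the fully
    # pretrained model measures the contribution of those layers.
    model = reinit_top_layers(model, k=3)
```

In the paper the ablation is applied to different subsets of layers and across finetuning dataset sizes, and the entire model is always finetuned after reinitialization; only the weight-wiping step is sketched here.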
Pages: 1393-1401
Page count: 9