Adaptable Adapters

Cited by: 0
Authors
Moosavi, Nafise Sadat [1]
Delfosse, Quentin [3]
Kersting, Kristian [2, 3]
Gurevych, Iryna [2, 4]
Affiliations
[1] Univ Sheffield, Dept Comp Sci, Sheffield, S Yorkshire, England
[2] Tech Univ Darmstadt, Hessian Ctr AI Hessian AI, Darmstadt, Germany
[3] Tech Univ Darmstadt, AI & Machine Learning Lab, Darmstadt, Germany
[4] Tech Univ Darmstadt, Ubiquitous Knowledge Proc Lab UKP Lab, Dept Comp Sci, Darmstadt, Germany
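Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract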
State-of-the-art pretrained NLP models contain from a hundred million to a trillion parameters. Adapters provide a parameter-efficient alternative to full finetuning, in which only lightweight neural network layers on top of the pretrained weights are finetuned. Adapter layers are initialized randomly. However, existing work uses the same adapter architecture (i.e., the same adapter layer on top of each layer of the pretrained model) for every dataset, regardless of the properties of the dataset or the amount of available training data. In this work, we introduce adaptable adapters, which contain (1) activation functions that are learned separately for different layers and different input data, and (2) a learnable switch that selects and uses only the beneficial adapter layers. We show that adaptable adapters achieve on-par performance with the standard adapter architecture while using a considerably smaller number of adapter layers. In addition, we show that the adapter architecture selected by adaptable adapters transfers well across different data settings and similar tasks. We propose using adaptable adapters to design efficient and effective adapter architectures. The resulting adapters (a) contain about 50% of the learnable parameters of the standard adapter, and are therefore more efficient at training and inference and require less storage space, and (b) achieve considerably higher performance in low-data settings.
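The idea of a per-layer switch that keeps only the beneficial adapter layers can be pictured with a minimal sketch. Everything below is an illustrative assumption rather than the authors' implementation: the class name `SwitchedAdapter`, the thresholded `switch_logit`, the helper `count_active_params`, and the fixed ReLU standing in for the learned activation functions are all hypothetical, and no training is performed.

```python
import random


def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng, scale=0.02):
    """Randomly initialized weights, as the abstract says adapter layers are."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class SwitchedAdapter:
    """Hypothetical bottleneck adapter with a residual connection and an on/off switch."""

    def __init__(self, hidden, bottleneck, rng, switch_logit=0.0):
        self.down = random_matrix(bottleneck, hidden, rng)  # hidden -> bottleneck
        self.up = random_matrix(hidden, bottleneck, rng)    # bottleneck -> hidden
        # Assumption: in a real system this value would be learned; here it is set by hand.
        self.switch_logit = switch_logit

    def is_active(self):
        # Assumption: a simple threshold stands in for the learnable switch.
        return self.switch_logit > 0.0

    def __call__(self, h):
        if not self.is_active():
            return list(h)  # layer skipped: the hidden state passes through unchanged
        # A fixed ReLU is used here in place of a learned activation function.
        z = [max(0.0, v) for v in matvec(self.down, h)]
        delta = matvec(self.up, z)
        return [a + b for a, b in zip(h, delta)]  # residual connection


def count_active_params(adapters):
    """Count the weights of the adapters whose switch is on."""
    return sum(
        len(a.down) * len(a.down[0]) + len(a.up) * len(a.up[0])
        for a in adapters
        if a.is_active()
    )


if __name__ == "__main__":
    rng = random.Random(0)
    # Twelve layers, with every second switch turned off by hand.
    layers = [
        SwitchedAdapter(8, 2, rng, switch_logit=1.0 if i % 2 == 0 else -1.0)
        for i in range(12)
    ]
    h = [rng.gauss(0.0, 1.0) for _ in range(8)]
    for layer in layers:
        h = layer(h)
    print(count_active_params(layers))
```

In this toy setup, switching off half of the layers halves the number of adapter weights that need to be stored and evaluated, which is the kind of saving the abstract attributes to using fewer adapter layers.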
Pages: 3742-3753
Page count: 12