Compressing Large-Scale Transformer-Based Models: A Case Study on BERT

Cited: 55
Authors
Ganesh, Prakhar [1 ]
Chen, Yao [1 ]
Lou, Xin [1 ]
Khan, Mohammad Ali [1 ]
Yang, Yin [2 ]
Sajjad, Hassan [3 ]
Nakov, Preslav [3 ]
Chen, Deming [4 ]
Winslett, Marianne [4 ]
Affiliations
[1] Adv Digital Sci Ctr, Singapore, Singapore
[2] Hamad Bin Khalifa Univ, Coll Sci & Engn, Ar Rayyan, Qatar
[3] Hamad Bin Khalifa Univ, Qatar Comp Res Inst, Ar Rayyan, Qatar
[4] Univ Illinois, Urbana, IL USA
Funding
National Research Foundation of Singapore
DOI
10.1162/tacl_a_00413
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Pre-trained Transformer-based models have achieved state-of-the-art performance on various Natural Language Processing (NLP) tasks. However, these models often have billions of parameters, and thus are too resource-hungry and computation-intensive to suit low-capability devices or applications with strict latency requirements. One potential remedy is model compression, which has attracted considerable research attention. Here, we summarize the research in compressing Transformers, focusing on the especially popular BERT model. In particular, we survey the state of the art in compression for BERT, we clarify the current best practices for compressing large-scale Transformer models, and we provide insights into the workings of various methods. Our categorization and analysis also shed light on promising future research directions for achieving lightweight, accurate, and generic NLP models.
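To make the abstract's central remedy concrete, below is a minimal sketch (not taken from the paper) of one widely studied compression technique in this survey's scope: post-training dynamic quantization of a BERT model. It assumes PyTorch and the Hugging Face transformers library are available; the checkpoint name, task head, and example sentence are illustrative placeholders.

    # Minimal sketch: post-training dynamic quantization of BERT.
    # Assumes PyTorch + Hugging Face `transformers`; the checkpoint and
    # example input below are illustrative choices, not from the paper.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model_name = "bert-base-uncased"  # placeholder checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    # Convert nn.Linear weights to int8; activations are quantized on the
    # fly at inference time, shrinking the quantized layers roughly 4x
    # (fp32 -> int8) with often modest accuracy loss.
    quantized = torch.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )

    inputs = tokenizer("Compression makes BERT fit on small devices.",
                       return_tensors="pt")
    with torch.no_grad():
        logits = quantized(**inputs).logits
    print(logits.shape)  # torch.Size([1, 2]) for the default two-label head

Quantization is only one of the method families the survey categorizes; approaches such as pruning and knowledge distillation follow the same spirit of trading a small amount of accuracy for large savings in model size and latency.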
Pages: 1061-1080 (20 pages)
Related Papers
50 items in total
  • [1] An Architecture for Accelerated Large-Scale Inference of Transformer-Based Language Models
    Ganiev, Amir; Chapin, Colt; de Andrade, Anderson; Liu, Chen
    [J]. 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, NAACL-HLT 2021, 2021: 163-169
  • [2] TRANSFORMER IN ACTION: A COMPARATIVE STUDY OF TRANSFORMER-BASED ACOUSTIC MODELS FOR LARGE SCALE SPEECH RECOGNITION APPLICATIONS
    Wang, Yongqiang; Shi, Yangyang; Zhang, Frank; Wu, Chunyang; Chan, Julian; Yeh, Ching-Feng; Xiao, Alex
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021: 6778-6782
  • [3] Transformer-based biomarker prediction from colorectal cancer histology: A large-scale multicentric study
    Wagner, Sophia J.; Reisenbuechler, Daniel; West, Nicholas P.; Niehues, Jan Moritz; Zhu, Jiefu; Foersch, Sebastian; Veldhuizen, Gregory Patrick; Quirke, Philip; Grabsch, Heike I.; van den Brandt, Piet A.; Hutchins, Gordon G. A.; Richman, Susan D.; Yuan, Tanwei; Langer, Rupert; Jenniskens, Josien C. A.; Offermans, Kelly; Mueller, Wolfram; Gray, Richard; Gruber, Stephen B.; Greenson, Joel K.; Rennert, Gad; Bonner, Joseph D.; Schmolze, Daniel; Jonnagaddala, Jitendra; Hawkins, Nicholas J.; Ward, Robyn L.; Morton, Dion; Seymour, Matthew; Magill, Laura; Nowak, Marta; Hay, Jennifer; Koelzer, Viktor H.; Church, David N.; Matek, Christian; Geppert, Carol; Peng, Chaolong; Zhi, Cheng; Ouyang, Xiaoming; James, Jacqueline A.; Loughrey, Maurice B.; Salto-Tellez, Manuel; Brenner, Hermann; Hoffmeister, Michael; Truhn, Daniel; Schnabel, Julia A.; Boxberg, Melanie; Peng, Tingying; Kather, Jakob Nikolas
    [J]. CANCER CELL, 2023, 41(09): 1650-+
  • [4] AccTFM: An Effective Intra-Layer Model Parallelization Strategy for Training Large-Scale Transformer-Based Models
    Zeng, Zihao; Liu, Chubo; Tang, Zhuo; Li, Kenli; Li, Keqin
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33(12): 4326-4338
  • [5] Compressing Transformer-Based Semantic Parsing Models using Compositional Code Embeddings
    Prakash, Prafull; Shashidhar, Saurabh Kumar; Zhao, Wenlong; Rongali, Subendhu; Khan, Haidar; Kayser, Michael
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020: 4711-4717
  • [6] Transformer-based Language Models and Homomorphic Encryption: An Intersection with BERT-tiny
    Rovida, Lorenzo; Leporati, Alberto
    [J]. PROCEEDINGS OF THE 10TH ACM INTERNATIONAL WORKSHOP ON SECURITY AND PRIVACY ANALYTICS, IWSPA 2024, 2024: 3-13
  • [7] Cascaded transformer-based networks for Wikipedia large-scale image-caption matching
    Messina, Nicola; Coccomini, Davide Alessandro; Esuli, Andrea; Falchi, Fabrizio
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83(23): 62915-62935
  • [8] UAV Cross-Modal Image Registration: Large-Scale Dataset and Transformer-Based Approach
    Xiao, Yun; Liu, Fei; Zhu, Yabin; Li, Chenglong; Wang, Futian; Tang, Jin
    [J]. ADVANCES IN BRAIN INSPIRED COGNITIVE SYSTEMS, BICS 2023, 2024, 14374: 166-176
  • [9] Large-scale chemical process causal discovery from big data with transformer-based deep learning
    Bi, Xiaotian; Wu, Deyang; Xie, Daoxiong; Ye, Huawei; Zhao, Jinsong
    [J]. PROCESS SAFETY AND ENVIRONMENTAL PROTECTION, 2023, 173: 163-177
  • [10] A lightweight Transformer-based neural network for large-scale masonry arch bridge point cloud segmentation
    Jing, Yixiong; Sheil, Brian; Acikgoz, Sinan
    [J]. COMPUTER-AIDED CIVIL AND INFRASTRUCTURE ENGINEERING, 2024, 39(16): 2427-2438