Deep Compression of Pre-trained Transformer Models

Citations: 0
Authors
Wang, Naigang [1 ]
Liu, Chi-Chun [1 ]
Venkataramani, Swagath [1 ]
Sen, Sanchari [1 ]
Chen, Chia-Yu [1 ]
El Maghraoui, Kaoutar [1 ]
Srinivasan, Vijayalakshmi [1 ]
Chang, Leland [1 ]
Affiliations
[1] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
DOI: not available
CLC Number: TP18 [Artificial Intelligence Theory]
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract
Pre-trained transformer models have achieved remarkable success in natural language processing (NLP) and have recently become competitive alternatives to Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) in vision and speech tasks, respectively. Due to their excellent computational efficiency and scalability, transformer models can be trained on exceedingly large amounts of data at the expense of tremendous growth in model size. As high-performance, large-scale, pre-trained transformer models become increasingly available for users to download and fine-tune for customized downstream tasks, their deployment becomes challenging due to the vast number of operations and large memory footprint. To address this challenge, we introduce methods to deeply compress pre-trained transformer models across three major application domains: NLP, speech, and vision. Specifically, we quantize transformer backbones down to 4-bit and further achieve 50% fine-grained structural sparsity on pre-trained BERT, Wav2vec2.0, and Vision Transformer (ViT) models to demonstrate 16x compression while maintaining model accuracy. This is achieved by identifying critical initialization strategies for quantization- and sparsity-aware fine-tuning, as well as by developing novel techniques such as quantizers with a zero-preserving format and scheduled dropout. These hardware-friendly techniques need only be applied during the fine-tuning phase for downstream tasks, which renders them especially suitable for acceleration and deployment of pre-trained transformer models.
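The combination described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes 50% fine-grained structural sparsity means a 2:4 pattern (zero the two smallest-magnitude weights in each group of four) and models the zero-preserving format as a symmetric quantizer with an odd number of levels, so that exact zero is always on the grid and pruned weights survive quantization.

```python
import numpy as np

def prune_2_to_4(w):
    """50% fine-grained structural sparsity (assumed 2:4 pattern):
    in every group of 4 weights, zero the 2 smallest magnitudes."""
    w = w.reshape(-1, 4).copy()
    # indices of the two smallest-magnitude entries in each group of 4
    idx = np.argsort(np.abs(w), axis=1)[:, :2]
    np.put_along_axis(w, idx, 0.0, axis=1)
    return w.reshape(-1)

def quantize_4bit_zero_preserving(w, n_levels=15):
    """Symmetric 4-bit quantization on a zero-preserving grid:
    15 levels in [-7, 7] keep 0.0 exactly representable, so
    pruned weights remain zero after quantization."""
    half = n_levels // 2
    max_abs = np.max(np.abs(w))
    scale = max_abs / half if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -half, half)
    return q * scale

w = np.array([0.9, -0.1, 0.05, -0.7, 0.3, 0.31, -0.29, 0.02])
w_sparse = prune_2_to_4(w)             # exactly half the entries become 0
w_deployed = quantize_4bit_zero_preserving(w_sparse)
```

In an actual quantization- and sparsity-aware fine-tuning loop, such a projection would be applied to the weights on the forward pass with a straight-through estimator on the backward pass; the sketch above shows only the deployment-time transform.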
Pages: 15