ATTENTION PROBE: VISION TRANSFORMER DISTILLATION IN THE WILD

Times Cited: 1
Authors
Wang, Jiahao [1 ]
Cao, Mingdeng [1 ]
Shi, Shuwei [1 ]
Wu, Baoyuan [2 ]
Yang, Yujiu [1 ]
Affiliations
[1] Tsinghua Shenzhen Int Grad Sch, Shenzhen, Peoples R China
[2] Chinese Univ Hong Kong, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Transformer; data-free; distillation;
DOI
10.1109/ICASSP43922.2022.9747484
Chinese Library Classification
O42 [Acoustics];
Discipline Codes
070206; 082403;
Abstract
Vision transformers (ViTs) require intensive computational resources to achieve high performance, which often makes them unsuitable for mobile devices. A feasible strategy is to compress them using the original training data, which, however, may not be accessible due to privacy limitations or transmission restrictions. In this case, exploiting the massive unlabeled data in the wild is an alternative paradigm, which has proven effective for compressing convolutional neural networks (CNNs). However, because CNNs and ViTs differ significantly in model structure and computation mechanism, whether a similar paradigm suits ViTs remains an open question. In this work, we propose a two-stage method to compress ViTs effectively using unlabeled data in the wild. First, we design an effective tool for selecting valuable data from the wild, dubbed the Attention Probe. Second, based on the selected data, we develop a probe knowledge distillation algorithm that trains a lightweight student transformer by maximizing the similarity of both the outputs and the intermediate features between the heavy teacher and the lightweight student. Extensive experimental results on several benchmarks demonstrate that the student transformer obtained by the proposed method achieves performance comparable to the baseline that requires the original training data. Code is available at: https://github.com/IIGROUP/AttentionProbe.
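The distillation stage described in the abstract can be illustrated with a minimal sketch, assuming PyTorch: a weighted sum of a temperature-softened KL term on the output logits and an MSE term on intermediate transformer features. The function and parameter names here (probe_distillation_loss, temperature, alpha) are hypothetical and chosen for illustration, not the authors' implementation; see the linked repository for the actual code.

```python
# Hedged sketch of an output + intermediate-feature distillation objective
# in the spirit of the abstract; not the released AttentionProbe code.
import torch
import torch.nn.functional as F

def probe_distillation_loss(student_logits, teacher_logits,
                            student_feats, teacher_feats,
                            temperature=4.0, alpha=0.5):
    """Combine soft-label KL divergence with intermediate-feature MSE.

    student_feats / teacher_feats: lists of [batch, tokens, dim] tensors
    taken from corresponding transformer blocks (equal widths assumed here;
    in practice a projection layer would align mismatched dimensions).
    """
    # Output-level similarity: KL between temperature-softened distributions.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    # Feature-level similarity: mean-squared error over the selected blocks.
    feat = sum(F.mse_loss(s, t) for s, t in zip(student_feats, teacher_feats))
    feat = feat / max(len(student_feats), 1)

    return alpha * kd + (1.0 - alpha) * feat
```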
Pages: 2220-2224
Page count: 5