Self-Supervised Quantization-Aware Knowledge Distillation

Cited by: 0
Authors
Zhao, Kaiqi [1 ]
Zhao, Ming [1 ]
Affiliations
[1] Arizona State Univ, Tempe, AZ 85287 USA
Funding
U.S. National Science Foundation;
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Quantization-aware training (QAT) and Knowledge Distillation (KD) are combined to achieve competitive performance in creating low-bit deep learning models. However, existing works applying KD to QAT require tedious hyper-parameter tuning to balance the weights of different loss terms, assume the availability of labeled training data, and require complex, computationally intensive training procedures for good performance. To address these limitations, this paper proposes a novel Self-Supervised Quantization-Aware Knowledge Distillation (SQAKD) framework. SQAKD first unifies the forward and backward dynamics of various quantization functions, making it flexible for incorporating various QAT works. Then it formulates QAT as a co-optimization problem that simultaneously minimizes the KL-Loss between the full-precision and low-bit models for KD and the discretization error for quantization, without supervision from labels. A comprehensive evaluation shows that SQAKD substantially outperforms the state-of-the-art QAT and KD works for a variety of model architectures. Our code is at: https://github.com/kaiqi123/SQAKD.git.
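The sketch below illustrates the two ideas the abstract describes: a quantizer whose forward pass discretizes while its backward pass lets gradients flow (a straight-through estimator), and a label-free objective combining a KL distillation term with a discretization-error term. It is a minimal, hypothetical illustration; names such as Quantizer, temperature, and quant_weight are assumptions for exposition, not the authors' actual API, and how the two terms are balanced in SQAKD may differ from this simple weighted sum.

```python
# Minimal sketch (assumed names, not the paper's code) of a label-free
# QAT + KD objective: KL loss between full-precision teacher and low-bit
# student, plus a discretization-error term.
import torch
import torch.nn.functional as F


class Quantizer(torch.nn.Module):
    """Uniform quantizer with a straight-through estimator (STE) backward pass."""

    def __init__(self, bits: int = 4):
        super().__init__()
        self.levels = 2 ** bits - 1

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.clamp(x, 0.0, 1.0)
        q = torch.round(x * self.levels) / self.levels
        # STE: forward uses the quantized value, backward passes gradients through.
        return x + (q - x).detach()


def sqakd_style_loss(teacher_logits, student_logits, fp_act, q_act,
                     temperature: float = 4.0, quant_weight: float = 1.0):
    """Label-free co-optimization: KD term + quantization term (illustrative)."""
    # KD term: match the low-bit student to the full-precision teacher (no labels).
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    # Quantization term: penalize the error introduced by discretization.
    disc = F.mse_loss(q_act, fp_act)
    # quant_weight is an assumed balancing factor for illustration only.
    return kd + quant_weight * disc
```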
Pages: 16
Related papers
50 items in total
  • [21] More from Less: Self-supervised Knowledge Distillation for Routine Histopathology Data
    Farndale, Lucas
    Insall, Robert
    Yuan, Ke
    MACHINE LEARNING IN MEDICAL IMAGING, MLMI 2023, PT I, 2024, 14348 : 454 - 463
  • [22] FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning
    Lee, Yeonghyeon
    Jang, Kangwook
    Goo, Jahyun
    Jung, Youngmoon
    Kim, Hoirin
    INTERSPEECH 2022, 2022, : 3588 - 3592
  • [23] COMEDIAN: Self-Supervised Learning and Knowledge Distillation for Action Spotting using Transformers
    Denize, Julien
    Liashuha, Mykola
    Rabarisoa, Jaonary
    Orcesi, Astrid
    Herault, Romain
    2024 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WORKSHOPS, WACVW 2024, 2024, : 518 - 528
  • [24] Overcoming Oscillations in Quantization-Aware Training
    Nagel, Markus
    Fournarakis, Marios
    Bondarenko, Yelysei
    Blankevoort, Tijmen
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [25] Category contrastive distillation with self-supervised classification
    Chen, Weiwei
    Xu, Jiazhen
    Zheng, Yujie
    Wang, Chong
    SIGNAL IMAGE AND VIDEO PROCESSING, 2025, 19 (01)
  • [26] Initiative-Aware Self-Supervised Learning for Knowledge-Grounded Conversations
    Meng, Chuan
    Ren, Pengjie
    Chen, Zhumin
    Ren, Zhaochun
    Xi, Tengxiao
    de Rijke, Maarten
    SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2021, : 522 - 532
  • [27] Emotion-Aware Speech Self-Supervised Representation Learning with Intensity Knowledge
    Liu, Rui
    Ma, Zening
    INTERSPEECH 2024, 2024, : 3180 - 3184
  • [28] Emotion-Aware Speech Self-Supervised Representation Learning with Intensity Knowledge
    Liu, Rui
    Ma, Zening
    arXiv,
  • [29] Improving Self-supervised Lightweight Model Learning via Hard-Aware Metric Distillation
    Liu, Hao
    Ye, Mang
    COMPUTER VISION, ECCV 2022, PT XXXI, 2022, 13691 : 295 - 311
  • [30] MAL: Motion-Aware Loss with Temporal and Distillation Hints for Self-Supervised Depth Estimation
    Dong, Yue-Jiang
    Zhang, Fang-Lue
    Zhang, Song-Hai
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, 2024, : 7318 - 7324