FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning

Cited by: 14
Authors
Lee, Yeonghyeon [1 ]
Jang, Kangwook [1 ]
Goo, Jahyun [1 ]
Jung, Youngmoon [1 ]
Kim, Hoirin [1 ]
Affiliations
[1] Korea Adv Inst Sci & Technol, Sch Elect Engn, Daejeon, South Korea
Source
INTERSPEECH 2022
Funding
National Research Foundation of Singapore;
Keywords
knowledge distillation; speech representation learning; self-supervised learning; model compression;
DOI
10.21437/Interspeech.2022-11112
Chinese Library Classification
O42 [Acoustics];
Subject Classification Codes
070206; 082403;
Abstract
Large-scale speech self-supervised learning (SSL) has emerged as a major field of speech processing; however, the computational cost arising from its vast model size creates a high entry barrier for academia. In addition, existing distillation techniques for speech SSL models compress the model by reducing the number of layers, which induces performance degradation in linguistic pattern recognition tasks such as phoneme recognition (PR). In this paper, we propose FitHuBERT, which is thinner in dimension throughout almost all model components and deeper in layers compared to prior speech SSL distillation works. Moreover, we employ a time-reduction layer to speed up inference and propose a hint-based distillation method to reduce performance degradation. Our method reduces the model to 23.8% in size and 35.9% in inference time compared to HuBERT. We also achieve a 12.1% word error rate and a 13.3% phoneme error rate on the SUPERB benchmark, which is superior to prior work.
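A minimal PyTorch sketch of the two ideas named in the abstract: a time-reduction layer that shortens the frame sequence, and a hint-based (layer-to-layer) distillation loss between a thin student layer and a wide teacher layer. The names TimeReduction and hint_loss, the projection setup, and the dimensions are illustrative assumptions, not the paper's exact implementation; FitHuBERT's actual loss and reduction scheme may differ in detail.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TimeReduction(nn.Module):
    """Halve the frame rate by concatenating adjacent frames and projecting back."""

    def __init__(self, dim: int, stride: int = 2):
        super().__init__()
        self.stride = stride
        self.proj = nn.Linear(dim * stride, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, time, dim)
        b, t, d = x.shape
        t = t - t % self.stride                # drop trailing frames that do not fill a group
        x = x[:, :t, :].reshape(b, t // self.stride, d * self.stride)
        return self.proj(x)


def hint_loss(student_hidden: torch.Tensor,
              teacher_hidden: torch.Tensor,
              proj: nn.Linear) -> torch.Tensor:
    """L2 'hint' between a thin student layer (projected up) and a wide teacher layer."""
    s = proj(student_hidden)                   # (batch, T_s, teacher_dim)
    # Match the teacher's time axis to the (possibly time-reduced) student
    t = F.adaptive_avg_pool1d(teacher_hidden.transpose(1, 2), s.size(1)).transpose(1, 2)
    return F.mse_loss(s, t)


if __name__ == "__main__":
    student_dim, teacher_dim = 480, 768        # thin student vs. HuBERT Base width (illustrative)
    x = torch.randn(2, 100, student_dim)       # fake student hidden states
    y = torch.randn(2, 100, teacher_dim)       # fake teacher hidden states
    reduced = TimeReduction(student_dim)(x)    # (2, 50, 480)
    loss = hint_loss(reduced, y, nn.Linear(student_dim, teacher_dim))
    print(reduced.shape, loss.item())
```

Averaging the teacher's frames to the student's reduced length is one simple way to keep the hint loss well defined after time reduction; other alignment choices (e.g., strided selection) would work equally well for this sketch.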
Pages: 3588 - 3592
Page count: 5
Related Papers
50 records in total
  • [1] SKILL: Similarity-Aware Knowledge Distillation for Speech Self-Supervised Learning
    Zampierin, Luca
    Hacene, Ghouthi Boukli
    Nguyen, Bac
    Ravanelli, Mirco
    2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW 2024, 2024, : 675 - 679
  • [2] Self-supervised knowledge distillation in counterfactual learning for VQA
    Bi, Yandong
    Jiang, Huajie
    Zhang, Hanfu
    Hu, Yongli
    Yin, Baocai
    PATTERN RECOGNITION LETTERS, 2024, 177 : 33 - 39
  • [3] Self-supervised knowledge distillation for complementary label learning
    Liu, Jiabin
    Li, Biao
    Lei, Minglong
    Shi, Yong
    NEURAL NETWORKS, 2022, 155 : 318 - 327
  • [4] Distill on the Go: Online knowledge distillation in self-supervised learning
    Bhat, Prashant
    Arani, Elahe
    Zonooz, Bahram
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 2672 - 2681
  • [5] On-Device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation
    Yang, Gene-Ping
    Gu, Yue
    Tang, Qingming
    Du, Dongsu
    Liu, Yuzong
    INTERSPEECH 2023, 2023, : 1623 - 1627
  • [6] Self-Supervised Contrastive Learning for Camera-to-Radar Knowledge Distillation
    Wang, Wenpeng
    Campbell, Bradford
    Munir, Sirajum
    2024 20TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING IN SMART SYSTEMS AND THE INTERNET OF THINGS, DCOSS-IOT 2024, 2024, : 154 - 161
  • [7] Image quality assessment based on self-supervised learning and knowledge distillation
    Sang, Qingbing
    Shu, Ziru
    Liu, Lixiong
    Hu, Cong
    Wu, Qin
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 90
  • [8] Hierarchical Self-supervised Augmented Knowledge Distillation
    Yang, Chuanguang
    An, Zhulin
    Cai, Linhang
    Xu, Yongjun
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 1217 - 1223
  • [9] DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
    Liu, Alexander H.
    Chang, Heng-Jui
    Auli, Michael
    Hsu, Wei-Ning
    Glass, James
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [10] COMEDIAN: Self-Supervised Learning and Knowledge Distillation for Action Spotting using Transformers
    Denize, Julien
    Liashuha, Mykola
    Rabarisoa, Jaonary
    Orcesi, Astrid
    Herault, Romain
    2024 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WORKSHOPS, WACVW 2024, 2024, : 518 - 528