INVESTIGATION OF SEQUENCE-LEVEL KNOWLEDGE DISTILLATION METHODS FOR CTC ACOUSTIC MODELS

Cited by: 0
Authors
Takashima, Ryoichi [1 ,2 ,3 ]
Sheng, Li [1 ]
Kawai, Hisashi [1 ]
Affiliations
[1] Natl Inst Informat & Commun Technol NICT, Koganei, Tokyo, Japan
[2] NICT, Koganei, Tokyo, Japan
[3] Hitachi Ltd, Tokyo, Japan
Keywords
Speech recognition; acoustic model; connectionist temporal classification; knowledge distillation;
DOI
Not available
Chinese Library Classification (CLC)
O42 [Acoustics]
Subject Classification Codes
070206; 082403
Abstract
This paper presents knowledge distillation (KD) methods for training connectionist temporal classification (CTC) acoustic models. In a previous study, we proposed a KD method based on the sequence-level cross-entropy and showed that the conventional KD method based on the frame-level cross-entropy did not work effectively for CTC acoustic models, whereas the proposed method improved their performance. In this paper, we investigate the implementation of sequence-level KD for CTC models and propose a lattice-based sequence-level KD method. Experiments on model compression and on training a noise-robust model using the Wall Street Journal (WSJ) and CHiME4 datasets demonstrate that the sequence-level KD methods improve the performance of CTC acoustic models on both tasks, and show that the lattice-based method computes the sequence-level KD more efficiently than the N-best-based method proposed in our previous work.
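The sequence-level KD objective described in the abstract can be approximated with the teacher's N-best hypotheses: the student minimizes the teacher-posterior-weighted CTC negative log-likelihood of each hypothesis, since -log P_student(W|x) for a CTC model is exactly the CTC loss with W as the target. The following PyTorch sketch is only an illustration under that assumption; the function name, tensor layout, and the nbest_hyps structure are hypothetical and not the authors' implementation.

    import torch
    import torch.nn.functional as F

    def nbest_sequence_kd_loss(student_log_probs,  # (T, B, V) log-softmax outputs, V includes blank
                               input_lengths,       # (B,) valid frame counts per utterance
                               nbest_hyps,          # per utterance: list of (label_ids, teacher_posterior)
                               blank_id=0):
        # Sequence-level KD: L = - sum_W P_teacher(W|x) * log P_student(W|x),
        # approximated over the teacher's N-best hypotheses W.
        total = student_log_probs.new_zeros(())
        for b, hyps in enumerate(nbest_hyps):
            for label_ids, teacher_post in hyps:
                target = torch.tensor(label_ids, dtype=torch.long)
                # CTC negative log-likelihood of this hypothesis under the student
                nll = F.ctc_loss(student_log_probs[:, b:b + 1, :],
                                 target.unsqueeze(0),
                                 input_lengths[b:b + 1],
                                 torch.tensor([len(label_ids)]),
                                 blank=blank_id, reduction='sum')
                total = total + teacher_post * nll
        return total / max(len(nbest_hyps), 1)

As the abstract notes, the lattice-based variant studied in the paper computes the same sequence-level objective more efficiently than this explicit N-best sum.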
Pages: 6156 - 6160
Page count: 5
Related Papers
26 records in total
  • [11] Token-level and sequence-level loss smoothing for RNN language models
    Elbayad, Maha
    Besacier, Laurent
    Verbeek, Jakob
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1, 2018, : 2094 - 2103
  • [12] Self-Improvement of Non-autoregressive Model via Sequence-Level Distillation
    Liao, Yusheng
    Jiang, Shuyang
    Li, Yiqi
    Wang, Yanfeng
    Wang, Yu
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 14202 - 14212
  • [13] Improving Knowledge Distillation of CTC-Trained Acoustic Models With Alignment-Consistent Ensemble and Target Delay
    Ding, Haisong
    Chen, Kai
    Huo, Qiang
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 2561 - 2571
  • [14] Factorized and progressive knowledge distillation for CTC-based ASR models
    Tian, Sanli
    Li, Zehan
    Lyv, Zhaobiao
    Cheng, Gaofeng
    Xiao, Qing
    Li, Ta
    Zhao, Qingwei
    SPEECH COMMUNICATION, 2024, 160
  • [15] Sequence-level models for distortion-rate behaviour of compressed video
    Choi, LU
    Ivrlac, MT
    Steinbach, E
    Nossek, JA
    2005 International Conference on Image Processing (ICIP), Vols 1-5, 2005, : 1713 - 1716
  • [16] Oracle Teacher: Leveraging Target Information for Better Knowledge Distillation of CTC Models
    Yoon, Ji Won
    Kim, Hyung Yong
    Lee, Hyeonseung
    Ahn, Sunghwan
    Kim, Nam Soo
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 2974 - 2987
  • [17] Knowledge Distillation For CTC-based Speech Recognition Via Consistent Acoustic Representation Learning
    Tian, Sanli
    Deng, Keqi
    Li, Zehan
    Ye, Lingxuan
    Cheng, Gaofeng
    Li, Ta
    Yan, Yonghong
    INTERSPEECH 2022, 2022, : 2633 - 2637
  • [18] DOMAIN ADAPTATION OF DNN ACOUSTIC MODELS USING KNOWLEDGE DISTILLATION
    Asami, Taichi
    Masumura, Ryo
    Yamaguchi, Yoshikazu
    Masataki, Hirokazu
    Aono, Yushi
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5185 - 5189
  • [19] Knowledge distillation via instance-level sequence learning
    Zhao, Haoran
    Sun, Xin
    Dong, Junyu
    Dong, Zihe
    Li, Qiong
    KNOWLEDGE-BASED SYSTEMS, 2021, 233
  • [20] Structure-Level Knowledge Distillation For Multilingual Sequence Labeling
    Wang, Xinyu
    Jiang, Yong
    Bach, Nguyen
    Wang, Tao
    Huang, Fei
    Tu, Kewei
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 3317 - 3330