INVESTIGATION OF SEQUENCE-LEVEL KNOWLEDGE DISTILLATION METHODS FOR CTC ACOUSTIC MODELS

Cited: 0
Authors
Takashima, Ryoichi [1 ,2 ,3 ]
Sheng, Li [1 ]
Kawai, Hisashi [1 ]
Affiliations
[1] Natl Inst Informat & Commun Technol NICT, Koganei, Tokyo, Japan
[2] NICT, Koganei, Tokyo, Japan
[3] Hitachi Ltd, Tokyo, Japan
Keywords
Speech recognition; acoustic model; connectionist temporal classification; knowledge distillation;
DOI
N/A
Chinese Library Classification
O42 [Acoustics]
Discipline Codes
070206 ; 082403 ;
Abstract
This paper presents knowledge distillation (KD) methods for training connectionist temporal classification (CTC) acoustic models. In a previous study, we proposed a KD method based on the sequence-level cross-entropy and showed that the conventional KD method based on the frame-level cross-entropy does not work effectively for CTC acoustic models, whereas the proposed method improves their performance. In this paper, we investigate implementations of sequence-level KD for CTC models and propose a lattice-based sequence-level KD method. Experiments on model compression and on training a noise-robust model, using the Wall Street Journal (WSJ) and CHiME4 datasets, demonstrate that the sequence-level KD methods improve the performance of CTC acoustic models on both tasks, and show that the lattice-based method computes the sequence-level KD more efficiently than the N-best-based method proposed in our previous work.
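As a rough illustration of the baseline the abstract contrasts against (not the paper's own code), the conventional frame-level KD objective can be sketched in NumPy: the per-frame cross-entropy between the teacher's softened posteriors over the CTC label set (including blank) and the student's, averaged over frames. The function names and the temperature parameter here are illustrative assumptions, not identifiers from the paper.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the label axis.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def frame_level_kd_loss(student_logits, teacher_logits, temperature=1.0):
    """Frame-level KD loss: cross-entropy between the teacher's and the
    student's per-frame posteriors, averaged over frames.

    Both logit arrays have shape (T, V): T frames, V labels (incl. CTC blank).
    """
    p_teacher = softmax(teacher_logits / temperature)
    log_p_student = np.log(softmax(student_logits / temperature))
    # Sum over labels per frame, then average over frames.
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())
```

By Gibbs' inequality this loss is minimized when the student's per-frame posteriors match the teacher's exactly; the paper's point is that for CTC models this frame-by-frame matching is ineffective, motivating sequence-level objectives computed over N-best hypotheses or lattices instead.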
Pages: 6156-6160
Page count: 5
Related Papers
26 records in total
  • [21] EFFICIENT BUILDING STRATEGY WITH KNOWLEDGE DISTILLATION FOR SMALL-FOOTPRINT ACOUSTIC MODELS
    Moriya, Takafumi
    Kanagawa, Hiroki
    Matsui, Kiyoaki
    Fukutomi, Takaaki
    Shinohara, Yusuke
    Yamaguchi, Yoshikazu
    Okamoto, Manabu
    Aono, Yushi
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 21 - 28
  • [22] DISTILLING KNOWLEDGE FROM ENSEMBLES OF ACOUSTIC MODELS FOR JOINT CTC-ATTENTION END-TO-END SPEECH RECOGNITION
    Gao, Yan
    Parcollet, Titouan
    Lane, Nicholas D.
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 138 - 145
  • [23] Compression of CTC-Trained Acoustic Models by Dynamic Frame-Wise Distillation Or Segment-Wise N-Best Hypotheses Imitation
    Ding, Haisong
    Chen, Kai
    Huo, Qiang
    INTERSPEECH 2019, 2019, : 3218 - 3222
  • [24] An Investigation of Factored Regression Missing Data Methods for Multilevel Models with Cross-Level Interactions
    Keller, Brian T. T.
    Enders, Craig K. K.
    MULTIVARIATE BEHAVIORAL RESEARCH, 2023, 58 (05) : 938 - 963
  • [25] Investigation of signal models and methods for evaluating structures of processing telecommunication information exchange systems under acoustic noise conditions
    Kropotov, Y. A.
    Belov, A. A.
    Proskuryakov, A. Y.
    Kolpakov, A. A.
    INTERNATIONAL CONFERENCE INFORMATION TECHNOLOGIES IN BUSINESS AND INDUSTRY 2018, PTS 1-4, 2018, 1015
  • [26] Investigation of hearing aid fitting according to the national acoustic laboratories' prescription for non-linear hearing aids and the desired sensation level methods in Japanese speakers: a crossover-controlled trial
    Furuki, Shogo
    Sano, Hajime
    Kurioka, Takaomi
    Nitta, Yosihiro
    Umehara, Sachie
    Hara, Yuki
    Yamashita, Taku
    AURIS NASUS LARYNX, 2023, 50 (05) : 708 - 713