INVESTIGATION OF SEQUENCE-LEVEL KNOWLEDGE DISTILLATION METHODS FOR CTC ACOUSTIC MODELS

Cited: 0
Authors
Takashima, Ryoichi [1 ,2 ,3 ]
Sheng, Li [1 ]
Kawai, Hisashi [1 ]
Affiliations
[1] Natl Inst Informat & Commun Technol NICT, Koganei, Tokyo, Japan
[2] NICT, Koganei, Tokyo, Japan
[3] Hitachi Ltd, Tokyo, Japan
Keywords
Speech recognition; acoustic model; connectionist temporal classification; knowledge distillation;
DOI
N/A
Chinese Library Classification
O42 [Acoustics]
Discipline Codes
070206 ; 082403 ;
Abstract
This paper presents knowledge distillation (KD) methods for training connectionist temporal classification (CTC) acoustic models. In a previous study, we proposed a KD method based on the sequence-level cross-entropy and showed that the conventional KD method based on the frame-level cross-entropy does not work effectively for CTC acoustic models, whereas the proposed method improves their performance. In this paper, we investigate implementations of sequence-level KD for CTC models and propose a lattice-based sequence-level KD method. Experiments on model compression and on training a noise-robust model, using the Wall Street Journal (WSJ) and CHiME4 datasets, demonstrate that the sequence-level KD methods improve the performance of CTC acoustic models on both tasks, and show that the lattice-based method computes the sequence-level KD more efficiently than the N-best-based method proposed in our previous work.
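As a rough illustration of the baseline the abstract contrasts against (not the paper's own code), the conventional frame-level KD objective can be sketched in NumPy: the per-frame cross-entropy between the teacher's softened posteriors over the CTC label set (including blank) and the student's, averaged over frames. The function names and the temperature parameter here are illustrative assumptions, not identifiers from the paper.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the label axis.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def frame_level_kd_loss(student_logits, teacher_logits, temperature=1.0):
    """Frame-level KD loss: cross-entropy between the teacher's and the
    student's per-frame posteriors, averaged over frames.

    Both logit arrays have shape (T, V): T frames, V labels (incl. CTC blank).
    """
    p_teacher = softmax(teacher_logits / temperature)
    log_p_student = np.log(softmax(student_logits / temperature))
    # Sum over labels per frame, then average over frames.
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())
```

By Gibbs' inequality this loss is minimized when the student's per-frame posteriors match the teacher's exactly; the paper's point is that for CTC models this frame-by-frame matching is ineffective, motivating sequence-level objectives computed over N-best hypotheses or lattices instead.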
Pages: 6156-6160
Page count: 5
Related Papers
26 records in total
  • [21] EFFICIENT BUILDING STRATEGY WITH KNOWLEDGE DISTILLATION FOR SMALL-FOOTPRINT ACOUSTIC MODELS
    Moriya, Takafumi
    Kanagawa, Hiroki
    Matsui, Kiyoaki
    Fukutomi, Takaaki
    Shinohara, Yusuke
    Yamaguchi, Yoshikazu
    Okamoto, Manabu
    Aono, Yushi
    2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 21 - 28
  • [22] DISTILLING KNOWLEDGE FROM ENSEMBLES OF ACOUSTIC MODELS FOR JOINT CTC-ATTENTION END-TO-END SPEECH RECOGNITION
    Gao, Yan
    Parcollet, Titouan
    Lane, Nicholas D.
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 138 - 145
  • [23] Compression of CTC-Trained Acoustic Models by Dynamic Frame-Wise Distillation Or Segment-Wise N-Best Hypotheses Imitation
    Ding, Haisong
    Chen, Kai
    Huo, Qiang
    INTERSPEECH 2019, 2019, : 3218 - 3222
  • [24] An Investigation of Factored Regression Missing Data Methods for Multilevel Models with Cross-Level Interactions
    Keller, Brian T. T.
    Enders, Craig K. K.
    MULTIVARIATE BEHAVIORAL RESEARCH, 2023, 58 (05) : 938 - 963
  • [25] Investigation of signal models and methods for evaluating structures of processing telecommunication information exchange systems under acoustic noise conditions
    Kropotov, Y. A.
    Belov, A. A.
    Proskuryakov, A. Y.
    Kolpakov, A. A.
    INTERNATIONAL CONFERENCE INFORMATION TECHNOLOGIES IN BUSINESS AND INDUSTRY 2018, PTS 1-4, 2018, 1015
  • [26] Investigation of hearing aid fitting according to the national acoustic laboratories' prescription for non-linear hearing aids and the desired sensation level methods in Japanese speakers: a crossover-controlled trial
    Furuki, Shogo
    Sano, Hajime
    Kurioka, Takaomi
    Nitta, Yosihiro
    Umehara, Sachie
    Hara, Yuki
    Yamashita, Taku
    AURIS NASUS LARYNX, 2023, 50 (05) : 708 - 713