INVESTIGATION OF SEQUENCE-LEVEL KNOWLEDGE DISTILLATION METHODS FOR CTC ACOUSTIC MODELS

Cited by: 0
Authors
Takashima, Ryoichi [1 ,2 ,3 ]
Sheng, Li [1 ]
Kawai, Hisashi [1 ]
Affiliations
[1] Natl Inst Informat & Commun Technol NICT, Koganei, Tokyo, Japan
[2] NICT, Koganei, Tokyo, Japan
[3] Hitachi Ltd, Tokyo, Japan
Keywords
Speech recognition; acoustic model; connectionist temporal classification; knowledge distillation;
DOI
Not available
Chinese Library Classification (CLC)
O42 [Acoustics]
Subject Classification Codes
070206; 082403
Abstract
This paper presents knowledge distillation (KD) methods for training connectionist temporal classification (CTC) acoustic models. In a previous study, we proposed a KD method based on the sequence-level cross-entropy and showed that the conventional KD method based on the frame-level cross-entropy did not work effectively for CTC acoustic models, whereas the proposed method improved their performance. In this paper, we investigate the implementation of sequence-level KD for CTC models and propose a lattice-based sequence-level KD method. Experiments on model compression and on training a noise-robust model using the Wall Street Journal (WSJ) and CHiME4 datasets demonstrate that the sequence-level KD methods improve the performance of CTC acoustic models on both tasks, and show that the lattice-based method computes the sequence-level KD more efficiently than the N-best-based method proposed in our previous work.
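The sequence-level KD objective described in the abstract can be approximated with the teacher's N-best hypotheses: the student minimizes the teacher-posterior-weighted CTC negative log-likelihood of each hypothesis, since -log P_student(W|x) for a CTC model is exactly the CTC loss with W as the target. The following PyTorch sketch is only an illustration under that assumption; the function name, tensor layout, and the nbest_hyps structure are hypothetical and not the authors' implementation.

    import torch
    import torch.nn.functional as F

    def nbest_sequence_kd_loss(student_log_probs,  # (T, B, V) log-softmax outputs, V includes blank
                               input_lengths,       # (B,) valid frame counts per utterance
                               nbest_hyps,          # per utterance: list of (label_ids, teacher_posterior)
                               blank_id=0):
        # Sequence-level KD: L = - sum_W P_teacher(W|x) * log P_student(W|x),
        # approximated over the teacher's N-best hypotheses W.
        total = student_log_probs.new_zeros(())
        for b, hyps in enumerate(nbest_hyps):
            for label_ids, teacher_post in hyps:
                target = torch.tensor(label_ids, dtype=torch.long)
                # CTC negative log-likelihood of this hypothesis under the student
                nll = F.ctc_loss(student_log_probs[:, b:b + 1, :],
                                 target.unsqueeze(0),
                                 input_lengths[b:b + 1],
                                 torch.tensor([len(label_ids)]),
                                 blank=blank_id, reduction='sum')
                total = total + teacher_post * nll
        return total / max(len(nbest_hyps), 1)

As the abstract notes, the lattice-based variant studied in the paper computes the same sequence-level objective more efficiently than this explicit N-best sum.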
Pages: 6156 - 6160
Page count: 5
Related Papers
26 records in total
  • [11] Token-level and sequence-level loss smoothing for RNN language models
    Elbayad, Maha
    Besacier, Laurent
    Verbeek, Jakob
    PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1, 2018, : 2094 - 2103
  • [12] Self-Improvement of Non-autoregressive Model via Sequence-Level Distillation
    Liao, Yusheng
    Jiang, Shuyang
    Li, Yiqi
    Wang, Yanfeng
    Wang, Yu
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 14202 - 14212
  • [13] Improving Knowledge Distillation of CTC-Trained Acoustic Models With Alignment-Consistent Ensemble and Target Delay
    Ding, Haisong
    Chen, Kai
    Huo, Qiang
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 2561 - 2571
  • [14] Factorized and progressive knowledge distillation for CTC-based ASR models
    Tian, Sanli
    Li, Zehan
    Lyv, Zhaobiao
    Cheng, Gaofeng
    Xiao, Qing
    Li, Ta
    Zhao, Qingwei
    SPEECH COMMUNICATION, 2024, 160
  • [15] Sequence-level models for distortion-rate behaviour of compressed video
    Choi, LU
    Ivrlac, MT
    Steinbach, E
    Nossek, JA
    2005 International Conference on Image Processing (ICIP), Vols 1-5, 2005, : 1713 - 1716
  • [16] Oracle Teacher: Leveraging Target Information for Better Knowledge Distillation of CTC Models
    Yoon, Ji Won
    Kim, Hyung Yong
    Lee, Hyeonseung
    Ahn, Sunghwan
    Kim, Nam Soo
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, 31 : 2974 - 2987
  • [17] Knowledge Distillation For CTC-based Speech Recognition Via Consistent Acoustic Representation Learning
    Tian, Sanli
    Deng, Keqi
    Li, Zehan
    Ye, Lingxuan
    Cheng, Gaofeng
    Li, Ta
    Yan, Yonghong
    INTERSPEECH 2022, 2022, : 2633 - 2637
  • [18] DOMAIN ADAPTATION OF DNN ACOUSTIC MODELS USING KNOWLEDGE DISTILLATION
    Asami, Taichi
    Masumura, Ryo
    Yamaguchi, Yoshikazu
    Masataki, Hirokazu
    Aono, Yushi
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5185 - 5189
  • [19] Knowledge distillation via instance-level sequence learning
    Zhao, Haoran
    Sun, Xin
    Dong, Junyu
    Dong, Zihe
    Li, Qiong
    KNOWLEDGE-BASED SYSTEMS, 2021, 233
  • [20] Structure-Level Knowledge Distillation For Multilingual Sequence Labeling
    Wang, Xinyu
    Jiang, Yong
    Bach, Nguyen
    Wang, Tao
    Huang, Fei
    Tu, Kewei
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 3317 - 3330