DISTILLING KNOWLEDGE FROM ENSEMBLES OF ACOUSTIC MODELS FOR JOINT CTC-ATTENTION END-TO-END SPEECH RECOGNITION

Cited by: 3

Authors:

Gao, Yan ^{[1]}

Parcollet, Titouan ^{[1,2]}

Lane, Nicholas D. ^{[1,3]}

Affiliations:

[1] Univ Cambridge, Cambridge, England

[2] Avignon Univ, Avignon, France

[3] Samsung AI, Cambridge, England

Source:

2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU) | 2021

Keywords:

End-to-end speech recognition; attention models; CTC; multi-teacher knowledge distillation; DISTILLATION; LIBRISPEECH;

DOI:

10.1109/ASRU51503.2021.9688302

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Knowledge distillation has been widely used to compress existing deep learning models while preserving the performance on a wide range of applications. In the specific context of Automatic Speech Recognition (ASR), distillation from ensembles of acoustic models has recently shown promising results in increasing recognition performance. In this paper, we propose an extension of multi-teacher distillation methods to joint CTC-attention end-to-end ASR systems. We also introduce three novel distillation strategies. The core intuition behind them is to integrate the error rate metric into the teacher selection rather than solely focusing on the observed losses. In this way, we directly distill and optimize the student toward the relevant metric for speech recognition. We evaluate these strategies under a selection of training procedures on different datasets (TIMIT, Librispeech, Common Voice) and various languages (English, French, Italian). In particular, state-of-the-art error rates are reported on the Common Voice French, Italian and TIMIT datasets.


Pages: 138 - 145

Page count: 8

50 entries in total

[1] STREAMING END-TO-END SPEECH RECOGNITION WITH JOINT CTC-ATTENTION BASED MODELS
Moritz, Niko
Hori, Takaaki
Le Roux, Jonathan
[J]. 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 936 - 943
[2] Investigating Joint CTC-Attention Models for End-to-End Russian Speech Recognition
Markovnikov, Nikita
Kipyatkova, Irina
[J]. SPEECH AND COMPUTER, SPECOM 2019, 2019, 11658 : 337 - 347
[3] Joint CTC-Attention End-to-End Speech Recognition with a Triangle Recurrent Neural Network Encoder
Zhu T.
Cheng C.
[J]. Journal of Shanghai Jiaotong University (Science), 2020, 25 (01) : 70 - 75
[4] JOINT CTC-ATTENTION BASED END-TO-END SPEECH RECOGNITION USING MULTI-TASK LEARNING
Kim, Suyoun
Hori, Takaaki
Watanabe, Shinji
[J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 4835 - 4839
[5] Joint CTC/attention decoding for end-to-end speech recognition
Hori, Takaaki
Watanabe, Shinji
Hershey, John R.
[J]. PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1, 2017, : 518 - 529
[6] Hybrid CTC-Attention based End-to-End Speech Recognition using Subword Units
Xiao, Zhangyu
Ou, Zhijian
Chu, Wei
Lin, Hui
[J]. 2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, : 146 - 150
[7] Improved CTC-Attention Based End-to-End Speech Recognition on Air Traffic Control
Zhou, Kai
Yang, Qun
Sun, XiuSong
Liu, ShaoHan
Lu, JinJun
[J]. INTELLIGENCE SCIENCE AND BIG DATA ENGINEERING: BIG DATA AND MACHINE LEARNING, PT II, 2019, 11936 : 187 - 196
[8] Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM
Hori, Takaaki
Watanabe, Shinji
Zhang, Yu
Chan, William
[J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 949 - 953
[9] IMPROVING HYBRID CTC/ATTENTION END-TO-END SPEECH RECOGNITION WITH PRETRAINED ACOUSTIC AND LANGUAGE MODELS
Deng, Keqi
Cao, Songjun
Zhang, Yike
Ma, Long
[J]. 2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 76 - 82
[10] Hybrid CTC-Attention Network-Based End-to-End Speech Recognition System for Korean Language
Park, Hosung
Kim, Changmin
Son, Hyunsoo
Seo, Soonshin
Kim, Ji-Hwan
[J]. JOURNAL OF WEB ENGINEERING, 2022, 21 (02) : 265 - 284
