AN ASYNCHRONOUS WFST-BASED DECODER FOR AUTOMATIC SPEECH RECOGNITION

Cited by: 1

Authors:

Lv, Hang [1,2]

Chen, Zhehuai [2,5]

Xu, Hainan [2]

Povey, Daniel [4]

Xie, Lei [1]

Khudanpur, Sanjeev [2,3]

Affiliations:

[1] Northwestern Polytech Univ, Sch Comp Sci, Audio Speech & Language Proc Lab ASLP NPU, Xian, Peoples R China

[2] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21205 USA

[3] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21205 USA

[4] Xiaomi Corp, Beijing, Peoples R China

[5] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, SpeechLab, Shanghai, Peoples R China

Source:

2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021

Keywords:

Automatic speech recognition; decoder; lattice generation; lattice pruning

DOI:

10.1109/ICASSP39728.2021.9414509

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

We introduce an asynchronous dynamic decoder, which adopts an efficient A* algorithm to incorporate big language models into one-pass decoding for large-vocabulary continuous speech recognition. Unlike standard one-pass decoding with an on-the-fly composition decoder, which can incur significant computational overhead, the asynchronous dynamic decoder has a novel design with two fronts: one performs "exploration" and the other performs "backfill". Computation alternates between the two fronts during decoding, which yields more effective pruning than standard one-pass decoding with an on-the-fly composition decoder. Experiments show that the proposed decoder runs notably faster than standard one-pass decoding with an on-the-fly composition decoder, and that the speedup becomes more pronounced as data complexity increases.

Pages: 6019-6023

Number of pages: 5
