AN ASYNCHRONOUS WFST-BASED DECODER FOR AUTOMATIC SPEECH RECOGNITION

Cited by: 1
Authors
Lv, Hang [1, 2]
Chen, Zhehuai [2, 5]
Xu, Hainan [2]
Povey, Daniel [4]
Xie, Lei [1]
Khudanpur, Sanjeev [2, 3]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Audio Speech & Language Proc Lab ASLP NPU, Xian, Peoples R China
[2] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21205 USA
[3] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21205 USA
[4] Xiaomi Corp, Beijing, Peoples R China
[5] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, SpeechLab, Shanghai, Peoples R China
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021
Keywords
Automatic speech recognition; decoder; lattice generation; lattice pruning
DOI
10.1109/ICASSP39728.2021.9414509
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We introduce an asynchronous dynamic decoder, which adopts an efficient A* algorithm to incorporate big language models into one-pass decoding for large-vocabulary continuous speech recognition. Standard one-pass decoding with an on-the-fly composition decoder can incur significant computational overhead. By contrast, the asynchronous dynamic decoder has a novel design with two fronts, one performing "exploration" and the other "backfill". Computation alternates between the two fronts during decoding, which results in more effective pruning than standard one-pass decoding with an on-the-fly composition decoder. Experiments show that the proposed decoder runs notably faster than standard one-pass decoding with an on-the-fly composition decoder, and that the speedup becomes more pronounced as data complexity increases.
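For readers unfamiliar with the A* algorithm the abstract mentions, the following is a minimal, generic sketch of A* search on a toy weighted graph. It is not the paper's decoder: it has no WFST, no on-the-fly composition, and no "exploration" or "backfill" fronts. The names `a_star`, `TOY_GRAPH`, and `TOY_HEURISTIC` are hypothetical and chosen only for illustration. It shows only the basic idea that a heuristic estimate of the remaining cost lets the search expand promising hypotheses first.

```python
import heapq


def a_star(graph, start, goal, heuristic):
    """Generic A* search; returns (cost, path) or None if goal is unreachable.

    graph: dict mapping node -> list of (next_node, edge_cost) pairs.
    heuristic: callable giving an estimate of the remaining cost to the goal;
    it must never overestimate for the result to be optimal.
    """
    # Queue entries: (cost so far + heuristic estimate, cost so far, node, path).
    frontier = [(heuristic(start), 0.0, start, [start])]
    best_cost = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # A cheaper route to this node was already expanded.
        for nxt, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost + heuristic(nxt), new_cost, nxt, path + [nxt]),
                )
    return None


# Toy data (assumed for illustration only, unrelated to any speech model).
TOY_GRAPH = {
    "s": [("a", 1.0), ("b", 4.0)],
    "a": [("b", 1.0), ("g", 5.0)],
    "b": [("g", 1.0)],
}
TOY_HEURISTIC = {"s": 2.0, "a": 2.0, "b": 1.0, "g": 0.0}

if __name__ == "__main__":
    print(a_star(TOY_GRAPH, "s", "g", TOY_HEURISTIC.get))
```

In a speech decoder the nodes would correspond to search states and the costs to negated log-probabilities, but how the paper defines its heuristic and its two fronts is not stated in this record.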
Pages: 6019-6023
Number of pages: 5