AN ASYNCHRONOUS WFST-BASED DECODER FOR AUTOMATIC SPEECH RECOGNITION

Cited by: 1
Authors
Lv, Hang [1, 2]
Chen, Zhehuai [2, 5]
Xu, Hainan [2]
Povey, Daniel [4]
Xie, Lei [1]
Khudanpur, Sanjeev [2, 3]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Audio Speech & Language Proc Lab ASLP NPU, Xian, Peoples R China
[2] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21205 USA
[3] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21205 USA
[4] Xiaomi Corp, Beijing, Peoples R China
[5] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, SpeechLab, Shanghai, Peoples R China
Source
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021) | 2021
Keywords
Automatic speech recognition; decoder; lattice generation; lattice pruning
DOI
10.1109/ICASSP39728.2021.9414509
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
We introduce an asynchronous dynamic decoder, which adopts an efficient A* algorithm to incorporate big language models into one-pass decoding for large vocabulary continuous speech recognition. Unlike standard one-pass decoding with an on-the-fly composition decoder, which might induce significant computation overhead, the asynchronous dynamic decoder has a novel design with two fronts, one performing "exploration" and the other "backfill". The computation of the two fronts alternates during decoding, resulting in more effective pruning than standard one-pass decoding with an on-the-fly composition decoder. Experiments show that the proposed decoder runs notably faster than standard one-pass decoding with an on-the-fly composition decoder, and that the speedup becomes more pronounced as data complexity increases.
Pages: 6019 - 6023
Page count: 5
Related papers
50 in total
  • [41] LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION BASED ON WFST STRUCTURED CLASSIFIERS AND DEEP BOTTLENECK FEATURES
    Kubo, Yotaro
    Hori, Takaaki
    Nakamura, Atsushi
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7629 - 7633
  • [42] Modified Viterbi Decoder for Hmm Based Speech Recognition System
    Kumar, Y. Rajeev
    Babu, A. Venkatesh
    Kumar, K. A. Naveen
    Alex, John Sahaya Rani
    2014 INTERNATIONAL CONFERENCE ON CONTROL, INSTRUMENTATION, COMMUNICATION AND COMPUTATIONAL TECHNOLOGIES (ICCICCT), 2014, : 470 - 474
  • [43] Regarding Topology and Variant Frame Rates for Differentiable WFST-based End-to-End ASR
    Zhao, Zeyu
    Bell, Peter
    INTERSPEECH 2023, 2023, : 4903 - 4907
  • [44] Transformer with Bidirectional Decoder for Speech Recognition
    Chen, Xi
    Zhang, Songyang
    Song, Dandan
    Ouyang, Peng
    Yin, Shouyi
    INTERSPEECH 2020, 2020, : 1773 - 1777
  • [45] A wave decoder for continuous speech recognition
    Burhke, E
    Chou, W
    Zhou, QR
    ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 2135 - 2138
  • [46] A STACK DECODER FOR CONTINUOUS SPEECH RECOGNITION
    STURTEVANT, DG
    SPEECH AND NATURAL LANGUAGE, 1989, : 193 - 198
  • [47] MOVIE AUDIO SCENE RECOGNITION BASED ON WFST
    Yang, Jichen
    Cai, Min
    Li, Yanxiong
    Jin, Hai
    PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON AUDIO, LANGUAGE AND IMAGE PROCESSING (ICALIP), 2016, : 77 - 80
  • [48] Automatic Speech Recognition Based Odia System
    Karan, Biswajit
    Sahoo, Jayaprakash
    Sahu, P. K.
    2015 INTERNATIONAL CONFERENCE ON MICROWAVE, OPTICAL AND COMMUNICATION ENGINEERING (ICMOCE), 2015, : 353 - 356
  • [49] Automatic Speech Recognition Based on Electromyographic Biosignals
    Jou, Szu-Chen Stan
    Schultz, Tanja
    BIOMEDICAL ENGINEERING SYSTEMS AND TECHNOLOGIES, 2008, 25 : 305 - 320
  • [50] A Study on Detection Based Automatic Speech Recognition
    Ma, Chengyuan
    Tsao, Yu
    Lee, Chin-Hui
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2350 - 2353