Improvement in N-best search for continuous speech recognition

Authors
Illina, I
Gong, YF
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
In this paper, several techniques are proposed for reducing the search complexity of beam search in a continuous speech recognition task. Six heuristic pruning methods are described, and the pruning parameters are adjusted to keep the word error rate constant while reducing computational complexity and memory demand. The effect of each pruning method is evaluated with the Mixture Stochastic Trajectory Model (MSTM), a segment-based model that uses phonemes as the speech units. Tests on a speaker-dependent continuous speech recognition task show that the pruning methods yield a substantial 67% reduction in search effort, measured as the number of phonemes hypothesised during the search. All proposed techniques are independent of the acoustic models and are therefore applicable to other acoustic modeling techniques.
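As a rough illustration of what pruning in a beam search involves, the sketch below applies two generic, widely used heuristics: a score threshold relative to the best hypothesis, and a cap on the number of active hypotheses. It is not a reconstruction of the six methods in the paper, which the abstract does not specify. The function name `prune_beam`, the parameters `beam_width` and `max_active`, and the toy scores are all hypothetical.

```python
from typing import Dict


def prune_beam(scores: Dict[str, float], beam_width: float, max_active: int) -> Dict[str, float]:
    """Generic beam pruning over log-scores (illustrative, not the paper's method).

    scores     -- hypothesis label -> log-score (higher is better)
    beam_width -- keep hypotheses within this distance of the best score
    max_active -- keep at most this many hypotheses after thresholding
    """
    if not scores:
        return {}
    best = max(scores.values())
    # Threshold pruning: drop hypotheses that fall too far below the best one.
    survivors = {h: s for h, s in scores.items() if s >= best - beam_width}
    # Rank pruning: if too many hypotheses remain, keep only the top-scoring ones.
    if len(survivors) > max_active:
        ranked = sorted(survivors.items(), key=lambda kv: kv[1], reverse=True)
        survivors = dict(ranked[:max_active])
    return survivors


# Toy phoneme hypotheses with made-up log-scores (assumed values).
hyps = {"a": -1.0, "e": -2.5, "i": -4.0, "o": -9.0, "u": -3.0}
kept = prune_beam(hyps, beam_width=3.0, max_active=3)
# Fraction of hypotheses removed, analogous to counting hypothesised phonemes.
reduction = 1 - len(kept) / len(hyps)
print(sorted(kept), reduction)
```

In practice, parameters such as `beam_width` and `max_active` would be tuned on held-out data so that the word error rate stays constant while the number of expanded hypotheses drops, which is the trade-off the abstract describes.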
Pages: 2147-2150
Page count: 4