Improvement in N-best search for continuous speech recognition

Authors
Illina, I
Gong, YF
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
In this paper, several techniques for reducing the search complexity of beam search in a continuous speech recognition task are proposed. Six heuristic pruning methods are described, and the pruning parameters are adjusted to keep the word error rate constant while reducing computational complexity and memory demand. The effect of each pruning method is evaluated with the Mixture Stochastic Trajectory Model (MSTM), a segment-based model that uses phonemes as the speech units. Tests on a speaker-dependent continuous speech recognition task show that the pruning methods yield a substantial 67% reduction in search effort, measured as the number of phonemes hypothesised during the search. All proposed techniques are independent of the acoustic models and are therefore applicable to other acoustic modeling techniques.
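The abstract does not spell out the six pruning heuristics, so the following is only a minimal sketch of the general idea behind pruned beam search, not the authors' method. It combines two generic, widely used rules (a score threshold relative to the best hypothesis, and a cap on the number of active hypotheses) and counts hypothesised phonemes as the measure of search effort. The names `prune`, `beam_search`, `beam_width`, and `max_active`, and the per-frame phoneme log-probability input, are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Hypothesis = Tuple[str, ...]  # a sequence of phoneme labels (illustrative type)


def prune(scored: Dict[Hypothesis, float],
          beam_width: float,
          max_active: int) -> Dict[Hypothesis, float]:
    """Keep hypotheses within beam_width of the best log score,
    and at most max_active of them (two generic pruning rules)."""
    if not scored:
        return {}
    best = max(scored.values())
    survivors = [(h, s) for h, s in scored.items() if s >= best - beam_width]
    survivors.sort(key=lambda item: item[1], reverse=True)
    return dict(survivors[:max_active])


def beam_search(step_log_probs: List[Dict[str, float]],
                beam_width: float,
                max_active: int) -> Tuple[Hypothesis, float, int]:
    """Extend every active hypothesis by every phoneme at each step,
    then prune. Returns the best hypothesis, its log score, and the
    number of hypothesised phonemes (the search-effort counter)."""
    active: Dict[Hypothesis, float] = {(): 0.0}
    expanded = 0
    for log_probs in step_log_probs:
        candidates: Dict[Hypothesis, float] = {}
        for hyp, score in active.items():
            for phoneme, lp in log_probs.items():
                candidates[hyp + (phoneme,)] = score + lp
                expanded += 1  # one hypothesised phoneme
        active = prune(candidates, beam_width, max_active)
    best_hyp = max(active, key=active.get)
    return best_hyp, active[best_hyp], expanded
```

Running the same input once with an unbounded beam and once with a narrow one shows the trade-off the abstract describes: the pruning parameters are tightened as far as possible while the recognised sequence, and hence the error rate, stays the same.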
Pages: 2147-2150
Page count: 4