Improvement in N-best search for continuous speech recognition

Authors
Illina, I
Gong, YF
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
In this paper, several techniques for reducing the search complexity of beam search for continuous speech recognition tasks are proposed. Six heuristic pruning methods are described, and the pruning parameters are adjusted to keep the word error rate constant while reducing the computational complexity and memory demand. The effect of each pruning method is evaluated using the Mixture Stochastic Trajectory Model (MSTM), a segment-based model that uses phonemes as the speech units. Tests on a speaker-dependent continuous speech recognition task show that the pruning methods yield a substantial 67% reduction in search effort, measured as the number of phonemes hypothesised during the search. All proposed techniques are independent of the acoustic models and are therefore applicable to other acoustic modeling techniques.
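The abstract does not spell out the six pruning heuristics, so the following is only a generic sketch of how beam pruning limits the number of active hypotheses. It combines two textbook criteria (a score threshold relative to the best hypothesis and a cap on the number of survivors). The function name `prune_beam` and the parameters `beam_width` and `max_active` are hypothetical and are not taken from the paper.

```python
def prune_beam(hypotheses, beam_width, max_active):
    """Generic beam pruning sketch (not the paper's specific heuristics).

    hypotheses: list of (label, log_score) pairs, higher score is better.
    beam_width: keep only hypotheses whose score is within this margin
                of the best score (threshold pruning).
    max_active: upper bound on the number of surviving hypotheses
                (rank-based, or "histogram", pruning).
    """
    if not hypotheses:
        return []
    best = max(score for _, score in hypotheses)
    # Threshold pruning: drop anything too far below the best score.
    survivors = [(h, s) for h, s in hypotheses if s >= best - beam_width]
    # Rank pruning: keep at most max_active of the best survivors.
    survivors.sort(key=lambda pair: pair[1], reverse=True)
    return survivors[:max_active]


# Hypothetical phoneme hypotheses with made-up log scores.
active = [("a", -1.0), ("b", -2.5), ("c", -8.0), ("d", -1.5)]
kept = prune_beam(active, beam_width=3.0, max_active=2)
```

Tightening `beam_width` or `max_active` reduces the number of hypotheses expanded at each step, at the risk of discarding the correct one. That is the trade-off the abstract describes when it tunes the pruning parameters to hold the word error rate constant.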
Pages: 2147 - 2150
Page count: 4
Related papers
50 in total
  • [21] N-best rescoring for speech recognition using penalized logistic regression machines with garbage class
    Birkenes, Oystein
    Matsui, Tomoko
    Tanabe, Kunio
    Myrvoll, Tor Andre
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 449 - +
  • [22] N-best: The Northern- and Southern-Dutch Benchmark Evaluation of Speech recognition Technology
    Kessens, Judith
    van Leeuwen, David
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1173 - 1176
  • [23] Simultaneous recognition of distant-talking speech of multiple sound sources based on 3-D N-best search algorithm
    Heracleous, P
    Nakamura, S
    Shikano, K
    ASRU 2001: IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, CONFERENCE PROCEEDINGS, 2001, : 111 - 114
  • [24] Rescoring of N-Best Hypotheses Using Top-Down Selective Attention for Automatic Speech Recognition
    Kim, Ho-Gyeong
    Lee, Hwaran
    Kim, Geonmin
    Oh, Sang-Hoon
    Lee, Soo-Young
    IEEE SIGNAL PROCESSING LETTERS, 2018, 25 (02) : 199 - 203
  • [25] Multimodal N-best List Rescoring with Weakly Supervised Pre-training in Hybrid Speech Recognition
    Song, Yuanfeng
    Huang, Xiaoling
    Zhao, Xuefang
    Jiang, Di
    Wong, Raymond Chi-Wing
    2021 21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2021), 2021, : 1336 - 1341
  • [26] A discriminative training framework using N-best speech recognition transcriptions and scores for spoken utterance classification
    Yaman, Sibel
    Deng, Li
    Yu, Dong
    Wang, Ye-Yi
    Acero, Alex
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 5 - +
  • [27] Improved speech recognition using acoustic and lexical correlates of pitch accent in a N-best rescoring framework
    Ananthakrishnan, Sankaranarayanan
    Narayanan, Shrikanth
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 873 - +
  • [28] N-best speech hypothesis reordering based on comprehensive information theory
    Liu, JY
    Zhong, YX
    2003 INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, PROCEEDINGS, 2003, : 29 - 32
  • [29] Morpho-syntactic post-processing of N-best lists for improved French automatic speech recognition
    Huet, Stephane
    Gravier, Guillaume
    Sebillot, Pascale
    COMPUTER SPEECH AND LANGUAGE, 2010, 24 (04) : 663 - 684
  • [30] Empirically combining unnormalized NNLM and back-off N-gram for fast N-best rescoring in speech recognition
    Shi, Yongzhe
    Zhang, Wei-Qiang
    Cai, Meng
    Liu, Jia
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2014,