A Fast Approximate Acoustic Match for Large Vocabulary Speech Recognition

Cited by: 14
Authors
Bahl, Lalit R. [1]
De Gennaro, Steven V. [1]
Gopalakrishnan, P. S. [1]
Mercer, Robert L. [1]
Affiliations
[1] IBM Thomas J Watson Res Ctr, Speech Recognit Grp, Yorktown Hts, NY 10598 USA
Source
Keywords
DOI
10.1109/89.221368
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In a large vocabulary speech recognition system using hidden Markov models, calculating the likelihood of an acoustic signal segment for all the words in the vocabulary involves a large amount of computation. In order to run in real time on a modest amount of hardware, it is important that these detailed acoustic likelihood computations be performed only on words which have a reasonable probability of being the word that was spoken. We describe a scheme for rapidly obtaining an approximate acoustic match for all the words in the vocabulary in such a way as to ensure that the correct word is, with high probability, one of a small number of words examined in detail. Using fast search methods we obtain a matching algorithm that is about a hundred times faster than doing a detailed acoustic likelihood computation on all the words in the IBM Office Correspondence isolated word dictation task which has a vocabulary of 20 000 words. We give experimental results showing the effectiveness of such a fast match for a number of talkers.
Pages: 59 - 67
Number of pages: 9
Related papers
50 in total
  • [1] A fast HMM match algorithm for very large vocabulary speech recognition
    Seward, A
    [J]. SPEECH COMMUNICATION, 2004, 42 (02) : 191 - 206
  • [2] A new verification-based fast-match for large vocabulary continuous speech recognition
    Afify, M
    Liu, F
    Jiang, H
    Siohan, O
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, 13 (04) : 546 - 553
  • [3] Boosting acoustic models in large vocabulary speech recognition
    Meyer, C
    Schramm, H
    [J]. PROCEEDINGS OF THE SIXTH IASTED INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING, 2004, : 255 - 260
  • [4] Building DNN acoustic models for large vocabulary speech recognition
    Maas, Andrew L.
    Qi, Peng
    Xie, Ziang
    Hannun, Awni Y.
    Lengerich, Christopher T.
    Jurafsky, Daniel
    Ng, Andrew Y.
    [J]. COMPUTER SPEECH AND LANGUAGE, 2017, 41 : 195 - 213
  • [5] Boosting HMM acoustic models in large vocabulary speech recognition
    Meyer, C
    Schramm, H
    [J]. SPEECH COMMUNICATION, 2006, 48 (05) : 532 - 548
  • [6] A clustering algorithm for the fast match of acoustic conditions in continuous speech recognition
    Rodríguez, LJ
    Torres, MI
    [J]. PATTERN RECOGNITION AND IMAGE ANALYSIS, PT 2, PROCEEDINGS, 2005, 3523 : 562 - 570
  • [7] Acoustic models of the elderly for large-vocabulary continuous speech recognition
    Baba, A
    Yoshizawa, S
    Yamada, M
    Lee, A
    Shikano, K
    [J]. ELECTRONICS AND COMMUNICATIONS IN JAPAN PART II-ELECTRONICS, 2004, 87 (07) : 49 - 57
  • [8] HYBRID ACOUSTIC MODELS FOR DISTANT AND MULTICHANNEL LARGE VOCABULARY SPEECH RECOGNITION
    Swietojanski, Pawel
    Ghoshal, Arnab
    Renals, Steve
    [J]. 2013 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2013, : 285 - 290
  • [9] Unsupervised training of acoustic models for large vocabulary continuous speech recognition
    Wessel, F
    Ney, H
    [J]. ASRU 2001: IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, CONFERENCE PROCEEDINGS, 2001, : 307 - 310
  • [10] Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition
    Soltau, Hagen
    Liao, Hank
    Sak, Hasim
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3707 - 3711