Efficient speech recognition using subvector quantization and discrete-mixture HMMs

Cited by: 20

Authors:

Digalakis, V [1]

Tsakalidis, S

Harizakis, C

Neumeyer, L

Affiliations:

[1] Tech Univ Crete, Dept Elect & Comp Engn, Hania 73100, Greece

[2] SRI Int, Menlo Park, CA 94025 USA

Source:

COMPUTER SPEECH AND LANGUAGE | 2000 / Vol. 14 / Issue 01

Keywords:

DOI:

10.1006/csla.1999.0134

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper introduces a new form of observation distributions for hidden Markov models (HMMs), combining subvector quantization and mixtures of discrete distributions. Despite what is generally believed, we show that discrete-distribution HMMs can outperform continuous-density HMMs at significantly faster decoding speeds. Performance of the discrete HMMs is improved by using product-code vector quantization (VQ) and mixtures of discrete distributions. The decoding speed of the discrete HMMs is also improved by quantizing subvectors of coefficients, since this reduces the number of table lookups needed to compute the output probabilities. We present efficient training and decoding algorithms for the discrete-mixture HMMs (DMHMMs). Our experimental results in the air-travel information domain show that the high level of recognition accuracy of continuous-mixture-density HMMs (CDHMMs) can be maintained at significantly faster decoding speeds. Moreover, we show that when the same number of mixture components is used in DMHMMs and CDHMMs, the new models exhibit superior recognition performance. © 2000 Academic Press.
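
The abstract's central idea, a mixture of discrete distributions over subvector codewords evaluated by table lookup, can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes the output probability of a state takes the form b(x) = sum over m of w_m times the product over l of P_{m,l}(q_l(x_l)), where q_l is the vector quantizer for subvector l. The function names, the subvector split, the codebooks and the probability tables are all hypothetical toy values chosen for illustration.

```python
# Toy sketch of a discrete-mixture output probability with subvector
# quantization. All names and numbers are illustrative assumptions,
# not values or code from the paper.

def quantize(subvector, codebook):
    """Return the index of the nearest codeword (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for index, codeword in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(subvector, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def encode_frame(frame, sizes, codebooks):
    """Split a feature frame into subvectors and quantize each one."""
    codes, start = [], 0
    for size, codebook in zip(sizes, codebooks):
        codes.append(quantize(frame[start:start + size], codebook))
        start += size
    return codes


def dmhmm_output_prob(codes, weights, tables):
    """Sum over mixture components of weight times product of table lookups.

    tables[m][l][k] is the assumed probability of codeword k of subvector l
    under mixture component m.
    """
    total = 0.0
    for weight, component in zip(weights, tables):
        prob = weight
        for sub_index, code in enumerate(codes):
            prob *= component[sub_index][code]  # one table lookup
        total += prob
    return total


# Hypothetical setup: a 4-dimensional frame split into two 2-dimensional
# subvectors, two codewords per subvector, two mixture components.
CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [2.0, 2.0]],
]
WEIGHTS = [0.5, 0.5]
TABLES = [
    [[0.9, 0.1], [0.2, 0.8]],  # component 0
    [[0.3, 0.7], [0.6, 0.4]],  # component 1
]

if __name__ == "__main__":
    codes = encode_frame([0.1, -0.2, 1.9, 2.1], [2, 2], CODEBOOKS)
    print(codes)
    print(round(dmhmm_output_prob(codes, WEIGHTS, TABLES), 6))
```

In this sketch the frame is quantized once, and the resulting codeword indices can be reused for every HMM state. Each state then costs one lookup per mixture component per subvector, with no Gaussian evaluation.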


Pages: 33-46

Number of pages: 14

