Feature pruning in likelihood evaluation of HMM-based speech recognition

Cited by: 1

Authors:

Li, X [1]

Bilmes, J [1]

Affiliation:

[1] Univ Washington, Dept Elect Engn, Seattle, WA 98195 USA

Source:

ASRU'03: 2003 IEEE Workshop on Automatic Speech Recognition and Understanding | 2003

Keywords:

DOI:

10.1109/ASRU.2003.1318458

Chinese Library Classification:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In this work, we present a simple yet effective technique for reducing the cost of likelihood computation in ASR systems that use continuous-density HMMs. In a variety of speech recognition tasks, likelihood evaluation accounts for a significant portion of the total computational load. Under certain conditions, the proposed method evaluates the component likelihoods of only some features and approximates those of the remaining (pruned) features by prediction. We investigate two feature clustering approaches for use with this pruning technique. While a simple sequential clustering works remarkably well, a data-driven approach performs even better at saving computation while maintaining baseline recognition accuracy. With the data-driven approach, likelihood evaluation is sped up by 33% and its power consumption is reduced by 27% on an isolated word recognition task. For a continuous speech recognition system using either monophone or triphone models, the speedup and power reduction of the likelihood evaluation are 50% and 35%, respectively.
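The abstract does not state the pruning condition, the predictor, or how features are clustered, so the following is only a rough sketch of the general idea under loudly stated assumptions. It assumes a diagonal-covariance Gaussian component (so the log-likelihood is a sum of per-feature terms), a sequential split into a first block of `n_first` features and a remainder, a hypothetical threshold test on the partial score, and a hypothetical predictor that replaces each pruned term with its expected value under the component itself. All names (`pruned_log_likelihood`, `n_first`, `threshold`) are illustrative and are not taken from the paper.

```python
import math


def feature_log_terms(x, mean, var):
    """Per-feature log-likelihood terms of a diagonal-covariance Gaussian."""
    return [
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    ]


def full_log_likelihood(x, mean, var):
    """Baseline: evaluate every feature term and sum them."""
    return sum(feature_log_terms(x, mean, var))


def pruned_log_likelihood(x, mean, var, n_first, threshold):
    """Evaluate the first n_first features; optionally predict the rest.

    Assumption (not from the paper): pruning is triggered when the partial
    score over the first block falls below `threshold`.
    Returns (score, was_pruned).
    """
    partial = sum(feature_log_terms(x[:n_first], mean[:n_first], var[:n_first]))
    if partial >= threshold:
        # Component still looks plausible: evaluate the remaining features exactly.
        rest = sum(feature_log_terms(x[n_first:], mean[n_first:], var[n_first:]))
        return partial + rest, False
    # Hypothetical predictor: the expected value of each remaining term when
    # x is drawn from the component itself, -0.5 * (log(2*pi*v) + 1).
    # It depends only on the variances, so it could be precomputed offline.
    predicted = sum(-0.5 * (math.log(2.0 * math.pi * v) + 1.0) for v in var[n_first:])
    return partial + predicted, True


if __name__ == "__main__":
    mean = [0.0, 0.0, 0.0, 0.0]
    var = [1.0, 1.0, 1.0, 1.0]
    print(pruned_log_likelihood([0.0, 0.0, 0.0, 0.0], mean, var, 2, -10.0))
    print(pruned_log_likelihood([5.0, 5.0, 0.0, 0.0], mean, var, 2, -10.0))
```

In this sketch the saving comes from skipping the squared-distance computations for the pruned block whenever the first block already marks the component as unlikely. The actual conditions, predictor, and data-driven clustering used by the authors would need to be taken from the paper itself.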


Pages: 303-308

Page count: 6


