Speech recognition with dynamic Bayesian networks

Cited by: 0

Authors:

Zweig, G [1]

Russell, S [1]

Affiliations:

[1] Univ Calif Berkeley, Dept Comp Sci, Berkeley, CA 94720 USA

Source:

FIFTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-98) AND TENTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE (IAAI-98) - PROCEEDINGS | 1998

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Dynamic Bayesian networks (DBNs) are a useful tool for representing complex stochastic processes. Recent developments in inference and learning in DBNs allow their use in real-world applications. In this paper, we apply DBNs to the problem of speech recognition. The factored state representation enabled by DBNs allows us to explicitly represent long-term articulatory and acoustic context in addition to the phonetic-state information maintained by hidden Markov models (HMMs). Furthermore, it enables us to model the short-term correlations among multiple observation streams within single time-frames. Given a DBN structure capable of representing these long- and short-term correlations, we applied the EM algorithm to learn models with up to 500,000 parameters. The use of structured DBN models decreased the error rate by 12 to 29% on a large-vocabulary isolated-word recognition task, compared to a discrete HMM; it also improved significantly on other published results for the same task. This is the first successful application of DBNs to a large-scale speech recognition problem. Investigation of the learned models indicates that the hidden state variables are strongly correlated with acoustic properties of the speech signal.
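
To make the idea of a factored state concrete, the following is a minimal sketch, not the authors' model. It assumes a toy DBN with two hidden variables per frame, a phonetic state `q` and a slowly varying context variable `c`, each with its own transition table, and one discrete observation that depends on both. The topology, the variable names, and every probability value are hypothetical and chosen only for illustration. The likelihood of an observation sequence is computed with the standard forward recursion over the joint `(q, c)` state.

```python
from itertools import product

# Hypothetical toy model: every number below is made up for illustration.
Q_STATES = (0, 1)   # phonetic state
C_STATES = (0, 1)   # slowly varying context variable
OBS = (0, 1)        # discrete acoustic symbol

PRIOR_Q = [0.6, 0.4]
PRIOR_C = [0.5, 0.5]
TRANS_Q = [[0.7, 0.3], [0.2, 0.8]]   # TRANS_Q[prev][cur] = P(q_t | q_{t-1})
TRANS_C = [[0.9, 0.1], [0.1, 0.9]]   # TRANS_C[prev][cur] = P(c_t | c_{t-1})
# EMIT[q][c][o] = P(o_t | q_t, c_t)
EMIT = [[[0.9, 0.1], [0.6, 0.4]],
        [[0.3, 0.7], [0.1, 0.9]]]


def likelihood(observations):
    """Return P(observations) via the forward recursion over joint (q, c)."""
    if not observations:
        return 1.0
    # Initialisation: prior over both hidden variables times first emission.
    alpha = {
        (q, c): PRIOR_Q[q] * PRIOR_C[c] * EMIT[q][c][observations[0]]
        for q, c in product(Q_STATES, C_STATES)
    }
    # Recursion: the joint transition factorises into two small tables.
    for o in observations[1:]:
        alpha = {
            (q, c): EMIT[q][c][o] * sum(
                a * TRANS_Q[pq][q] * TRANS_C[pc][c]
                for (pq, pc), a in alpha.items()
            )
            for q, c in product(Q_STATES, C_STATES)
        }
    return sum(alpha.values())
```

In this toy setting the two transition tables have 2 + 2 = 4 free parameters, whereas an unconstrained HMM over the four joint states would need 4 × 3 = 12. That gap is the kind of saving a factored representation offers, under the assumption that the variables evolve independently given their own past.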

Pages: 173 - 180

Page count: 8

50 items in total

[1] Dynamic Bayesian networks for automatic speech recognition
Deviren, M
[J]. EIGHTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-02)/FOURTEENTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE (IAAI-02), PROCEEDINGS, 2002: 981 - 981
[2] Speech recognition using dynamic programming of Bayesian neural networks
Huang, CC
Wang, JF
Wu, CH
Lee, JY
[J]. CENTRAL AUDITORY PROCESSING AND NEURAL MODELING, 1998: 71 - 76
[3] Dynamic Bayesian networks for audio-visual speech recognition
Nefian, AV
Liang, LH
Pi, XB
Liu, XX
Murphy, K
[J]. EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2002, 2002 (11): 1274 - 1288
[4] Dynamic Bayesian Networks for Audio-Visual Speech Recognition
Ara V. Nefian
Luhong Liang
Xiaobo Pi
Xiaoxing Liu
Kevin Murphy
[J]. EURASIP Journal on Advances in Signal Processing, 2002
[5] Dynamic Bayesian networks for multi-band automatic speech recognition
Daoudi, K
Fohr, D
Antoine, C
[J]. COMPUTER SPEECH AND LANGUAGE, 2003, 17 (2-3): 263 - 285
[6] Switching auxiliary chains for speech recognition based on dynamic Bayesian networks
Lin, Hui
Ou, Zhijian
[J]. 18TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 4, PROCEEDINGS, 2006: 258 - +
[7] Continuous speech recognition using dynamic Bayesian networks: A fast decoding algorithm
Deviren, M
Daoudi, K
[J]. ADVANCES IN BAYESIAN NETWORKS, 2004, 146: 289 - 308
[8] DYNAMIC NEURAL NETWORKS FOR SPEECH RECOGNITION
OLAFSSON, S
[J]. BT TECHNOLOGY JOURNAL, 1992, 10 (03): 48 - 58
[9] Dynamic Bayesian networks for visual recognition of dynamic gestures
Avilés-Arriaga, HH
Sucar, LE
[J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2002, 12 (3-4): 243 - 250
[10] Dynamic Bayesian network inversion for robust speech recognition
Xie, Lei
Yang, Hongwu
[J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2007, E90D (07): 1117 - 1120