Bayesian networks in multimodal speech recognition and speaker identification

Cited by: 0

Authors:

Nefian, AV [1]

Liang, LH [1]

Affiliations:

[1] Intel Corp, Santa Clara, CA 95051 USA

Source:

CONFERENCE RECORD OF THE THIRTY-SEVENTH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, VOLS 1 AND 2 | 2003

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Bayesian networks are statistical models that extend the framework of hidden Markov models (HMMs) and allow for the analysis of multimodal signals such as audio-visual speech. Our recent results demonstrate the use of coupled HMMs in audio-visual speech recognition and speaker identification. The increased performance of this model is due to its low complexity and its ability to describe both the audio-visual state asynchrony and the natural dependency over time. The audio-visual speaker identification accuracy is enhanced in a late decision approach that integrates the audio-visual speech likelihood and the face likelihood computed using an embedded Bayesian network.
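The late decision step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so the weighted sum of log-likelihoods below is an assumption, as are the function name `late_fusion_identify`, the `weight` parameter, and the example scores, all of which are hypothetical and not taken from the paper.

```python
def late_fusion_identify(av_loglik, face_loglik, weight=0.5):
    """Pick the speaker with the highest combined score.

    av_loglik and face_loglik map speaker IDs to log-likelihoods from two
    separately trained models. The weighted-sum rule is an assumed fusion
    rule for illustration, not the rule used in the paper.
    """
    scores = {
        spk: weight * av_loglik[spk] + (1.0 - weight) * face_loglik[spk]
        for spk in av_loglik
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical per-speaker log-likelihoods (made-up numbers).
av = {"alice": -120.0, "bob": -118.0}
face = {"alice": -40.0, "bob": -55.0}
best, scores = late_fusion_identify(av, face, weight=0.5)
```

In this toy case the audio-visual speech score slightly favours one speaker while the face score favours the other, and the fused score settles the decision. In practice the weight would be tuned on held-out data.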


Pages: 2004-2008

Number of pages: 5
