Bayesian networks in multimodal speech recognition and speaker identification

Authors
Nefian, AV [1]
Liang, LH [1]
Affiliation
[1] Intel Corp, Santa Clara, CA 95051 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Bayesian networks are statistical models that extend the framework of hidden Markov models (HMMs) and allow for the analysis of multimodal signals such as audio-visual speech. Our recent results demonstrate the use of the coupled HMM in audio-visual speech recognition and speaker identification. The increased performance of this model is due to its low complexity and its ability to describe both the asynchrony between the audio and visual states and their natural dependency over time. Audio-visual speaker identification accuracy is further enhanced by a late decision approach that integrates the audio-visual speech likelihood with the face likelihood computed using an embedded Bayesian network.
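The late decision step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the combination rule, so the weighted sum of log-likelihoods used here, the `weight` parameter, the function name `late_fusion_identify`, and all numeric scores are assumptions made for illustration only, not the authors' method.

```python
def late_fusion_identify(av_speech_loglik, face_loglik, weight=0.5):
    """Pick the speaker with the highest fused score.

    av_speech_loglik: dict mapping speaker -> audio-visual speech log-likelihood
    face_loglik:      dict mapping speaker -> face log-likelihood
    weight:           assumed mixing weight in [0, 1] for the speech stream
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if av_speech_loglik.keys() != face_loglik.keys():
        raise ValueError("both score tables must cover the same speakers")

    # Assumed fusion rule: convex combination of the two log-likelihoods.
    scores = {
        spk: weight * av_speech_loglik[spk] + (1.0 - weight) * face_loglik[spk]
        for spk in av_speech_loglik
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical scores for two enrolled speakers.
av = {"alice": -120.0, "bob": -118.0}
face = {"alice": -40.0, "bob": -55.0}
best, scores = late_fusion_identify(av, face, weight=0.5)
```

With these made-up scores, the speech stream alone slightly favors "bob", while the fused score favors "alice" because the face stream separates the two speakers more strongly. In practice the weight would have to be tuned on held-out data.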
Pages: 2004-2008
Number of pages: 5