Gaussian mixture language models for speech recognition

Cited by: 0
Authors
Afify, Mohamed [1]
Siohan, Olivier [1]
Sarikaya, Ruhi [1]
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, 1101 Old Kitchawan Rd, Yorktown Hts, NY 10598 USA
Keywords
language model; N-gram; Gaussian mixture model; continuous space
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We propose a Gaussian mixture language model (GMLM) for speech recognition. Two potential benefits of this model are smoothing of unseen events and ease of adaptation. We show how the model can be used alone or in conjunction with a conventional N-gram model to calculate word probabilities. An interesting feature of the proposed technique is that many methods developed for acoustic models can be easily ported to the GMLM. We developed two implementations of the proposed model for large-vocabulary Arabic speech recognition, with results comparable to those of a conventional N-gram model.
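The abstract does not give the formulation, so the following is only an illustrative sketch of how a Gaussian mixture model could produce word probabilities and be combined with an N-gram model. It assumes that the word history is mapped to a continuous vector x, that each word w has a diagonal-covariance mixture p(x | w), and that P(w | x) follows from Bayes' rule. The function names, the toy vocabulary, and the interpolation weight `lam` are hypothetical and are not taken from the paper.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def log_gmm(x, components):
    """Log density of a mixture; components is a list of (weight, mean, var)."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in components]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))  # log-sum-exp

def gmlm_posteriors(x, word_gmms, priors):
    """Assumed Bayes-rule form: P(w | x) proportional to p(x | w) * P(w)."""
    scores = {w: math.log(priors[w]) + log_gmm(x, word_gmms[w]) for w in word_gmms}
    top = max(scores.values())
    norm = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {w: math.exp(s - norm) for w, s in scores.items()}

def interpolate(p_gmlm, p_ngram, lam=0.5):
    """Hypothetical linear interpolation with an N-gram probability."""
    return lam * p_gmlm + (1.0 - lam) * p_ngram

# Toy two-word vocabulary with made-up parameters (illustration only).
word_gmms = {
    "cat": [(1.0, [0.0, 0.0], [1.0, 1.0])],
    "dog": [(0.5, [2.0, 2.0], [1.0, 1.0]), (0.5, [3.0, 3.0], [1.0, 1.0])],
}
priors = {"cat": 0.5, "dog": 0.5}
x = [0.1, -0.2]  # assumed continuous representation of the history

post = gmlm_posteriors(x, word_gmms, priors)
p_cat = interpolate(post["cat"], p_ngram=0.3, lam=0.5)
```

Because every history vector receives a nonzero density under each word's mixture, unseen histories still get a probability, which is one way to read the smoothing benefit mentioned in the abstract.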
Pages: 29 / +
Page count: 2