Gaussian mixture language models for speech recognition

Authors
Afify, Mohamed [1]
Siohan, Olivier [1]
Sarikaya, Ruhi [1]
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, 1101 Old Kitchawan Rd, Yorktown Hts, NY 10598 USA
Keywords
language model; N-gram; Gaussian mixture model; continuous space;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
We propose a Gaussian mixture language model (GMLM) for speech recognition. Two potential benefits of this model are smoothing of unseen events and ease of adaptation. We show how the model can be used alone or in conjunction with a conventional N-gram model to calculate word probabilities. An interesting feature of the proposed technique is that many methods developed for acoustic models can be easily ported to the GMLM. We developed two implementations of the proposed model for large-vocabulary Arabic speech recognition, with results comparable to those of a conventional N-gram model.
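The abstract does not spell out how word probabilities are computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each word history is represented as a continuous vector, that each word has its own diagonal-covariance Gaussian mixture over history vectors, that the word posterior follows from Bayes' rule, and that the result is linearly interpolated with an N-gram probability. All function names, the toy vocabulary, and the interpolation weight `lam` are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_gmm(x, components):
    """Log density of a mixture; components is a list of (weight, mean, var)."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in components]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def gmlm_posteriors(history_vec, priors, gmms):
    """Assumed formulation: P(w | h) proportional to P(w) * p(h | w)."""
    scores = {
        w: math.log(priors[w]) + log_gmm(history_vec, gmms[w]) for w in priors
    }
    top = max(scores.values())
    norm = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {w: math.exp(s - norm) for w, s in scores.items()}


def interpolate(p_gmlm, p_ngram, lam=0.5):
    """Linear interpolation with an N-gram distribution (lam is hypothetical)."""
    return {w: lam * p_gmlm[w] + (1.0 - lam) * p_ngram[w] for w in p_gmlm}


# Toy two-word vocabulary with one Gaussian per word (illustrative values only).
priors = {"cat": 0.5, "dog": 0.5}
gmms = {
    "cat": [(1.0, [0.0, 0.0], [1.0, 1.0])],
    "dog": [(1.0, [2.0, 2.0], [1.0, 1.0])],
}
history = [0.0, 0.0]  # hypothetical continuous representation of the history
post = gmlm_posteriors(history, priors, gmms)
mixed = interpolate(post, {"cat": 0.4, "dog": 0.6}, lam=0.5)
```

Because every word receives a nonzero density for any history vector, a history never seen in training still yields a nonzero word probability, which is the smoothing property the abstract mentions.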
Pages: 29 / +
Page count: 2
Related papers
50 in total
  • [31] Skew Gaussian mixture models for speaker recognition
    Matza, Avi
    Bistritz, Yuval
    [J]. IET SIGNAL PROCESSING, 2014, 8 (08) : 860 - 867
  • [33] Cross-Lingual Subspace Gaussian Mixture Models for Low-Resource Speech Recognition
    Lu, Liang
    Ghoshal, Arnab
    Renals, Steve
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (01) : 17 - 27
  • [34] DEEP NEURAL NETWORKS WITH AUXILIARY GAUSSIAN MIXTURE MODELS FOR REAL-TIME SPEECH RECOGNITION
    Lei, Xin
    Lin, Hui
    Heigold, Georg
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7634 - 7638
  • [36] MAXIMUM A POSTERIORI ADAPTATION OF SUBSPACE GAUSSIAN MIXTURE MODELS FOR CROSS-LINGUAL SPEECH RECOGNITION
    Lu, Liang
    Ghoshal, Arnab
    Renals, Steve
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4877 - 4880
  • [37] Sparse Gaussian Graphical Models for Speech Recognition
    Bell, Peter
    King, Simon
    [J]. INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1545 - 1548
  • [38] Domain Adaptation Based on Mixture of Latent Words Language Models for Automatic Speech Recognition
    Masumura, Ryo
    Asami, Taichi
    Oba, Takanobu
    Masataki, Hirokazu
    Sakauchi, Sumitaka
    Ito, Akinori
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2018, E101D (06): : 1581 - 1590
  • [39] Speech Enhancement Using Gaussian Scale Mixture Models
    Hao, Jiucang
    Lee, Te-Won
    Sejnowski, Terrence J.
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (06): : 1127 - 1136
  • [40] Emotional speech classification using Gaussian mixture models
    Ververidis, D
    Kotropoulos, C
    [J]. 2005 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), VOLS 1-6, CONFERENCE PROCEEDINGS, 2005, : 2871 - 2874