Speech recognition using probabilistic and statistical models

Authors
Singh, Amber [1]
Anand, R. S. [1]
Affiliations
[1] Indian Inst Technol, Dept Elect Engn, Roorkee, Uttar Pradesh, India
Keywords
Automatic speech recognition (ASR); Mel frequency cepstral coefficients (MFCCs); EM algorithm; Hidden Markov model; Gaussian mixture model; Vector quantization; Gaussian mixture model-universal background model
DOI
10.1109/CICN.2015.141
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents an implementation of probabilistic and statistical models for speech recognition. Three models are discussed: the Gaussian mixture model (GMM), the hidden Markov model (HMM), and the Gaussian mixture model-universal background model (GMM-UBM). For the GMM, both identification of unknown isolated words and classification of unknown test patterns are discussed. For the HMM, identification of isolated words is discussed. For the GMM-UBM, identification of isolated words and classification of unknown test patterns are discussed. Isolated-word recognizers built using all three models can give 100% accuracy, depending on the initialization of the models. GMM-UBM was not found suitable for the classification of unknown test patterns.
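As an illustration of the GMM-based isolated-word identification the abstract describes, the following is a minimal sketch, not the authors' implementation. It assumes one diagonal-covariance GMM per word whose parameters are already trained (for example by EM), scores a sequence of feature frames under each model, and picks the word with the highest log-likelihood. The function names, the two-word vocabulary, and the two-dimensional feature values are all hypothetical stand-ins for real MFCC vectors.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of a frame sequence under a GMM.

    gmm is a list of (weight, mean, var) components.
    """
    total = 0.0
    for x in frames:
        logs = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(logs)  # log-sum-exp trick for numerical stability
        total += peak + math.log(sum(math.exp(l - peak) for l in logs))
    return total

def recognize(frames, word_models):
    """Return the word whose GMM gives the highest log-likelihood."""
    return max(
        word_models,
        key=lambda word: gmm_log_likelihood(frames, word_models[word]),
    )

# Hypothetical, hand-set models; real parameters would come from training.
word_models = {
    "yes": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "no": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}

# Made-up test frames standing in for the MFCCs of an unknown word.
test_frames = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]
predicted = recognize(test_frames, word_models)
print(predicted)
```

Because the decision depends only on which model scores highest, poorly initialized models can change the outcome, which is in line with the abstract's remark that accuracy depends on initialization.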
Pages: 686-690
Number of pages: 5