COMPARISON OF TEXT-INDEPENDENT SPEAKER RECOGNITION METHODS USING VECTOR-QUANTIZATION DISTORTION AND DISCRETE AND CONTINUOUS HMMS

Authors
MATSUI, T
FURUI, S
Affiliation
[1] NTT Human Interface Laboratories, Musashino
Keywords
SPEAKER RECOGNITION; TEXT-INDEPENDENT; VECTOR QUANTIZATION; ERGODIC HMM; UTTERANCE VARIATION;
DOI
10.1002/ecjc.4430771207
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology];
Discipline classification codes
0808; 0809;
Abstract
Text-independent speaker recognition methods based on vector quantization (VQ) distortion and on discrete or continuous ergodic hidden Markov models (HMMs) are compared. Their effectiveness is examined in terms of robustness against utterance variation, such as differences in content, temporal variation, and changes in utterance speed. The continuous HMM is shown to perform much better than the discrete HMM, with performance close to that of the VQ distortion method. When the amount of training data is limited, however, the VQ distortion method achieves a better recognition rate than the continuous HMM. The transition information between states is shown to contribute little to identifying the individual characteristics of a voice. Increasing the number of states and increasing the number of mixture components per state have an equal effect, and recognition performance is almost completely determined by the product of these two numbers.
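As a rough illustration of the VQ distortion approach named in the abstract, the sketch below trains one codebook per speaker and assigns a test utterance to the speaker whose codebook yields the lowest average distortion. It is a minimal sketch and not the authors' implementation: the k-means training, the squared Euclidean distance, the synthetic 2-D feature frames, and all function names (`train_codebook`, `vq_distortion`, `identify`, `synth`) are assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(frames, size, iters=20, seed=0):
    """Build a codebook with plain k-means (an assumed training scheme)."""
    rng = random.Random(seed)
    codebook = rng.sample(frames, size)
    for _ in range(iters):
        # Assign every frame to its nearest codeword.
        cells = [[] for _ in codebook]
        for f in frames:
            idx = min(range(size), key=lambda k: sq_dist(f, codebook[k]))
            cells[idx].append(f)
        # Move each codeword to the mean of its cell; keep it if the cell is empty.
        codebook = [
            [sum(col) / len(cell) for col in zip(*cell)] if cell else codebook[k]
            for k, cell in enumerate(cells)
        ]
    return codebook


def vq_distortion(frames, codebook):
    """Average distance from each frame to its nearest codeword."""
    return sum(min(sq_dist(f, c) for c in codebook) for f in frames) / len(frames)


def identify(frames, codebooks):
    """Return the speaker whose codebook gives the lowest distortion."""
    return min(codebooks, key=lambda name: vq_distortion(frames, codebooks[name]))


def synth(center, n, seed):
    """Synthetic stand-in for spectral feature frames of one speaker."""
    rng = random.Random(seed)
    return [[c + rng.gauss(0, 0.3) for c in center] for _ in range(n)]


# Hypothetical training data for two speakers, "A" and "B".
train = {"A": synth([0.0, 0.0], 200, 1), "B": synth([3.0, 3.0], 200, 2)}
codebooks = {name: train_codebook(fr, size=4) for name, fr in train.items()}

# A test utterance drawn from the same distribution as speaker "B".
test_utt = synth([3.0, 3.0], 50, 3)
print(identify(test_utt, codebooks))
```

Because the score averages over frames without regard to their order, this kind of model uses no temporal or transition information, which fits the abstract's observation that state transitions contribute little to identifying a speaker.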
Pages: 63-70
Page count: 8
Related papers
50 in total
  • [21] Codebook design using DCT coder for text-independent speaker recognition
    Lung, SY
    Proceedings of the Sixth IASTED International Conference on Signal and Image Processing, 2004, : 261 - 263
  • [22] Comparison of Text-Independent Original Speaker Recognition from Emotionally Converted Speech
    Pribil, Jiri
    Pribilova, Anna
    RECENT ADVANCES IN NONLINEAR SPEECH PROCESSING, 2016, 48 : 137 - 149
  • [23] TEXT-INDEPENDENT SPEAKER RECOGNITION USING TWO-DIMENSIONAL INFORMATION ENTROPY
    Bozilovic, Bosko
    Todorovic, Branislav M.
    Obradovic, Miroslav
    JOURNAL OF ELECTRICAL ENGINEERING-ELEKTROTECHNICKY CASOPIS, 2015, 66 (03): : 169 - 173
  • [24] Text-independent speaker recognition using LSTM-RNN and speech enhancement
    Abd El-Moneim, Samia
    Nassar, M. A.
    Dessouky, Moawad I.
    Ismail, Nabil A.
    El-Fishawy, Adel S.
    Abd El-Samie, Fathi E.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (33-34) : 24013 - 24028
  • [26] TEXT-INDEPENDENT SPEAKER IDENTIFICATION USING NEURAL NETS AND AR-VECTOR MODELS
    HADJITODOROV, S
    BOYANOV, B
    IVANOV, T
    DALAKCHIEVA, N
    ELECTRONICS LETTERS, 1994, 30 (11) : 838 - 840
  • [27] Text-independent speaker recognition using non-linear frame likelihood transformation
    Markov, KP
    Nakagawa, S
    SPEECH COMMUNICATION, 1998, 24 (03) : 193 - 209
  • [28] An integrated system for text-independent speaker recognition using binary neural network classifiers
    Hou, FL
    Wang, BX
    2000 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I-III, 2000, : 710 - 713
  • [29] Text-Independent Speaker Recognition for Ambient Intelligence Applications by Using Information Set Features
    Anand, Abhinav
    Labati, Ruggero Donida
    Hanmandlu, Madasu
    Piuri, Vincenzo
    Scotti, Fabio
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND VIRTUAL ENVIRONMENTS FOR MEASUREMENT SYSTEMS AND APPLICATIONS (CIVEMSA), 2017, : 30 - 35
  • [30] A case study for the application of text-independent forensic speaker recognition using Bayesian interpretation
    Isik, Yusuf Ziya
    Kanak, Alper
    Bicil, Yuecel
    Dogan, Mehmet Ugur
    2007 IEEE 15TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS, VOLS 1-3, 2007, : 794 - 797