Comparison of the impact of some Minkowski metrics on VQ/GMM based speaker recognition

Cited by: 8
Authors
Hanilci, Cemal [1]
Ertas, Figen [1]
Affiliations
[1] Uludag Univ, Dept Elect Engn, Bursa, Turkey
Keywords
IDENTIFICATION; ALGORITHM;
DOI
10.1016/j.compeleceng.2010.08.001
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
This paper evaluates the impact of three special forms of the Minkowski metric (Euclidean, City Block, and Chebychev distances) on the performance of conventional vector quantization (VQ) and Gaussian mixture model (GMM) based closed-set, text-independent speaker recognition systems, in terms of recognition rate and confidence in decisions. For the VQ based system, evaluations are carried out using the two most common clustering algorithms, LBG and K-means, and the paper identifies which pairing of clustering algorithm and distance exploits the best attributes of both to achieve the best recognition rate for a given codebook size. For the GMM based system, we introduce the metrics into the GMM by using a concatenation of the LBG and K-means algorithms to estimate the initial mean vectors, to which the system performance is sensitive, and explore their impact on system performance. We also compare the results obtained on clean speech (TIMIT) and telephone speech (NTIMIT and NIST2001) databases with those of the modern classifiers VQ-UBM and GMM-UBM. It is found that there are cases where the conventional VQ based system outperforms the modern systems. Moreover, the impact of distance metrics on the performance of the conventional and modern systems depends on the recognition task imposed (verification/identification). Crown Copyright (C) 2010 Published by Elsevier Ltd. All rights reserved.
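As a rough illustration of the three metrics named in the abstract, the sketch below computes the Minkowski distance of order p (p = 1 for City Block, p = 2 for Euclidean, p = infinity for Chebychev) and uses it to score a test utterance against per-speaker VQ codebooks by average nearest-codeword distortion. The function names, the toy codebooks, and the scoring rule are assumptions made for illustration; this is not the authors' implementation, and codebook training (LBG or K-means) is not shown.

```python
import math


def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length vectors.

    p=1 is City Block, p=2 is Euclidean, p=math.inf is Chebychev.
    """
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == math.inf:
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def average_distortion(frames, codebook, p):
    """Mean distance from each feature frame to its nearest codeword."""
    total = 0.0
    for frame in frames:
        total += min(minkowski(frame, codeword, p) for codeword in codebook)
    return total / len(frames)


def identify(frames, codebooks, p):
    """Closed-set identification: pick the speaker with the lowest distortion."""
    return min(
        codebooks,
        key=lambda speaker: average_distortion(frames, codebooks[speaker], p),
    )


# Hypothetical two-dimensional toy data (real systems use features such as MFCCs).
toy_codebooks = {
    "speaker_a": [[0.0, 0.0], [1.0, 1.0]],
    "speaker_b": [[5.0, 5.0], [6.0, 6.0]],
}
toy_frames = [[0.1, 0.2], [0.9, 1.2]]

for order in (1, 2, math.inf):
    print(order, identify(toy_frames, toy_codebooks, order))
```

On this toy data all three metrics select the same speaker; the abstract's point is that on real speech the choice of metric, together with the clustering algorithm and codebook size, can change the recognition rate.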
Pages: 41-56
Number of pages: 16
Related papers
50 in total
  • [2] Speaker recognition using MFCC and hybrid model of VQ and GMM
    Desai, Dhruv
    Joshi, Maulin
    1600, Springer Verlag (235): : 53 - 63
  • [3] A speaker recognition system based on VQ
    Zhao Yanling
    Zheng Xiaoshi
    Gao Huixian
    Li Na
    ICIEA 2008: 3RD IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS, PROCEEDINGS, VOLS 1-3, 2008, : 1988 - 1990
  • [4] Speaker Cluster based GMM Tokenization for Speaker Recognition
    Ma, Bin
    Zhu, Donglai
    Tong, Rong
    Li, Haizhou
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 505 - 508
  • [5] Speaker Recognition and Speech Emotion Recognition Based on GMM
    Xu, Shupeng
    Liu, Yan
    Liu, Xiping
    PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON ELECTRIC AND ELECTRONICS, 2013, : 434 - 436
  • [6] Secondary classification for GMM based speaker recognition
    Pelecanos, Jason
    Povey, Dan
    Ramaswamy, Ganesh
    2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 109 - 112
  • [7] Speaker recognition based on the combination of GMM and SVDD
    Zhou, Yuhuan
    Zhang, Xiongwei
    Wang, Jinming
    Gong, Yong
    Zhou, Yi
    PRZEGLAD ELEKTROTECHNICZNY, 2011, 87 (03): : 329 - 332
  • [8] Robust Speaker Identification Based On Hybrid Model of VQ and GMM-UBM
    Nguyen, Vu X.
    Nguyen, Vu P. H.
    Pham, Tuan V.
    2015 INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR COMMUNICATIONS (ATC), 2015, : 490 - 495
  • [9] Speaker Recognition Based on GMM with an Embedded TDNN
    Chen, Cunbao
    Zhao, Li
    NEURAL INFORMATION PROCESSING, PT 2, PROCEEDINGS, 2009, 5864 : 746 - 753
  • [10] Closed-set speaker identification using VQ and GMM based models
    Bidhan Barai
    Tapas Chakraborty
    Nibaran Das
    Subhadip Basu
    Mita Nasipuri
    International Journal of Speech Technology, 2022, 25 : 173 - 196