Comparison of the impact of some Minkowski metrics on VQ/GMM based speaker recognition

Cited by: 8
Authors
Hanilci, Cemal [1 ]
Ertas, Figen [1 ]
Affiliations
[1] Uludag Univ, Dept Elect Engn, Bursa, Turkey
Keywords
IDENTIFICATION; ALGORITHM;
DOI
10.1016/j.compeleceng.2010.08.001
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
This paper evaluates the impact of three special forms of the Minkowski metric (Euclidean, City Block, and Chebyshev distances) on the performance of conventional vector quantization (VQ) and Gaussian mixture model (GMM) based closed-set text-independent speaker recognition systems, in terms of recognition rate and confidence in decisions. For the VQ based system, evaluations are carried out using the two most common clustering algorithms, LBG and K-means, and the results show which pairing of clustering algorithm and distance achieves the best recognition rate for a given codebook size. For the GMM based system, we introduce the metrics into the GMM by using a concatenation of the LBG and K-means algorithms to estimate the initial mean vectors, to which system performance is sensitive, and explore their impact on that performance. We also compare results obtained on clean speech (TIMIT) and telephone speech databases (NTIMIT and NIST2001) with those of the modern classifiers VQ-UBM and GMM-UBM. There are cases where the conventional VQ based system outperforms the modern systems. Moreover, the impact of the distance metrics on the performance of the conventional and modern systems depends on the recognition task imposed (verification/identification). Crown Copyright (C) 2010 Published by Elsevier Ltd. All rights reserved.
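The three distances named in the abstract are the p = 1 (City Block), p = 2 (Euclidean), and p → ∞ (Chebyshev) cases of the Minkowski metric. The sketch below illustrates how such a metric could plug into VQ based closed-set identification, assuming the usual rule of choosing the speaker whose codebook gives the lowest average quantization distortion. It is not the authors' implementation: the function names (`minkowski`, `average_distortion`, `identify`) and the toy codebooks are hypothetical, and codebook training (LBG or K-means) is omitted.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def minkowski(x: Vector, y: Vector, p: float) -> float:
    """Minkowski distance of order p between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        # Chebyshev distance: the limit of the Minkowski metric as p -> infinity
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def average_distortion(frames: List[Vector], codebook: List[Vector], p: float) -> float:
    """Mean distance from each feature frame to its nearest codeword."""
    total = sum(min(minkowski(f, c, p) for c in codebook) for f in frames)
    return total / len(frames)


def identify(frames: List[Vector], codebooks: Dict[str, List[Vector]], p: float) -> str:
    """Return the speaker whose codebook yields the lowest average distortion."""
    return min(codebooks, key=lambda spk: average_distortion(frames, codebooks[spk], p))


# Toy example (hypothetical data): two speakers with two codewords each.
codebooks = {
    "speaker_A": [[0.0, 0.0], [1.0, 1.0]],
    "speaker_B": [[5.0, 5.0], [6.0, 6.0]],
}
test_frames = [[0.1, 0.2], [0.9, 1.1]]

for name, p in [("City Block", 1), ("Euclidean", 2), ("Chebyshev", math.inf)]:
    print(name, identify(test_frames, codebooks, p))
```

Swapping the value of `p` changes only the distance used for nearest-codeword search, which is the degree of freedom the paper compares across clustering algorithms and codebook sizes.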
Pages: 41-56
Number of pages: 16
Related papers
50 in total
  • [31] Gender-based speaker recognition from speech signals using GMM model
    Gupta, Manish
    Bhartit, Shambhu Shankar
    Agarwal, Suneeta
    MODERN PHYSICS LETTERS B, 2019, 33 (35):
  • [32] COMPARISON OF I-VECTOR AND GMM-UBM SPEAKER RECOGNITION ON A ROMANIAN LARGE SPEECH CORPUS
    Georgescu, Alexandru-Lucian
    Cucu, Horia
    Burileanu, Corneliu
    PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE LINGUISTIC RESOURCES AND TOOLS FOR PROCESSING THE ROMANIAN LANGUAGE, 2018, : 25 - 32
  • [33] Robust distant speaker recognition based on position-dependent CMN by combining speaker-specific GMM with speaker-adapted HMM
    Wang, Longbiao
    Kitaoka, Norihide
    Nakagawa, Selichi
    SPEECH COMMUNICATION, 2007, 49 (06) : 501 - 513
  • [34] Gammachirp Filter Banks Applied in Roust Speaker Recognition Based on GMM-UBM Classifier
    Deng, Lei
    Gao, Yong
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2020, 17 (02) : 170 - 177
  • [35] Design of an Automatic Speaker Recognition System Based on Adapted MFCC and GMM Methods for Arabic Speech
    Tazi, El Bachir
    Benabbou, Abderrahim
    Harti, Mostafa
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2010, 10 (01): : 45 - 50
  • [36] Learning Polynomial Function Based Neutral-Emotion GMM Transformation for Emotional Speaker Recognition
    Shan, Zhenyu
    Yang, Yingchun
    19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, 2008, : 1822 - 1825
  • [37] Text-independent/text-prompted speaker recognition by combining speaker-specific GMM with speaker adapted syllable-based HMM
    Nakagawa, S
    Zhang, W
    Takahashi, M
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2006, E89D (03): : 1058 - 1065
  • [38] Comparison of Text-Independent Speaker Recognition Methods Using VQ-Distortion and Discrete/Continuous HMM's
    Matsui, Tomoko
    Furui, Sadaoki
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1994, 2 (03): : 456 - 459
  • [39] Noise Robust Speaker Recognition Based on Adaptive Frame Weighting in GMM for i-Vector Extraction
    Zhang, Xingyu
    Zou, Xia
    Sun, Meng
    Zheng, Thomas Fang
    Jia, Chong
    Wang, Yimin
    IEEE ACCESS, 2019, 7 : 27874 - 27882
  • [40] Comparison of Speaker Adaptation Methods as Feature Extraction for SVM-Based Speaker Recognition
    Ferras, Marc
    Leung, Cheung-Chi
    Barras, Claude
    Gauvain, Jean-Luc
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (06): : 1366 - 1378