Comparison of the impact of some Minkowski metrics on VQ/GMM based speaker recognition

Cited by: 8

Authors:

Hanilci, Cemal [1]

Ertas, Figen [1]

Affiliations:

[1] Uludag Univ, Dept Elect Engn, Bursa, Turkey

Source:

COMPUTERS & ELECTRICAL ENGINEERING | 2011 / Vol. 37 / Issue 01

Keywords:

IDENTIFICATION; ALGORITHM;

DOI:

10.1016/j.compeleceng.2010.08.001

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

This paper evaluates the impact of three special forms of the Minkowski metric (Euclidean, City Block, and Chebychev distances) on the performance of conventional vector quantization (VQ) and Gaussian mixture model (GMM) based closed-set text-independent speaker recognition systems, in terms of recognition rate and confidence in decisions. For the VQ based system, evaluations are carried out using the two most common clustering algorithms, LBG and K-means, and it is revealed which pairing of clustering algorithm and distance should be used to achieve the best recognition rate for a given codebook size. In the case of the GMM based system, we introduce the metrics into the GMM by using a concatenation of the LBG and K-means algorithms to estimate the initial mean vectors, to which the system performance is sensitive, and explore their impact on system performance. We also compare results obtained from evaluations on clean speech (TIMIT) and telephone speech databases (NTIMIT and NIST2001) with those of the modern classifiers VQ-UBM and GMM-UBM. It is found that there are cases where the conventional VQ based system outperforms the modern systems. Moreover, the impact of distance metrics on the performance of the conventional and modern systems depends on the recognition task imposed (verification/identification). Crown Copyright (C) 2010 Published by Elsevier Ltd. All rights reserved.
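
To make the role of the distance metric concrete, the following is a minimal sketch of how the three Minkowski special cases named in the abstract could enter a VQ based closed-set identification step. It is an illustration under stated assumptions, not the authors' implementation: the function names (`minkowski`, `nearest_codeword`, `average_distortion`, `identify`), the toy two-dimensional vectors, and the use of average quantization distortion as the score are all hypothetical choices made here, and codebook training (LBG or K-means) is not shown.

```python
import math


def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length vectors.

    p = 1 gives City Block, p = 2 gives Euclidean,
    p = math.inf gives Chebychev (largest coordinate difference).
    """
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def nearest_codeword(frame, codebook, p):
    """Index of the codeword closest to `frame` under the order-p metric."""
    return min(range(len(codebook)),
               key=lambda i: minkowski(frame, codebook[i], p))


def average_distortion(frames, codebook, p):
    """Mean distance from each frame to its nearest codeword."""
    total = 0.0
    for frame in frames:
        best = nearest_codeword(frame, codebook, p)
        total += minkowski(frame, codebook[best], p)
    return total / len(frames)


def identify(frames, codebooks, p):
    """Closed-set decision: the speaker whose codebook has least distortion.

    `codebooks` maps a speaker label to a list of codewords (assumed to
    have been trained beforehand, e.g. by LBG or K-means).
    """
    return min(codebooks,
               key=lambda spk: average_distortion(frames, codebooks[spk], p))


# Toy example: the metric alone can change which codeword is "nearest".
frame = [0.0, 0.0]
codebook = [[3.0, 3.0], [0.0, 4.0]]
for name, p in [("City Block", 1), ("Euclidean", 2), ("Chebychev", math.inf)]:
    print(name, "->", nearest_codeword(frame, codebook, p))
```

In this toy case the City Block and Euclidean distances select the second codeword, while the Chebychev distance selects the first, which is the kind of metric-dependent assignment whose effect on recognition rate the paper sets out to measure.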


Pages: 41-56

Page count: 16
