Comparison of the impact of some Minkowski metrics on VQ/GMM based speaker recognition

Cited by: 8

Authors:

Hanilci, Cemal [1]

Ertas, Figen [1]

Affiliations:

[1] Uludag Univ, Dept Elect Engn, Bursa, Turkey

Source:

COMPUTERS & ELECTRICAL ENGINEERING | 2011, Vol. 37, No. 01

Keywords:

IDENTIFICATION; ALGORITHM;

DOI:

10.1016/j.compeleceng.2010.08.001

Chinese Library Classification:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

This paper evaluates the impact of three special forms of the Minkowski metric (Euclidean, City Block, and Chebychev distances) on the performance of the conventional vector quantization (VQ) and Gaussian mixture model (GMM) based closed-set text-independent speaker recognition systems, in terms of recognition rate and confidence on decisions. For the VQ based system, evaluations are carried out using the two most common clustering algorithms, LBG and K-means, and it is revealed which clustering algorithm and distance pair should be used to exploit the best attribute of both to achieve the best recognition rate for a given codebook size. In the case of GMM based system, we introduce the metrics into the GMM using a concatenation of the LBG and K-means algorithms in estimating the initial mean vectors, to which the system performance is sensitive, and explore their impact on system performance. We also make comparison of results obtained from evaluations on clean speech (TIMIT) and telephone speech databases (NTIMIT and NIST2001) with the modern classifiers VQ-UBM and GMM-UBM. It is found that there are cases where conventional VQ based system outperforms the modern systems. Moreover, the impact of distance metrics on the performance of the conventional and modern systems depends on the recognition task imposed (verification/identification). Crown Copyright (C) 2010 Published by Elsevier Ltd. All rights reserved.
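The three distances named in the abstract are the Minkowski metric of order p = 1 (City Block), p = 2 (Euclidean), and p → ∞ (Chebychev). The following is a minimal sketch, not the authors' implementation, of how such a metric could be plugged into VQ-based closed-set identification by scoring a test utterance against each speaker's codebook. The function names (`minkowski`, `avg_distortion`, `identify`) and the toy two-dimensional codebooks are hypothetical and chosen only for illustration; real systems would use trained codebooks over spectral feature vectors.

```python
import math

def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length vectors.

    p=1 is City Block, p=2 is Euclidean, p=math.inf is Chebychev.
    """
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

def avg_distortion(frames, codebook, p):
    """Average distance from each frame to its nearest codeword."""
    total = sum(min(minkowski(f, c, p) for c in codebook) for f in frames)
    return total / len(frames)

def identify(frames, codebooks, p):
    """Return the speaker whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda s: avg_distortion(frames, codebooks[s], p))

# Toy data (assumed for illustration only): two speakers, two codewords each.
codebooks = {
    "A": [(0.0, 0.0), (1.0, 1.0)],
    "B": [(5.0, 5.0), (6.0, 6.0)],
}
test_frames = [(0.1, 0.2), (0.9, 1.1)]

for p in (1, 2, math.inf):
    print(p, identify(test_frames, codebooks, p))
```

On this toy data all three metrics select the same speaker; the abstract's point is that on real speech the choice of metric, clustering algorithm, and codebook size can change the recognition rate.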

Pages: 41 - 56

Page count: 16

50 related records in total

[31] Gender-based speaker recognition from speech signals using GMM model
Gupta, Manish
Bhartit, Shambhu Shankar
Agarwal, Suneeta
MODERN PHYSICS LETTERS B, 2019, 33 (35)
[32] COMPARISON OF I-VECTOR AND GMM-UBM SPEAKER RECOGNITION ON A ROMANIAN LARGE SPEECH CORPUS
Georgescu, Alexandru-Lucian
Cucu, Horia
Burileanu, Corneliu
PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE LINGUISTIC RESOURCES AND TOOLS FOR PROCESSING THE ROMANIAN LANGUAGE, 2018 : 25 - 32
[33] Robust distant speaker recognition based on position-dependent CMN by combining speaker-specific GMM with speaker-adapted HMM
Wang, Longbiao
Kitaoka, Norihide
Nakagawa, Selichi
SPEECH COMMUNICATION, 2007, 49 (06) : 501 - 513
[34] Gammachirp Filter Banks Applied in Roust Speaker Recognition Based on GMM-UBM Classifier
Deng, Lei
Gao, Yong
INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2020, 17 (02) : 170 - 177
[35] Design of an Automatic Speaker Recognition System Based on Adapted MFCC and GMM Methods for Arabic Speech
Tazi, El Bachir
Benabbou, Abderrahim
Harti, Mostafa
INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2010, 10 (01) : 45 - 50
[36] Learning Polynomial Function Based Neutral-Emotion GMM Transformation for Emotional Speaker Recognition
Shan, Zhenyu
Yang, Yingchun
19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, 2008 : 1822 - 1825
[37] Text-independent/text-prompted speaker recognition by combining speaker-specific GMM with speaker adapted syllable-based HMM
Nakagawa, S
Zhang, W
Takahashi, M
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2006, E89D (03) : 1058 - 1065
[38] Comparison of Text-Independent Speaker Recognition Methods Using VQ-Distortion and Discrete/Continuous HMM's
Matsui, Tomoko
Furui, Sadaoki
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1994, 2 (03) : 456 - 459
[39] Noise Robust Speaker Recognition Based on Adaptive Frame Weighting in GMM for i-Vector Extraction
Zhang, Xingyu
Zou, Xia
Sun, Meng
Zheng, Thomas Fang
Jia, Chong
Wang, Yimin
IEEE ACCESS, 2019, 7 : 27874 - 27882
[40] Comparison of Speaker Adaptation Methods as Feature Extraction for SVM-Based Speaker Recognition
Ferras, Marc
Leung, Cheung-Chi
Barras, Claude
Gauvain, Jean-Luc
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (06) : 1366 - 1378
