Speaker recognition using pyramid match kernel based support vector machines

Authors
A. D. Dileep
C. Chandra Sekhar
Affiliation
[1] Indian Institute of Technology Madras, Department of Computer Science and Engineering
Keywords
Speaker identification; Speaker verification; Kernel methods; Support vector machines; Dynamic kernel; Pyramid match kernel
DOI
10.1007/s10772-012-9154-4
Abstract
Gaussian mixture model (GMM) based approaches are commonly used for speaker recognition tasks. Methods for estimating the parameters of GMMs include the expectation-maximization method, which is a non-discriminative learning method. Discriminative classifier based approaches to speaker recognition include support vector machine (SVM) classifiers using dynamic kernels such as the generalized linear discriminant sequence kernel, the probabilistic sequence kernel, the GMM supervector kernel, the GMM-UBM mean interval (GUMI) kernel and the intermediate matching kernel. Recently, the pyramid match kernel (PMK), which uses grids in the feature space as histogram bins, and the vocabulary-guided PMK (VGPMK), which uses clusters in the feature space as histogram bins, have been proposed for recognizing objects in an image represented as a set of local feature vectors. In PMK, a set of feature vectors is mapped onto a multi-resolution histogram pyramid. The kernel between a pair of examples is computed by comparing their pyramids using a weighted histogram intersection function at each level of the pyramid. We propose to use the PMK-based SVM classifier for speaker identification and verification from the speech signal of an utterance represented as a set of local feature vectors. The main issue in building the PMK-based SVM classifier is the construction of a pyramid of histograms. We first propose the codebook-based PMK (CBPMK), which forms hard clusters using the k-means clustering method, with an increasing number of clusters at successive levels of the pyramid. We then propose the GMM-based PMK (GMMPMK), which uses soft clustering. We compare the performance of the GMM-based approaches with that of the PMK and other dynamic kernel SVM-based approaches to speaker identification and verification. The 2002 and 2003 NIST speaker recognition corpora are used to evaluate the different approaches.
Results of our studies show that the dynamic kernel SVM-based approaches perform significantly better than the state-of-the-art GMM-based approaches. For the speaker recognition task, the GMMPMK-based SVM performs better than SVMs using many other dynamic kernels and comparably to SVMs using the state-of-the-art GUMI kernel. The storage requirements of the GMMPMK-based SVMs are lower than those of SVMs using any other dynamic kernel.
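The pyramid matching described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the per-level codebooks are already trained (for example by k-means) and are passed in ordered from coarse to fine, and it assumes the halving weights of the original grid-based PMK (1 at the finest level, then 1/2, 1/4, ...); the record does not state the weights used for CBPMK. All function names and the toy data are hypothetical. If the codebooks are not nested, the new-match count at a level can be negative, which this sketch does not guard against.

```python
import math

def nearest(v, centroids):
    """Index of the centroid closest to v (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(v, centroids[k])))

def histogram(vectors, centroids):
    """Hard-assignment histogram: number of vectors falling in each cell."""
    h = [0] * len(centroids)
    for v in vectors:
        h[nearest(v, centroids)] += 1
    return h

def intersection(h1, h2):
    """Histogram intersection: number of matched vectors at one level."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def pyramid_match(x, y, codebooks):
    """Weighted sum of new matches per level; codebooks are coarse to fine."""
    matches = [intersection(histogram(x, c), histogram(y, c)) for c in codebooks]
    score, already_matched = 0.0, 0
    # Walk from the finest level to the coarsest, counting only new matches.
    for depth, m in enumerate(reversed(matches)):
        score += (m - already_matched) / (2 ** depth)  # assumed weight 1/2^depth
        already_matched = m
    return score

def normalized_pyramid_match(x, y, codebooks):
    """Normalize so that an example matched with itself scores 1."""
    kxx = pyramid_match(x, x, codebooks)
    kyy = pyramid_match(y, y, codebooks)
    return pyramid_match(x, y, codebooks) / math.sqrt(kxx * kyy)

# Toy 1-D data: three pyramid levels with 1, 2 and 4 cells (hypothetical values).
codebooks = [
    [[0.0]],
    [[-1.0], [1.0]],
    [[-1.5], [-0.5], [0.5], [1.5]],
]
x = [[-1.4], [0.6], [1.6]]
y = [[-1.6], [0.4], [-0.4]]

print(pyramid_match(x, y, codebooks))
print(normalized_pyramid_match(x, y, codebooks))
```

A Gram matrix built from such a kernel over all pairs of utterances could then be given to an SVM trainer that accepts precomputed kernels. The soft-clustering variant (GMMPMK) would replace the hard counts in `histogram` with accumulated GMM component responsibilities.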
Pages: 365-379
Number of pages: 14
Published in
Dileep, A. D.; Sekhar, C. Chandra. Speaker recognition using pyramid match kernel based support vector machines. International Journal of Speech Technology, 2012, 15(3): 365-379.