An SVM Kernel With GMM-Supervector Based on the Bhattacharyya Distance for Speaker Recognition

Cited by: 60
Authors
You, Chang Huai [1]
Lee, Kong Aik [1]
Li, Haizhou [1]
Affiliations
[1] A*STAR (Agency for Science, Technology and Research), Institute for Infocomm Research (I2R), Singapore 138632, Singapore
Keywords
Gaussian mixture model; National Institute of Standards and Technology (NIST) evaluation; speaker recognition; supervector; support vector machine
DOI
10.1109/LSP.2008.2006711
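Chinese Library Classification (CLC) codes
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract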
The Gaussian mixture model (GMM) and the support vector machine (SVM) have become popular classifiers in text-independent speaker recognition. A GMM-supervector characterizes a speaker's voice with the parameters of a GMM, which include mean vectors, covariance matrices, and mixture weights. The GMM-supervector SVM benefits from both the GMM and SVM frameworks to achieve state-of-the-art performance. The conventional Kullback-Leibler (KL) kernel in the GMM-supervector SVM classifier limits the adaptation of the GMM to the mean values and leaves the covariances unchanged. In this letter, we introduce the GMM-UBM mean interval (GUMI) concept based on the Bhattacharyya distance. This leads to a new kernel for the SVM classifier. Compared with the KL kernel, the new kernel allows us to exploit information not only from the means but also from the covariances. We demonstrate the effectiveness of the new kernel on the 2006 National Institute of Standards and Technology (NIST) speaker recognition evaluation (SRE) dataset.
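For orientation, the Bhattacharyya distance between two Gaussians can be sketched as below. This is only the standard textbook formula for a pair of diagonal-covariance Gaussians, not the GUMI kernel from the letter. The function name `bhattacharyya_diag` and its arguments are hypothetical and chosen for illustration. The first term depends on the difference of the means and the second on the covariances, which is the kind of covariance information that a mean-only kernel does not use.

```python
import math


def bhattacharyya_diag(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians.

    mu1, mu2: sequences of per-dimension means.
    var1, var2: sequences of per-dimension variances (all positive).
    Hypothetical helper for illustration; not taken from the letter.
    """
    mean_term = 0.0  # accumulates (mu1 - mu2)^2 / averaged variance
    cov_term = 0.0   # accumulates log(averaged variance / geometric mean)
    for m1, v1, m2, v2 in zip(mu1, var1, mu2, var2):
        v_avg = 0.5 * (v1 + v2)
        mean_term += (m1 - m2) ** 2 / v_avg
        cov_term += math.log(v_avg / math.sqrt(v1 * v2))
    return 0.125 * mean_term + 0.5 * cov_term


# Same means, different variances: only the covariance term contributes.
d_cov_only = bhattacharyya_diag([0.0], [1.0], [0.0], [4.0])

# Same variances, different means: only the mean term contributes.
d_mean_only = bhattacharyya_diag([0.0], [1.0], [2.0], [1.0])
```

In this sketch the distance is nonzero even when the two means coincide, as long as the variances differ.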
Pages: 49-52
Page count: 4