Relevance factor of maximum a posteriori adaptation for GMM-NAP-SVM in speaker and language recognition

Cited by: 4
Authors
You, Chang Huai [1]
Li, Haizhou [1]
Lee, Kong Aik [1]
Affiliations
[1] ASTAR, Inst Infocomm Res, Singapore, Singapore
Source
COMPUTER SPEECH AND LANGUAGE | 2015 / Vol. 30 / Issue 01
Keywords
Maximum a posteriori; Supervector; Gaussian mixture model; Support vector machine; DISTANCE; KERNEL;
DOI
10.1016/j.csl.2014.09.003
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper studies the relevance factor in maximum a posteriori (MAP) adaptation of the Gaussian mixture model (GMM) for speaker and language recognition. Because the relevance factor determines how much the observed training data influence the model adaptation, and thus the resulting GMM, more effective modeling should be achievable if the relevance factor adapts to the corresponding data. We therefore provide a mathematical derivation for the estimation of the relevance factor. The GMM supervector support vector machine (SVM) with nuisance attribute projection (NAP), denoted GMM-NAP-SVM, has been reported to be effective and reliable for speaker and language recognition. Being a discriminative classifier in nature, a GMM-NAP-SVM system is sensitive to the magnitude and direction of a supervector in the high-dimensional space. However, when characterizing a speech utterance with a GMM supervector estimated through MAP, we observe that the resulting supervector is undesirably affected by the varying duration of the utterance. We propose an adaptive relevance factor that adapts to the duration to mitigate the variability due to utterance length. We give a systematic investigation of different types of MAP relevance factor on different application platforms. We show the efficacy of the data-dependent and adaptive relevance factors on the National Institute of Standards and Technology (NIST) speaker recognition evaluation (SRE) 2008 and language recognition evaluation (LRE) 2009 and 2011 tasks, respectively. (C) 2014 Elsevier Ltd. All rights reserved.
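To make the role of the relevance factor concrete, the following is a minimal sketch of conventional mean-only MAP adaptation of a diagonal-covariance GMM, in which each component mean is interpolated between the data mean and the prior (UBM) mean with weight n_i / (n_i + r). It is not the estimator derived in the paper. The function names, the toy one-dimensional UBM, and the default r = 16 are all assumptions made for illustration, and the supervector here is a plain, unnormalized stack of the adapted means.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation; `relevance` must be positive.

    Returns (adapted_means, soft_counts).
    """
    n_comp, dim = len(weights), len(means[0])
    counts = [0.0] * n_comp                      # zeroth-order statistics n_i
    first = [[0.0] * dim for _ in range(n_comp)]  # first-order statistics
    for x in frames:
        log_p = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        peak = max(log_p)                        # subtract max for stability
        post = [math.exp(lp - peak) for lp in log_p]
        total = sum(post)
        for i in range(n_comp):
            gamma = post[i] / total              # component posterior
            counts[i] += gamma
            for d in range(dim):
                first[i][d] += gamma * x[d]
    adapted = []
    for i in range(n_comp):
        alpha = counts[i] / (counts[i] + relevance)
        if counts[i] > 0.0:
            data_mean = [s / counts[i] for s in first[i]]
        else:
            data_mean = list(means[i])           # no data: keep the prior mean
        adapted.append(
            [alpha * e + (1.0 - alpha) * m for e, m in zip(data_mean, means[i])]
        )
    return adapted, counts


def supervector(adapted_means):
    """Stack the adapted means into one vector (no kernel normalization)."""
    return [value for mean in adapted_means for value in mean]


# Hypothetical toy UBM: two one-dimensional components.
UBM_WEIGHTS = [0.5, 0.5]
UBM_MEANS = [[-2.0], [2.0]]
UBM_VARS = [[1.0], [1.0]]

# Same content, different durations, fixed relevance factor.
short_sv = supervector(
    map_adapt_means([[1.0]] * 10, UBM_WEIGHTS, UBM_MEANS, UBM_VARS)[0]
)
long_sv = supervector(
    map_adapt_means([[1.0]] * 1000, UBM_WEIGHTS, UBM_MEANS, UBM_VARS)[0]
)
```

With a fixed relevance factor, the long utterance pulls the means much closer to the data than the short one, so `short_sv` and `long_sv` differ even though the frames are identical in content. Scaling `relevance` in proportion to the number of frames removes that difference in this toy case; this proportional scaling is only an illustrative assumption and should not be read as the adaptive relevance factor proposed in the paper.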
Pages: 116 - 134
Number of pages: 19