Relevance factor of maximum a posteriori adaptation for GMM-NAP-SVM in speaker and language recognition

Cited by: 4
Authors
You, Chang Huai [1]
Li, Haizhou [1]
Lee, Kong Aik [1]
Affiliations
[1] ASTAR, Inst Infocomm Res, Singapore, Singapore
Source
COMPUTER SPEECH AND LANGUAGE | 2015 / Vol. 30 / Issue 01
Keywords
Maximum a posteriori; Supervector; Gaussian mixture model; Support vector machine; DISTANCE; KERNEL;
DOI
10.1016/j.csl.2014.09.003
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper studies the relevance factor in maximum a posteriori (MAP) adaptation of the Gaussian mixture model (GMM) for speaker and language recognition. Because the relevance factor determines how much the observed training data influence the model adaptation, and thus the resulting GMM, we expect that more effective modeling can be achieved if the relevance factor adapts to the corresponding data. We therefore provide a mathematical derivation for the estimation of the relevance factor. The GMM supervector support vector machine (SVM) with nuisance attribute projection (NAP), GMM-NAP-SVM, has been reported to be effective and reliable for speaker and language recognition. Being a discriminative classifier in nature, a GMM-NAP-SVM system is sensitive to the magnitude and direction of a supervector in the high-dimensional space. However, when characterizing a speech utterance with a GMM supervector estimated through MAP, we observe that the resulting supervector is undesirably affected by the varying duration of the utterance. We propose an adaptive relevance factor that adapts to the duration to mitigate the variability due to utterance length. We give a systematic investigation of different types of MAP relevance factor on different application platforms. We show the efficacy of the data-dependent and adaptive relevance factors on the National Institute of Standards and Technology (NIST) speaker recognition evaluation (SRE) 2008 and language recognition evaluation (LRE) 2009 and 2011 tasks, respectively. (C) 2014 Elsevier Ltd. All rights reserved.
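The duration effect described in the abstract can be illustrated with a minimal sketch of the classical relevance-MAP mean update with a fixed relevance factor r, where each Gaussian's adapted mean is alpha * E + (1 - alpha) * mu_ubm and alpha = n / (n + r). This is not the adaptive estimator derived in the paper, which this record does not give. The function names, the toy UBM means, the per-Gaussian counts and the value r = 16 are all assumptions chosen for illustration, and the usual weight and variance normalization of the supervector is omitted.

```python
from typing import List, Sequence


def map_adapt_means(
    ubm_means: Sequence[Sequence[float]],
    counts: Sequence[float],
    data_means: Sequence[Sequence[float]],
    relevance_factor: float,
) -> List[List[float]]:
    """Classical relevance-MAP mean update, one row per Gaussian.

    counts[i] is the soft frame count n_i assigned to Gaussian i and
    data_means[i] is the posterior-weighted mean of those frames.
    """
    adapted = []
    for mu, n, ex in zip(ubm_means, counts, data_means):
        alpha = n / (n + relevance_factor)  # data weight, between 0 and 1
        adapted.append([alpha * e + (1.0 - alpha) * m for e, m in zip(ex, mu)])
    return adapted


def supervector(means: Sequence[Sequence[float]]) -> List[float]:
    """Stack the adapted means into one vector (normalization omitted)."""
    return [v for row in means for v in row]


# Hypothetical toy data: identical per-Gaussian data means, but the long
# utterance contributes ten times as many frames as the short one.
ubm = [[0.0, 0.0], [4.0, 4.0]]
observed = [[1.0, 1.0], [5.0, 3.0]]
r = 16.0  # assumed fixed relevance factor

short_sv = supervector(map_adapt_means(ubm, [10.0, 10.0], observed, r))
long_sv = supervector(map_adapt_means(ubm, [100.0, 100.0], observed, r))

print("short utterance:", short_sv)
print("long utterance: ", long_sv)
```

With a fixed r, the long utterance's supervector moves much closer to the observed means than the short one's, even though the underlying statistics are identical. This is the kind of duration-induced variability that a duration-aware relevance factor is meant to reduce.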
Pages: 116 - 134
Number of pages: 19
Related papers
50 in total
  • [1] Effect of Relevance Factor of Maximum a posteriori Adaptation for GMM-SVM in Speaker and Language Recognition
    You, Chang Huai
    Li, Haizhou
    Ma, Bin
    Lee, Kong Aik
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 2063 - 2066
  • [2] Study on the Relevance Factor of Maximum a Posteriori with GMM for Language Recognition
    You, Chang Huai
    Li, Haizhou
    Lee, Kong Aik
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2904 - 2907
  • [3] A Hybrid Modeling Strategy for GMM-SVM Speaker Recognition with Adaptive Relevance Factor
    You, Chang Huai
    Li, Haizhou
    Lee, Kong Aik
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2754 - 2757
  • [4] Maximum likelihood and maximum a posteriori adaptation for distributed speaker recognition systems
    Sit, CH
    Mak, MW
    Kung, SY
    [J]. BIOMETRIC AUTHENTICATION, PROCEEDINGS, 2004, 3072 : 640 - 647
  • [5] SVM based Speaker Recognition Using Maximum A Posteriori Linear Regression
    Zhang, Xiang
    Zhao, Qingwei
    Yan, Yonghong
    [J]. ICECT: 2009 INTERNATIONAL CONFERENCE ON ELECTRONIC COMPUTER TECHNOLOGY, PROCEEDINGS, 2009, : 438 - +
  • [6] Scatter Difference NAP for SVM Speaker Recognition
    Baker, Brendan
    Vogt, Robbie
    McLaren, Mitchell
    Sridharan, Sridha
    [J]. ADVANCES IN BIOMETRICS, 2009, 5558 : 464 - 473
  • [7] A STUDY ON GMM-SVM WITH ADAPTIVE RELEVANCE FACTOR AND ITS COMPARISON WITH I-VECTOR AND JFA FOR SPEAKER RECOGNITION
    You, Chang Huai
    Li, Haizhou
    Ma, Bin
    Lee, Kong Aik
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7683 - 7687
  • [8] A Method to Integrate GMM, SVM and DTW for Speaker Recognition
    Ding, Ing-Jr
    Yen, Chih-Ta
    Ou, Da-Cheng
    [J]. INTERNATIONAL JOURNAL OF ENGINEERING AND TECHNOLOGY INNOVATION, 2014, 4 (01) : 38 - 47
  • [9] SVM based speaker selection using GMM supervector for rapid speaker adaptation
    Wang, Jian
    Lei, Jianjun
    Guo, Jun
    Yang, Zhen
    [J]. SIMULATED EVOLUTION AND LEARNING, PROCEEDINGS, 2006, 4247 : 617 - 624
  • [10] Maximum a posteriori adaptation of the centroid model for speaker verification
    Hautamaki, Ville
    Kinnunen, Tomi
    Karkkainen, Ismo
    Saastamoinen, Juhani
    Tuononen, Marko
    Franti, Pasi
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2008, 15 : 162 - 165