DEEP NEURAL NETWORK DRIVEN MIXTURE OF PLDA FOR ROBUST I-VECTOR SPEAKER VERIFICATION

Citations: 0
Authors
Li, Na [1]
Mak, Man-Wai [1]
Chien, Jen-Tzung [2]
Affiliations
[1] Hong Kong Polytech Univ, Dept Elect & Informat Engn, Hong Kong, Hong Kong, Peoples R China
[2] Natl Chiao Tung Univ, Dept Elect & Comp Engn, Hsinchu, Taiwan
Keywords
Speaker verification; i-vector; mixture of PLDA; deep neural networks; SNR mismatch
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In speaker recognition, the mismatch between the enrollment and test utterances due to noise with different signal-to-noise ratios (SNRs) is a great challenge. Based on the observation that noise-level variability causes the i-vectors to form heterogeneous clusters, this paper proposes using an SNR-aware deep neural network (DNN) to guide the training of PLDA mixture models. Specifically, given an i-vector, the SNR posterior probabilities produced by the DNN are used as the posteriors of indicator variables of the mixture model. As a result, the proposed model provides a more reasonable soft division of the i-vector space compared to the conventional mixture of PLDA. During verification, given a test trial, the marginal likelihoods from individual PLDA models are linearly combined by the posterior probabilities of SNR levels computed by the DNN. Experimental results for SNR mismatch tasks based on NIST 2012 SRE suggest that the proposed model is more effective than PLDA and conventional mixture of PLDA for handling heterogeneous corpora.
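The linear combination step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each PLDA component k has already produced a log marginal likelihood for the trial and that the SNR-aware DNN has produced a posterior for each SNR level; the function name `combine_log_likelihoods` and the numeric values are hypothetical and are not taken from the paper. How the combined likelihoods are turned into a final verification score is not specified in the abstract and is not shown here.

```python
import numpy as np


def combine_log_likelihoods(log_liks, posteriors):
    """Return log(sum_k posteriors[k] * exp(log_liks[k])), computed stably.

    log_liks   : per-component log marginal likelihoods (hypothetical inputs)
    posteriors : SNR-level posteriors from the DNN, assumed to sum to 1
    """
    log_liks = np.asarray(log_liks, dtype=float)
    posteriors = np.asarray(posteriors, dtype=float)
    if log_liks.shape != posteriors.shape:
        raise ValueError("log_liks and posteriors must have the same shape")
    if not np.isclose(posteriors.sum(), 1.0):
        raise ValueError("posteriors must sum to 1")

    # Drop zero-weight components so that log(0) is never evaluated.
    mask = posteriors > 0
    terms = log_liks[mask] + np.log(posteriors[mask])

    # Log-sum-exp: subtract the largest term before exponentiating.
    peak = terms.max()
    return float(peak + np.log(np.exp(terms - peak).sum()))


# Illustrative values only: three SNR-dependent PLDA components.
example_log_liks = [-120.5, -118.2, -125.0]
example_posteriors = [0.1, 0.7, 0.2]
combined = combine_log_likelihoods(example_log_liks, example_posteriors)
```

Working in the log domain is a design choice for numerical stability, since marginal likelihoods of high-dimensional i-vectors are typically far too small to exponentiate directly.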
Pages: 186-191
Page count: 6
Related papers
50 in total
  • [1] DEEP NEURAL NETWORK BASED DISCRIMINATIVE TRAINING FOR I-VECTOR/PLDA SPEAKER VERIFICATION
    Zheng Tieran
    Han Jiqing
    Zheng Guibin
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5354 - 5358
  • [2] Fast Scoring for Mixture of PLDA in I-Vector/PLDA Speaker Verification
    Mak, Man-Wai
    [J]. 2015 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2015, : 587 - 593
  • [3] PLDA Modeling in I-Vector and Supervector Space for Speaker Verification
    Jiang, Ye
    Lee, Kong Aik
    Tang, Zhenmin
    Ma, Bin
    Larcher, Anthony
    Li, Haizhou
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1678 - 1681
  • [4] Non-linear PLDA for i-Vector Speaker Verification
    Novoselov, Sergey
    Pekhovsky, Timur
    Kudashev, Oleg
    Mendelev, Valentin
    Prudnikov, Alexey
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 214 - 218
  • [5] Nonparametrically trained PLDA for short duration i-vector speaker verification
    Khosravani, Abbas
    Homayounpour, Mohammad M.
    [J]. COMPUTER SPEECH AND LANGUAGE, 2018, 52 : 105 - 122
  • [6] DNN-Driven Mixture of PLDA for Robust Speaker Verification
    Li, Na
    Mak, Man-Wai
    Chien, Jen-Tzung
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (06) : 1371 - 1383
  • [7] NORMALIZATION OF TOTAL VARIABILITY MATRIX FOR I-VECTOR/PLDA SPEAKER VERIFICATION
    Rao, Wei
    Mak, Man-Wai
    Lee, Kong-Aik
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4180 - 4184
  • [8] I-VECTOR KULLBACK-LEIBLER DIVISIVE NORMALIZATION FOR PLDA SPEAKER VERIFICATION
    Pan, Yilin
    Zheng, Tieran
    Chen, Chen
    [J]. 2017 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2017), 2017, : 56 - 60
  • [9] Deep neural network based i-vector mapping for speaker verification using short utterances
    Guo, Jinxi
    Xu, Ning
    Qian, Kailun
    Shi, Yang
    Xu, Kaiyuan
    Wu, Yingnian
    Alwan, Abeer
    [J]. SPEECH COMMUNICATION, 2018, 105 : 92 - 102
  • [10] Mixture of PLDA Models in I-Vector Space for Gender-Independent Speaker Recognition
    Senoussaoui, Mohammed
    Kenny, Patrick
    Bruemmer, Niko
    de Villiers, Edward
    Dumouchel, Pierre
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 32 - +