Nonlinear I-Vector Transformations for PLDA-Based Speaker Recognition

Cited by: 18
Authors
Cumani, Sandro [1]
Laface, Pietro [1]
Affiliations
[1] Politecnico di Torino, Dipartimento di Automatica e Informatica, I-10143 Turin, Italy
Keywords
Density function transformation; i-vectors; probabilistic linear discriminant analysis; speaker recognition
DOI
10.1109/TASLP.2017.2674966
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper proposes estimating parametric nonlinear transformations of i-vectors for speaker recognition systems based on probabilistic linear discriminant analysis (PLDA) classification. The Gaussian PLDA model assumes that the i-vectors are distributed according to the standard normal distribution. However, it has been shown that i-vectors are better modeled by, for example, heavy-tailed distributions, and that classification performance improves significantly when the i-vectors are whitened and length-normalized. In this paper, we propose to transform the i-vectors so that their distribution becomes better suited to discriminating speakers with the PLDA model. This is done by means of a sequence of affine and nonlinear transformations whose parameters are obtained by maximum likelihood estimation on the development set. Another contribution of this paper is a reduction of the mismatch between the development and evaluation i-vector length distributions by means of a scaling factor tuned for the estimated i-vector distribution, rather than by blind length normalization. Relative improvements of between 7% and 14% in the detection cost function were obtained with the proposed technique on the NIST SRE-2010 and SRE-2012 evaluation datasets, using both traditional GMM/UBM and hybrid DNN/GMM-based systems.
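The whitening and length normalization that the abstract cites as the established preprocessing step can be sketched as follows. This is a minimal pure-Python illustration of that baseline only, not the nonlinear transformation proposed in the paper. The function names and the toy 3-dimensional vectors standing in for i-vectors are assumptions made for illustration.

```python
import math

def mean_vector(X):
    """Component-wise mean of a list of equal-length vectors."""
    n, d = len(X), len(X[0])
    return [sum(row[j] for row in X) / n for j in range(d)]

def covariance(X, mu):
    """Population covariance matrix (divides by n) of X around mu."""
    n, d = len(X), len(mu)
    C = [[0.0] * d for _ in range(d)]
    for row in X:
        c = [row[j] - mu[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                C[a][b] += c[a] * c[b] / n
    return C

def cholesky(C):
    """Lower-triangular L with C = L L^T (C must be positive definite)."""
    d = len(C)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(C[i][i] - s)
            else:
                L[i][j] = (C[i][j] - s) / L[j][j]
    return L

def whiten(x, mu, L):
    """Return y = L^{-1} (x - mu), computed by forward substitution."""
    d = len(mu)
    y = [0.0] * d
    for i in range(d):
        s = sum(L[i][k] * y[k] for k in range(i))
        y[i] = (x[i] - mu[i] - s) / L[i][i]
    return y

def length_normalize(y):
    """Scale y to unit Euclidean norm."""
    norm = math.sqrt(sum(v * v for v in y))
    if norm == 0.0:
        raise ValueError("cannot length-normalize a zero vector")
    return [v / norm for v in y]

# Toy "development set": hypothetical 3-dimensional stand-ins for i-vectors.
dev = [[1.0, 2.0, 0.5], [2.0, 0.5, 1.5], [0.0, 1.0, 2.5],
       [3.0, 3.0, 1.0], [1.5, 0.0, 0.0]]
mu = mean_vector(dev)
L = cholesky(covariance(dev, mu))

# The statistics estimated on the development set are reused for every vector.
processed = [length_normalize(whiten(x, mu, L)) for x in dev]
```

The whitening statistics (`mu`, `L`) are estimated once on development data and then applied unchanged to evaluation vectors. Projecting every vector onto the unit sphere regardless of its original length is the "blind" normalization that the abstract contrasts with a tuned scaling factor.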
Pages: 908-919
Page count: 12