Nonlinear I-Vector Transformations for PLDA-Based Speaker Recognition

Cited by: 18
Authors
Cumani, Sandro [1]
Laface, Pietro [1]
Affiliation
[1] Politecnico di Torino, Dipartimento di Automatica e Informatica, I-10143 Turin, Italy
Keywords
Density function transformation; i-vectors; probabilistic linear discriminant analysis; speaker recognition
DOI
10.1109/TASLP.2017.2674966
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper proposes to estimate parametric nonlinear transformations of i-vectors for speaker recognition systems based on probabilistic linear discriminant analysis (PLDA) classification. The Gaussian PLDA model assumes that the i-vectors are distributed according to the standard normal distribution. However, it has been shown that i-vectors are better modeled by, for example, heavy-tailed distributions, and that classification performance improves significantly when the i-vectors are whitened and length-normalized. In this paper, we propose to transform the i-vectors so that their distribution becomes more suitable for discriminating speakers with the PLDA model. This is done by means of a sequence of affine and nonlinear transformations whose parameters are obtained by maximum likelihood estimation on the development set. Another contribution of this paper is the reduction of the mismatch between the development and evaluation i-vector length distributions by means of a scaling factor tuned for the estimated i-vector distribution, rather than by means of a blind length normalization. The proposed technique yields relative improvements of 7% to 14% in the detection cost function on the NIST SRE-2010 and SRE-2012 evaluation datasets, using both the traditional GMM/UBM and the hybrid DNN/GMM-based systems.
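For readers unfamiliar with the baseline preprocessing mentioned in the abstract, the following is a minimal sketch of whitening followed by length normalization. It is not the nonlinear transformation proposed in the paper. The function names (`fit_whitener`, `whiten_and_length_normalize`) and the row-per-i-vector array layout are assumptions made for illustration only.

```python
import numpy as np


def fit_whitener(dev):
    """Estimate the mean and a whitening matrix from development i-vectors.

    dev: array of shape (n_vectors, dim), one i-vector per row (assumed layout).
    """
    mu = dev.mean(axis=0)
    cov = np.cov(dev - mu, rowvar=False)
    # Eigendecomposition of the symmetric covariance: cov = V diag(eigval) V^T
    eigval, eigvec = np.linalg.eigh(cov)
    # W = V diag(1 / sqrt(eigval)), so centered data times W has identity covariance
    W = eigvec / np.sqrt(eigval)
    return mu, W


def whiten_and_length_normalize(x, mu, W):
    """Center and whiten i-vectors, then scale each one to unit Euclidean norm."""
    y = (x - mu) @ W
    return y / np.linalg.norm(y, axis=1, keepdims=True)
```

In this sketch the whitening parameters are estimated on the development set only and then applied unchanged to evaluation i-vectors. The final projection onto the unit sphere corresponds to the "blind" length normalization that the abstract contrasts with its tuned scaling factor.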
Pages: 908-919
Number of pages: 12