Front-end Channel Compensation using Mixture-dependent Feature Transformations for i-Vector Speaker Recognition

Authors
Hasan, Taufiq [1]
Hansen, John H. L. [1]
Affiliations
[1] Univ Texas Dallas, CRSS, Eric Jonsson Sch Engn, Richardson, TX 75083 USA
Keywords
PCA; GMM;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
State-of-the-art session variability compensation methods for speaker recognition are generally based on various linear statistical models of the Gaussian Mixture Model (GMM) mean super-vectors, while front-end features are processed only by standard normalization techniques. In this study, we propose a front-end channel compensation framework using mixture-localized linear transforms that operate before super-vector domain modeling begins. In this approach, local linear transforms are trained for each Gaussian component of a Universal Background Model (UBM) and are applied to acoustic features according to their mixture-wise probabilistic alignment, yielding an operation that is globally non-linear. We examine Principal Component Analysis (PCA), whitening, Linear Discriminant Analysis (LDA) and Nuisance Attribute Projection (NAP) as front-end feature transformations. We also propose a method, Nuisance Attribute Elimination (NAE), which is similar to NAP but performs dimensionality reduction in addition to channel compensation. We show that the proposed framework can be readily integrated with a standard i-Vector system by simply applying the transformations to the first-order Baum-Welch statistics and transforming the UBM. Experiments performed on the telephone trials of the NIST SRE 2010 demonstrate significant performance gains from the proposed framework, especially when LDA is used as the front-end transformation.
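The abstract's central idea, applying one linear transform per UBM component according to each frame's posterior alignment, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a diagonal-covariance UBM, centers features and statistics on the UBM means, and leaves the per-component matrices (`transforms`) to the caller, where the paper would train them with PCA, whitening, LDA, NAP or NAE. All function and variable names are hypothetical.

```python
import numpy as np


def ubm_posteriors(X, weights, means, variances):
    """Frame-level posteriors gamma[t, c] under a diagonal-covariance GMM.

    X: (T, D) features; weights: (C,); means, variances: (C, D).
    """
    diff = X[:, None, :] - means[None, :, :]                      # (T, C, D)
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )                                                             # (T, C)
    log_joint = np.log(weights) + log_gauss
    log_joint -= log_joint.max(axis=1, keepdims=True)             # numerical stability
    gamma = np.exp(log_joint)
    return gamma / gamma.sum(axis=1, keepdims=True)


def transform_features(X, gamma, means, transforms):
    """y_t = sum_c gamma[t, c] * A_c (x_t - mu_c).

    Each A_c is linear, but gamma depends on x_t, so the overall map is
    non-linear in the input features. transforms: (C, K, D).
    """
    diff = X[:, None, :] - means[None, :, :]                      # (T, C, D)
    proj = np.einsum("ckd,tcd->tck", transforms, diff)            # (T, C, K)
    return np.einsum("tc,tck->tk", gamma, proj)                   # (T, K)


def transform_first_order_stats(X, gamma, means, transforms):
    """Apply A_c to the centered first-order statistics of component c.

    F[c] = sum_t gamma[t, c] * (x_t - mu_c); returns A_c F[c], shape (C, K).
    """
    F = gamma.T @ X - gamma.sum(axis=0)[:, None] * means          # (C, D)
    return np.einsum("ckd,cd->ck", transforms, F)                 # (C, K)
```

Because each transform is linear within its component, transforming the accumulated centered statistics gives the same result as accumulating statistics over per-component transformed frames. Under the assumptions above, this is why the transforms can be applied to the statistics rather than to every frame.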
Pages: 1090-1093
Page count: 4