Deep Discriminant Analysis for i-vector Based Robust Speaker Recognition

Authors
Wang, Shuai [1]
Huang, Zili [1]
Qian, Yanmin [1]
Yu, Kai [1]
Affiliation
[1] Shanghai Jiao Tong Univ, Brain Sci & Technol Res Ctr, Key Lab Shanghai Educ Commiss Intelligent Interac, SpeechLab,Dept Comp Sci & Engn, Shanghai, Peoples R China
Keywords
robust speaker verification; LDA; PLDA;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Linear Discriminant Analysis (LDA) has been used as a standard post-processing procedure in many state-of-the-art speaker recognition tasks. By maximizing inter-speaker differences and minimizing intra-speaker variation, LDA projects i-vectors to a lower-dimensional and more discriminative subspace. In this paper, we propose a neural network based compensation scheme (termed deep discriminant analysis, DDA) for i-vector based speaker recognition, which shares the underlying idea of LDA. Optimized against softmax loss and center loss at the same time, the proposed method learns a more compact and discriminative embedding space. Whereas LDA assumes Gaussian-distributed data and learns a linear projection, the proposed method makes no assumptions about the data distribution and can learn a non-linear projection function. Experiments are carried out on a short-duration text-independent dataset based on the SRE Corpus, and a noticeable performance improvement is observed over conventional LDA or PLDA methods.
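The joint objective mentioned in the abstract (softmax loss plus center loss) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the batch-mean form of the center loss, and the weighting factor `lam` are assumptions made for illustration, and the abstract does not state how the two terms are balanced.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch (numerically stabilized)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def center_loss(embeddings, labels, centers):
    """Half the mean squared distance between each embedding and its speaker center."""
    diffs = embeddings - centers[labels]
    return 0.5 * (diffs ** 2).sum(axis=1).mean()

def joint_loss(embeddings, logits, labels, centers, lam=0.01):
    """Softmax loss plus lam times center loss; lam is a hypothetical weight."""
    ce = softmax_cross_entropy(logits, labels)
    return ce + lam * center_loss(embeddings, labels, centers)
```

In this sketch, `embeddings` would be the network outputs for a batch of i-vectors, `logits` the speaker classification scores, and `centers` one vector per training speaker. The softmax term pushes speakers apart while the center term pulls each speaker's embeddings together, which mirrors the inter-speaker and intra-speaker criteria of LDA described above.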
Pages: 195-199
Page count: 5