Deep Discriminant Analysis for i-vector Based Robust Speaker Recognition

被引:0
|
作者
Wang, Shuai [1 ]
Huang, Zili [1 ]
Qian, Yanmin [1 ]
Yu, Kai [1 ]
机构
[1] Shanghai Jiao Tong Univ, Brain Sci & Technol Res Ctr, Key Lab Shanghai Educ Commiss Intelligent Interac, SpeechLab,Dept Comp Sci & Engn, Shanghai, Peoples R China
关键词
robust speaker verification; LDA; PLDA;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Linear Discriminant Analysis (LDA) has been used as a standard post-processing procedure in many state-of-the-art speaker recognition tasks. Through maximizing the inter-speaker difference and minimizing the intra-speaker variation, LDA projects i-vectors to a lower-dimensional and more discriminative subspace. In this paper, we propose a neural network based compensation scheme(termed as deep discriminant analysis, DDA) for i-vector based speaker recognition, which shares the idea with LDA. Optimized against softmax loss and center loss at the same time, the proposed method learns a more compact and discriminative embedding space. Compared with the Gaussian distribution assumption of data and the learned linear projection in LDA, the proposed method doesn't pose any assumptions on data and can learn a non-linear projection function. Experiments are carried out on a short-duration text-independent dataset based on the SRE Corpus, noticeable performance improvement can be observed against the normal LDA or PLDA methods.
引用
收藏
页码:195 / 199
页数:5
相关论文
共 50 条
  • [1] Generalized Discriminant Analysis (GDA) for Improved i-Vector Based Speaker Recognition
    Bahmaninezhad, Fahimeh
    Hansen, John H. L.
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3643 - 3647
  • [2] Geometric Discriminant Analysis for I-vector Based Speaker Verification
    Xu, Can
    Chen, Xianhong
    He, Liang
    Liu, Jia
    [J]. 2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 1636 - 1640
  • [3] DEEP BELIEF NETWORKS FOR I-VECTOR BASED SPEAKER RECOGNITION
    Ghahabi, Omid
    Hernando, Javier
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [4] I-VECTOR/PLDA SPEAKER RECOGNITION USING SUPPORT VECTORS WITH DISCRIMINANT ANALYSIS
    Bahmaninezhad, Fahimeh
    Hansen, John H. L.
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5410 - 5414
  • [5] Full multicondition training for robust i-vector based speaker recognition
    Ribas, Dayana
    Vincent, Emmanuel
    Ramon Calvo, Jose
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1057 - 1061
  • [6] I-vector Based Speaker Gender Recognition
    Wang, Minghe
    Chen, Ying
    Tang, Zhenmin
    Zhang, Erhua
    [J]. 2015 IEEE ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC), 2015, : 729 - 732
  • [7] AN IMPROVED UNCERTAINTY PROPAGATION METHOD FOR ROBUST I-VECTOR BASED SPEAKER RECOGNITION
    Ribas, Dayana
    Vincent, Emmanuel
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6331 - 6335
  • [8] i-vector Based Speaker Recognition on Short Utterances
    Kanagasundaram, Ahilan
    Vogt, Robbie
    Dean, David
    Sridharan, Sridha
    Mason, Michael
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2352 - +
  • [9] Discriminant Analysis Methods Comparison in I-Vector Space for Speaker Verification
    Mohammadi, Mohsen
    Mohammadi, Hamid Reza Sadegh
    [J]. 2018 9TH INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS (IST), 2018, : 166 - 172
  • [10] A NOISE ROBUST I-VECTOR EXTRACTOR USING VECTOR TAYLOR SERIES FOR SPEAKER RECOGNITION
    Lei, Yun
    Burget, Lukas
    Scheffer, Nicolas
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 6788 - 6791