IMPROVED SPEAKER RECOGNITION WHEN USING I-VECTORS FROM MULTIPLE SPEECH SOURCES

Cited by: 0
Authors
McLaren, Mitchell [1 ]
van Leeuwen, David [1 ]
Affiliations
[1] Radboud Univ Nijmegen, Ctr Language & Speech Technol, Nijmegen, Netherlands
Keywords
speaker recognition; i-vector; total variability; source conditions; linear discriminant analysis
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The concept of speaker recognition using i-vectors was recently introduced, offering state-of-the-art performance. An i-vector is a compact representation of a speaker's utterance after projection into a low-dimensional total variability subspace trained using factor analysis. A secondary process involving linear discriminant analysis (LDA) is then used to improve the discrimination of i-vectors from different speakers. The newness of this technology raises the question of how best to train the total variability subspace and LDA matrix when using speech collected from distinctly different sources. This paper presents a comparative study of a number of subspace training techniques and a novel source-normalised-and-weighted LDA algorithm for the purpose of improving i-vector-based speaker recognition under mismatched evaluation conditions. Results from the NIST 2010 speaker recognition evaluation (SRE) suggest that accounting for source conditions in the LDA matrix, rather than in the total variability subspace training regime, provides improved robustness to mismatched evaluation conditions.
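As a rough illustration of the LDA stage described in the abstract, the sketch below trains an LDA projection on synthetic vectors standing in for i-vectors. The abstract does not give the algorithm's formulas, so everything here is an assumption made for illustration: the function name `lda_projection`, the choice to centre each speaker mean on the mean of its own source when building the between-speaker scatter, the choice to take the within-speaker scatter as the total scatter minus that between-speaker scatter, and the small `ridge` term. The weighting component of the paper's algorithm is not attempted. With a single source, the sketch reduces to ordinary LDA.

```python
import numpy as np


def lda_projection(X, speakers, sources=None, n_components=2, ridge=1e-6):
    """Return a (dim, n_components) LDA projection for row-vector data X.

    Hypothetical helper, not the paper's implementation. If `sources` is
    given, speaker means are centred on the mean of their own source
    (an assumed form of source normalisation).
    """
    X = np.asarray(X, dtype=float)
    speakers = np.asarray(speakers)
    if sources is None:
        sources = np.zeros(len(X), dtype=int)  # treat all data as one source
    sources = np.asarray(sources)
    dim = X.shape[1]

    # Total scatter about the global mean.
    centred = X - X.mean(axis=0)
    S_t = centred.T @ centred

    # Between-speaker scatter, accumulated separately within each source.
    S_b = np.zeros((dim, dim))
    for src in np.unique(sources):
        in_src = sources == src
        src_mean = X[in_src].mean(axis=0)
        for spk in np.unique(speakers[in_src]):
            rows = X[in_src & (speakers == spk)]
            d = rows.mean(axis=0) - src_mean
            S_b += len(rows) * np.outer(d, d)

    # Assumption: the remaining scatter is treated as within-speaker scatter.
    S_w = S_t - S_b + ridge * np.eye(dim)

    # Whiten with S_w, then diagonalise S_b in the whitened space.
    w_vals, w_vecs = np.linalg.eigh(S_w)
    whiten = w_vecs / np.sqrt(w_vals)
    b_vals, b_vecs = np.linalg.eigh(whiten.T @ S_b @ whiten)
    order = np.argsort(b_vals)[::-1][:n_components]
    return whiten @ b_vecs[:, order]


# Synthetic stand-ins for i-vectors: 5 speakers recorded in 2 sources.
rng = np.random.default_rng(0)
n_spk, n_src, n_utt, dim = 5, 2, 10, 6
spk_offsets = rng.normal(size=(n_spk, dim))
src_offsets = 3.0 * rng.normal(size=(n_src, dim))
X, speakers, sources = [], [], []
for s in range(n_spk):
    for c in range(n_src):
        noise = 0.3 * rng.normal(size=(n_utt, dim))
        X.append(spk_offsets[s] + src_offsets[c] + noise)
        speakers += [s] * n_utt
        sources += [c] * n_utt
X = np.vstack(X)

A = lda_projection(X, speakers, sources, n_components=4)
A_plain = lda_projection(X, speakers, n_components=4)  # ordinary LDA
projected = X @ A
```

Under these assumptions, the source offset falls into the residual scatter rather than the between-speaker scatter, so the projection is discouraged from keeping directions dominated by source differences. Comparing `A` with `A_plain` on held-out data would be one way to check whether that helps in a given setting.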
Pages: 5460-5463
Page count: 4
Related papers
50 in total
  • [21] From single to multiple enrollment i-vectors: Practical PLDA scoring variants for speaker verification
    Rajan, Padmanabhan
    Afanasyev, Anton
    Hautamaki, Ville
    Kinnunen, Tomi
    DIGITAL SIGNAL PROCESSING, 2014, 31 : 93 - 101
  • [22] Speaker Recognition With Random Digit Strings Using Uncertainty Normalized HMM-Based i-Vectors
    Maghsoodi, Nooshin
    Sameti, Hossein
    Zeinali, Hossein
    Stafylakis, Themos
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, 27 (11) : 1815 - 1825
  • [23] Speaker Verification using Sparse Representations on Total Variability I-Vectors
    Li, Ming
    Zhang, Xiang
    Yan, Yonghong
    Narayanan, Shrikanth
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2740 - +
  • [24] HANDLING I-VECTORS FROM DIFFERENT RECORDING CONDITIONS USING MULTI-CHANNEL SIMPLIFIED PLDA IN SPEAKER RECOGNITION
    Villalba, Jesus
    Lleida, Eduardo
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 6763 - 6767
  • [25] Multitaper MFCC and PLP features for speaker verification using i-vectors
    Alam, Md Jahangir
    Kinnunen, Tomi
    Kenny, Patrick
    Ouellet, Pierre
    O'Shaughnessy, Douglas
    SPEECH COMMUNICATION, 2013, 55 (02) : 237 - 251
  • [26] Speaker Adaptation of Neural Network Acoustic Models Using I-Vectors
    Saon, George
    Soltau, Hagen
    Nahamoo, David
    Picheny, Michael
    2013 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2013, : 55 - 59
  • [27] Audio-Visual Speech Separation Using I-Vectors
    Luo, Yiyu
    Wang, Jing
    Wang, Xinyao
    Wen, Liang
    Wang, Lizhong
    2019 2ND IEEE INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND SIGNAL PROCESSING (ICICSP), 2019, : 276 - 280
  • [28] Linguistically-constrained formant-based i-vectors for automatic speaker recognition
    Franco-Pedroso, Javier
    Gonzalez-Rodriguez, Joaquin
    SPEECH COMMUNICATION, 2016, 76 : 61 - 81
  • [29] Denoised Senone I-Vectors for Robust Speaker Verification
    Tan, Zhili
    Mak, Man-Wai
    Mak, Brian Kan-Wing
    Zhu, Yingke
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (04) : 820 - 830
  • [30] DISCRIMINATIVELY TRAINED BAYESIAN SPEAKER COMPARISON OF I-VECTORS
    Borgstroem, Bengt J.
    McCree, Alan
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7659 - 7662