Co-whitening of i-vectors for short and long duration speaker verification

Cited by: 0
Authors
Xu, Longting [1 ]
Lee, Kong Aik [2 ]
Li, Haizhou [1 ]
Yang, Zhen [3 ]
Affiliations
[1] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
[2] NEC Corp Ltd, Data Sci Res Labs, Tokyo, Japan
[3] Nanjing Univ Posts & Telecommun, Broadband Wireless Commun & Sensor Network Techno, Nanjing, Jiangsu, Peoples R China
Keywords
Speaker recognition; co-whitening; short duration; i-vector; text-independent; canonical correlation analysis;
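DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract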
An i-vector is a fixed-length, low-rank representation of a speech utterance. It has been used extensively in text-independent speaker verification. Ideally, speech utterances from the same speaker would map to a unique i-vector. However, this is not the case, owing to intrinsic and extrinsic factors such as the physical condition of the speaker, channel differences, noise and, notably, the duration of the speech utterances. In particular, we found that i-vectors extracted from short utterances exhibit larger variance than those extracted from long utterances. To address the problem, we propose a co-whitening approach that takes duration into account while maximizing the correlation between the i-vectors of short and long duration. The proposed co-whitening method is derived from canonical correlation analysis (CCA). Experimental results on NIST SRE 2010 show that the co-whitening method is effective in compensating for the duration mismatch, leading to a reduction of up to 13.07% in equal error rate (EER).
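As a rough illustration of the idea in the abstract, the sketch below applies textbook CCA to paired short- and long-duration i-vectors: each set is whitened with its own covariance and then rotated so that the cross-covariance between the two sets becomes diagonal. This is a generic CCA construction, not necessarily the exact formulation of the paper; the function names `inv_sqrt` and `co_whiten_cca`, the regularization term `reg`, and the assumption that training i-vectors come as row-aligned pairs from the same recordings are all hypothetical choices made for this example.

```python
import numpy as np


def inv_sqrt(cov):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(cov)
    return (vecs / np.sqrt(vals)) @ vecs.T


def co_whiten_cca(short_iv, long_iv, reg=1e-6):
    """Estimate paired whitening transforms with standard CCA.

    short_iv, long_iv: arrays of shape (n, d); row k of each is assumed
    to come from the same recording (hypothetical training setup).
    Returns the two means, the two projection matrices, and the
    canonical correlations.
    """
    n, d = short_iv.shape
    mu_s, mu_l = short_iv.mean(axis=0), long_iv.mean(axis=0)
    xs, xl = short_iv - mu_s, long_iv - mu_l

    # Duration-specific covariances (lightly regularized) and cross-covariance.
    c_ss = xs.T @ xs / (n - 1) + reg * np.eye(d)
    c_ll = xl.T @ xl / (n - 1) + reg * np.eye(d)
    c_sl = xs.T @ xl / (n - 1)

    # Whiten each set separately, then rotate with the SVD of the
    # whitened cross-covariance so that matching dimensions are
    # maximally correlated.
    w_s, w_l = inv_sqrt(c_ss), inv_sqrt(c_ll)
    u, corr, vt = np.linalg.svd(w_s @ c_sl @ w_l)
    return mu_s, w_s @ u, mu_l, w_l @ vt.T, corr
```

In use, a short-duration i-vector `x` would be mapped to `(x - mu_s) @ proj_s` and a long-duration one to `(y - mu_l) @ proj_l` before scoring; each projected set then has approximately identity covariance, which is what lets a single transform pair both whiten the data and account for the duration mismatch.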
Pages: 1066 - 1070
Number of pages: 5
Related papers
50 in total
  • [41] I-vectors and ILP clustering adapted to cross-show speaker diarization
    Dupuy, Gregor
    Rouvier, Mickael
    Meignier, Sylvain
    Esteve, Yannick
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 2171 - 2174
  • [42] SPEAKER-PHONETIC VECTOR ESTIMATION FOR SHORT DURATION SPEAKER VERIFICATION
    Ma, Jianbo
    Sethu, Vidhyasaharan
    Ambikairajah, Eliathamby
    Lee, Kong Aik
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5264 - 5268
  • [43] IMPROVED SPEAKER RECOGNITION WHEN USING I-VECTORS FROM MULTIPLE SPEECH SOURCES
    McLaren, Mitchell
    van Leeuwen, David
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5460 - 5463
  • [44] Speaker Adaptive Training of Deep Neural Network Acoustic Models Using I-Vectors
    Miao, Yajie
    Zhang, Hao
    Metze, Florian
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (11) : 1938 - 1949
  • [45] SOURCE-NORMALISED-AND-WEIGHTED LDA FOR ROBUST SPEAKER RECOGNITION USING I-VECTORS
    McLaren, Mitchell
    van Leeuwen, David
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5456 - 5459
  • [46] Speaker Adaptive Training of Deep Neural Network Acoustic Models Using I-Vectors
    Miao, Yajie
    Zhang, Hao
    Metze, Florian
    IEEE Transactions on Audio, Speech and Language Processing, 2015, 23 (11) : 1938 - 1949
  • [47] Robust Speaker Recognition Using MAP Estimation of Additive Noise in i-vectors Space
    Ben Kheder, Waad
    Matrouf, Driss
    Bousquet, Pierre-Michel
    Bonastre, Jean-Francois
    Ajili, Moez
    STATISTICAL LANGUAGE AND SPEECH PROCESSING, SLSP 2014, 2014, 8791 : 97 - 107
  • [48] VOICE VERIFICATION USING I-VECTORS AND NEURAL NETWORKS WITH LIMITED TRAINING DATA
    Mamyrbayev, O. Zh.
    Othman, M.
    Akhmediyarova, A. T.
    Kydyrbekova, A. S.
    Mekebayev, N. O.
    BULLETIN OF THE NATIONAL ACADEMY OF SCIENCES OF THE REPUBLIC OF KAZAKHSTAN, 2019, (03) : 36 - 43
  • [49] Accounting For Uncertainty of i-vectors in Speaker Recognition Using Uncertainty Propagation and Modified Imputation
    Saeidi, Rahim
    Alku, Paavo
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3546 - 3550
  • [50] Linguistically-constrained formant-based i-vectors for automatic speaker recognition
    Franco-Pedroso, Javier
    Gonzalez-Rodriguez, Joaquin
    SPEECH COMMUNICATION, 2016, 76 : 61 - 81