Co-whitening of i-vectors for short and long duration speaker verification

Cited by: 0

Authors:

Xu, Longting [1]

Lee, Kong Aik [2]

Li, Haizhou [1]

Yang, Zhen [3]

Affiliations:

[1] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore

[2] NEC Corp Ltd, Data Sci Res Labs, Tokyo, Japan

[3] Nanjing Univ Posts & Telecommun, Broadband Wireless Commun & Sensor Network Techno, Nanjing, Jiangsu, Peoples R China

Source:

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018

Keywords:

Speaker recognition; co-whitening; short duration; i-vector; text-independent; canonical correlation analysis

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

An i-vector is a fixed-length, low-rank representation of a speech utterance. It has been used extensively in text-independent speaker verification. Ideally, speech utterances from the same speaker would map to a unique i-vector. However, this is not the case, owing to intrinsic and extrinsic factors such as the physical condition of the speaker, channel differences, noise, and notably the duration of the speech utterances. In particular, we found that i-vectors extracted from short utterances exhibit larger variance than those extracted from long utterances. To address this problem, we propose a co-whitening approach that takes the duration into account while maximizing the correlation between the i-vectors of short and long duration. The proposed co-whitening method is derived from canonical correlation analysis (CCA). Experimental results on NIST SRE 2010 show that the co-whitening method is effective in compensating for the duration mismatch, leading to a reduction of up to 13.07% in equal error rate (EER).
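
As a rough illustration of the idea in the abstract, the following Python sketch uses plain CCA to whiten paired short- and long-duration vectors jointly. It is not the authors' derivation: the function names (`inv_sqrt`, `cca_co_whiten`), the ridge term `reg`, and the synthetic data standing in for i-vectors are all assumptions made for this example.

```python
import numpy as np


def inv_sqrt(cov, eps=1e-8):
    """Inverse matrix square root of a symmetric positive semi-definite matrix."""
    w, v = np.linalg.eigh(cov)
    w = np.maximum(w, eps)  # guard against tiny or negative eigenvalues
    return (v / np.sqrt(w)) @ v.T


def cca_co_whiten(x_short, y_long, reg=1e-6):
    """Plain CCA on paired rows of x_short and y_long (hypothetical helper).

    Returns the two means, the two projection matrices, and the canonical
    correlations. After projection each set has (approximately) identity
    covariance and the cross-covariance between the sets is diagonal.
    """
    mx, my = x_short.mean(axis=0), y_long.mean(axis=0)
    xc, yc = x_short - mx, y_long - my
    n = xc.shape[0]
    # Sample covariances, with a small ridge term (an assumption) for stability.
    cxx = xc.T @ xc / (n - 1) + reg * np.eye(xc.shape[1])
    cyy = yc.T @ yc / (n - 1) + reg * np.eye(yc.shape[1])
    cxy = xc.T @ yc / (n - 1)
    wx, wy = inv_sqrt(cxx), inv_sqrt(cyy)
    # The SVD of the whitened cross-covariance gives the canonical directions.
    u, s, vt = np.linalg.svd(wx @ cxy @ wy)
    return mx, my, wx @ u, wy @ vt.T, s


# Synthetic stand-in for i-vectors: the "short" vectors are the "long" ones
# plus extra noise, so they have larger variance.
rng = np.random.default_rng(0)
long_vecs = rng.standard_normal((500, 4))
short_vecs = long_vecs + 0.8 * rng.standard_normal((500, 4))

mx, my, a, b, corrs = cca_co_whiten(short_vecs, long_vecs)
z_short = (short_vecs - mx) @ a  # whitened short-duration vectors
z_long = (long_vecs - my) @ b    # whitened long-duration vectors
print("canonical correlations:", corrs)
```

In this sketch the two sets receive separate projections, so the larger variance of the short-duration vectors is normalized independently of the long-duration set, while the SVD step aligns the directions in which the two sets are most correlated.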


Pages: 1066-1070

Page count: 5
