Co-whitening of i-vectors for short and long duration speaker verification

Cited by: 0

Authors:

Xu, Longting [1]

Lee, Kong Aik [2]

Li, Haizhou [1]

Yang, Zhen [3]

Affiliations:

[1] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore

[2] NEC Corp Ltd, Data Sci Res Labs, Tokyo, Japan

[3] Nanjing Univ Posts & Telecommun, Broadband Wireless Commun & Sensor Network Techno, Nanjing, Jiangsu, Peoples R China

Source:

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018

Keywords:

Speaker recognition; co-whitening; short duration; i-vector; text-independent; canonical correlation analysis;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

An i-vector is a fixed-length, low-rank representation of a speech utterance. It has been used extensively in text-independent speaker verification. Ideally, speech utterances from the same speaker would map to a unique i-vector. In practice this is not the case, owing to intrinsic and extrinsic factors such as the physical condition of the speaker, channel differences, noise and, notably, the duration of the speech utterances. In particular, we found that i-vectors extracted from short utterances exhibit larger variance than those extracted from long utterances. To address this problem, we propose a co-whitening approach that takes duration into account while maximizing the correlation between the i-vectors of short and long utterances. The proposed co-whitening method is derived from canonical correlation analysis (CCA). Experimental results on NIST SRE 2010 show that the co-whitening method is effective in compensating for the duration mismatch, leading to a reduction of up to 13.07% in equal error rate (EER).
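
The abstract states that co-whitening is derived from CCA but gives no formulas, so the following is only a minimal sketch of textbook CCA applied to paired short- and long-duration vectors. It is not the authors' implementation. The function names (`inv_sqrt`, `fit_cca_whitening`), the synthetic data, and the assumption that each short-utterance vector is paired with a long-utterance vector from the same recording are all illustrative choices that do not come from the source.

```python
import numpy as np


def inv_sqrt(cov, eps=1e-8):
    """Inverse square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(cov)
    vals = np.maximum(vals, eps)  # guard against tiny or negative eigenvalues
    return (vecs / np.sqrt(vals)) @ vecs.T


def fit_cca_whitening(short, long_):
    """Fit textbook CCA on paired rows of `short` and `long_` (n x d arrays).

    Returns the two means, the two projection matrices and the canonical
    correlations. Each projected set has identity covariance, and the
    cross-covariance between the two projected sets is diagonal.
    """
    mu_s, mu_l = short.mean(axis=0), long_.mean(axis=0)
    xs, xl = short - mu_s, long_ - mu_l
    n = xs.shape[0]
    c_ss = xs.T @ xs / (n - 1)  # covariance of short-duration vectors
    c_ll = xl.T @ xl / (n - 1)  # covariance of long-duration vectors
    c_sl = xs.T @ xl / (n - 1)  # cross-covariance between the two sets
    w_s, w_l = inv_sqrt(c_ss), inv_sqrt(c_ll)
    # SVD of the whitened cross-covariance gives the canonical directions.
    u, rho, vt = np.linalg.svd(w_s @ c_sl @ w_l)
    return mu_s, mu_l, w_s @ u, w_l @ vt.T, rho


# Synthetic stand-in for i-vectors (hypothetical data, not NIST SRE 2010):
# the "short" vectors are the "long" ones plus extra noise, so they have
# larger variance, mimicking the effect described in the abstract.
rng = np.random.default_rng(0)
n_pairs, dim = 2000, 5
long_vecs = rng.normal(size=(n_pairs, dim)) @ rng.normal(size=(dim, dim))
short_vecs = long_vecs + 1.5 * rng.normal(size=(n_pairs, dim))

mu_s, mu_l, a, b, rho = fit_cca_whitening(short_vecs, long_vecs)
z_short = (short_vecs - mu_s) @ a  # whitened short-duration vectors
z_long = (long_vecs - mu_l) @ b    # whitened long-duration vectors
print("canonical correlations:", rho)
```

In this sketch each duration condition gets its own transform, which is one plausible reading of "taking into account the duration"; how the paper chooses between transforms at test time is not stated in the abstract.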


Pages: 1066 - 1070

Page count: 5

50 entries in total

[21] Probabilistic approach using joint long and short session i-vectors modeling to deal with short utterances for speaker recognition
Ben Kheder, Waad
Matrouf, Driss
Ajili, Moez
Bonastre, Jean-Francois
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1830 - 1834
[22] I-vectors meet imitators: on vulnerability of speaker verification systems against voice mimicry
Hautamaki, Rosa Gonzalez
Kinnunen, Tomi
Hautamaki, Ville
Leino, Timo
Laukkanen, Anne-Maria
14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 930 - 934
[23] Speaker age classification and regression using i-vectors
Grzybowska, Joanna
Kacprzak, Stanislaw
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1402 - 1406
[24] From single to multiple enrollment i-vectors: Practical PLDA scoring variants for speaker verification
Rajan, Padmanabhan
Afanasyev, Anton
Hautamaki, Ville
Kinnunen, Tomi
DIGITAL SIGNAL PROCESSING, 2014, 31 : 93 - 101
[25] Discriminative Scoring for Speaker Recognition Based on I-vectors
Wang, Jun
Wang, Dong
Zhu, Ziwei
Zheng, Thomas Fang
Soong, Frank
2014 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2014
[26] DISCRIMINATIVELY TRAINED BAYESIAN SPEAKER COMPARISON OF I-VECTORS
Borgstroem, Bengt J.
McCree, Alan
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7659 - 7662
[27] APPLYING COMPENSATION TECHNIQUES ON I-VECTORS EXTRACTED FROM SHORT-TEST UTTERANCES FOR SPEAKER VERIFICATION USING DEEP NEURAL NETWORK
Yang, Il-Ho
Heo, Hee-Soo
Yoon, Sung-Hyun
Yu, Ha-Jin
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5490 - 5494
[28] Speaker Diarization with I-Vectors from DNN Senone Posteriors
Sell, Gregory
Garcia-Romero, Daniel
McCree, Alan
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3096 - 3099
[29] Applications of UBMs and I-Vectors in EEG Subject Verification
Ward, Christian
Picone, Joseph
Obeid, Iyad
2016 38TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2016, : 748 - 751
[30] Client-wise cohort set selection by combining speaker- and phoneme-specific I-vectors for speaker verification
Ahmad, Waquar
Karnick, Harish
Hegde, Rajesh M.
MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (07) : 8273 - 8294
