Co-whitening of i-vectors for short and long duration speaker verification

Cited by: 0
Authors
Xu, Longting [1]
Lee, Kong Aik [2]
Li, Haizhou [1]
Yang, Zhen [3]
Affiliations
[1] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore, Singapore
[2] NEC Corp Ltd, Data Sci Res Labs, Tokyo, Japan
[3] Nanjing Univ Posts & Telecommun, Broadband Wireless Commun & Sensor Network Techno, Nanjing, Jiangsu, Peoples R China
Keywords
Speaker recognition; co-whitening; short duration; i-vector; text-independent; canonical correlation analysis;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
An i-vector is a fixed-length, low-rank representation of a speech utterance. It has been used extensively in text-independent speaker verification. Ideally, speech utterances from the same speaker would map to a unique i-vector. However, this is not the case, owing to intrinsic and extrinsic factors such as the physical condition of the speaker, channel differences, noise and, notably, the duration of the speech utterances. In particular, we found that i-vectors extracted from short utterances exhibit larger variance than those extracted from long utterances. To address this problem, we propose a co-whitening approach that takes duration into account while maximizing the correlation between the i-vectors of short and long duration. The proposed co-whitening method is derived from canonical correlation analysis (CCA). Experimental results on NIST SRE 2010 show that the co-whitening method is effective in compensating for the duration mismatch, leading to a reduction of up to 13.07% in equal error rate (EER).
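The abstract states only that co-whitening is derived from CCA; the record does not give the formulation. As a rough illustration, the sketch below applies textbook CCA to paired short and long i-vectors: each set is whitened with its own covariance, then rotated so that the cross-covariance between the two sets becomes diagonal. It is not the authors' exact method. The function names (`fit_co_whitening`, `project`), the eigenvalue floor `eps`, and the synthetic data are assumptions made for this example.

```python
import numpy as np


def inv_sqrt(cov, eps=1e-8):
    """Symmetric inverse square root of a covariance matrix."""
    vals, vecs = np.linalg.eigh(cov)
    vals = np.maximum(vals, eps)  # floor tiny eigenvalues (assumed safeguard)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T


def fit_co_whitening(short_iv, long_iv, eps=1e-8):
    """Fit CCA projections on paired i-vectors (row k = same utterance source).

    Hypothetical helper: returns means, one projection per duration
    condition, and the canonical correlations rho.
    """
    mu_s, mu_l = short_iv.mean(axis=0), long_iv.mean(axis=0)
    xs, xl = short_iv - mu_s, long_iv - mu_l
    n = xs.shape[0]
    c_ss = xs.T @ xs / (n - 1)  # covariance of short i-vectors
    c_ll = xl.T @ xl / (n - 1)  # covariance of long i-vectors
    c_sl = xs.T @ xl / (n - 1)  # cross-covariance short vs. long
    w_s, w_l = inv_sqrt(c_ss, eps), inv_sqrt(c_ll, eps)
    # SVD of the whitened cross-covariance gives the canonical directions.
    u, rho, vt = np.linalg.svd(w_s @ c_sl @ w_l)
    return {"mu_short": mu_s, "proj_short": w_s @ u,
            "mu_long": mu_l, "proj_long": w_l @ vt.T, "rho": rho}


def project(ivectors, mean, proj):
    """Center and project i-vectors with a fitted transform."""
    return (ivectors - mean) @ proj


# Synthetic stand-in data: short i-vectors are noisier than long ones.
rng = np.random.default_rng(0)
speaker = rng.normal(size=(2000, 5))
long_iv = speaker + 0.1 * rng.normal(size=(2000, 5))
short_iv = speaker + 0.8 * rng.normal(size=(2000, 5))

model = fit_co_whitening(short_iv, long_iv)
z_short = project(short_iv, model["mu_short"], model["proj_short"])
z_long = project(long_iv, model["mu_long"], model["proj_long"])
```

After projection, both sets have identity covariance on the training data, so the larger variance of the short-utterance i-vectors is normalized separately from that of the long ones, while `model["rho"]` holds the correlation retained along each canonical direction.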
Pages: 1066 - 1070
Number of pages: 5
Related papers
50 in total
  • [21] Probabilistic approach using joint long and short session i-vectors modeling to deal with short utterances for speaker recognition
    Ben Kheder, Waad
    Matrouf, Driss
    Ajili, Moez
    Bonastre, Jean-Francois
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1830 - 1834
  • [22] I-vectors meet imitators: on vulnerability of speaker verification systems against voice mimicry
    Hautamaki, Rosa Gonzalez
    Kinnunen, Tomi
    Hautamaki, Ville
    Leino, Timo
    Laukkanen, Anne-Maria
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 930 - 934
  • [23] Speaker age classification and regression using i-vectors
    Grzybowska, Joanna
    Kacprzak, Stanislaw
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1402 - 1406
  • [24] From single to multiple enrollment i-vectors: Practical PLDA scoring variants for speaker verification
    Rajan, Padmanabhan
    Afanasyev, Anton
    Hautamaki, Ville
    Kinnunen, Tomi
    DIGITAL SIGNAL PROCESSING, 2014, 31 : 93 - 101
  • [25] Discriminative Scoring for Speaker Recognition Based on I-vectors
    Wang, Jun
    Wang, Dong
    Zhu, Ziwei
    Zheng, Thomas Fang
    Soong, Frank
    2014 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2014,
  • [26] DISCRIMINATIVELY TRAINED BAYESIAN SPEAKER COMPARISON OF I-VECTORS
    Borgstroem, Bengt J.
    McCree, Alan
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7659 - 7662
  • [27] APPLYING COMPENSATION TECHNIQUES ON I-VECTORS EXTRACTED FROM SHORT-TEST UTTERANCES FOR SPEAKER VERIFICATION USING DEEP NEURAL NETWORK
    Yang, Il-Ho
    Heo, Hee-Soo
    Yoon, Sung-Hyun
    Yu, Ha-Jin
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5490 - 5494
  • [28] Speaker Diarization with I-Vectors from DNN Senone Posteriors
    Sell, Gregory
    Garcia-Romero, Daniel
    McCree, Alan
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3096 - 3099
  • [29] Applications of UBMs and I-Vectors in EEG Subject Verification
    Ward, Christian
    Picone, Joseph
    Obeid, Iyad
    2016 38TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2016, : 748 - 751
  • [30] Client-wise cohort set selection by combining speaker- and phoneme-specific I-vectors for speaker verification
    Ahmad, Waquar
    Karnick, Harish
    Hegde, Rajesh M.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (07) : 8273 - 8294