I-vector based speaker recognition using advanced channel compensation techniques

Cited by: 24
Authors
Kanagasundaram, Ahilan [1 ]
Dean, David [1 ]
Sridharan, Sridha [1 ]
McLaren, Mitchell [2 ]
Vogt, Robbie [1 ]
Affiliations
[1] Queensland Univ Technol, SAIVT, Speech & Audio Res Lab, Brisbane, Qld 4001, Australia
[2] Radboud Univ Nijmegen, Ctr Language & Speech Technol, NL-6525 ED Nijmegen, Netherlands
Source
COMPUTER SPEECH AND LANGUAGE | 2014 / Vol. 28 / No. 1
Funding
Australian Research Council;
Keywords
Speaker verification; I-vector; GPLDA; LDA; SN-LDA; WLDA; SN-WLDA; LINEAR DISCRIMINANT-ANALYSIS;
DOI
10.1016/j.csl.2013.04.002
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper investigates advanced channel compensation techniques for improving i-vector speaker verification performance in the presence of high intersession variability, using the NIST 2008 and 2010 SRE corpora. Four channel compensation techniques are investigated: (a) weighted maximum margin criterion (WMMC), (b) source-normalized WMMC (SN-WMMC), (c) weighted linear discriminant analysis (WLDA) and (d) source-normalized WLDA (SN-WLDA). We show that, by extracting the discriminatory information between pairs of speakers as well as capturing the source variation information in the development i-vector space, the SN-WLDA based cosine similarity scoring (CSS) i-vector system provides over 20% improvement in EER for NIST 2008 interview and microphone verification and over 10% improvement in EER for NIST 2008 telephone verification, when compared to the SN-LDA based CSS i-vector system. Further, score-level fusion techniques are analyzed to combine the best channel compensation approaches, providing over 8% improvement in DCF over the best single approach, SN-WLDA, for the NIST 2008 interview/telephone enrolment-verification condition. Finally, we demonstrate that the improvements found in the context of CSS also generalize to state-of-the-art GPLDA, with up to 14% relative improvement in EER for NIST SRE 2010 interview and microphone verification and over 7% relative improvement in EER for NIST SRE 2010 telephone verification. © 2013 Elsevier Ltd. All rights reserved.
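As a rough illustration of the kind of pipeline the abstract describes, the sketch below estimates a channel compensation projection from labelled development i-vectors and then applies cosine similarity scoring (CSS) in the projected space. It is a minimal sketch under stated assumptions: it uses plain LDA, not the weighted or source-normalized variants studied in the paper; the function names `lda_projection` and `cosine_score` are hypothetical; and the "i-vectors" are synthetic Gaussian toy data, not NIST SRE data.

```python
import numpy as np


def lda_projection(ivectors, speaker_ids, n_components):
    """Estimate a plain LDA projection (assumption: not WLDA or SN-WLDA)."""
    dim = ivectors.shape[1]
    global_mean = ivectors.mean(axis=0)
    s_b = np.zeros((dim, dim))  # between-speaker scatter
    s_w = np.zeros((dim, dim))  # within-speaker scatter
    for spk in np.unique(speaker_ids):
        x = ivectors[speaker_ids == spk]
        mean = x.mean(axis=0)
        diff = (mean - global_mean)[:, None]
        s_b += len(x) * (diff @ diff.T)
        centered = x - mean
        s_w += centered.T @ centered
    # Small ridge term (an assumption) keeps s_w positive definite on toy data.
    s_w += 1e-6 * np.eye(dim)
    # Solve S_b v = lambda S_w v by whitening with the Cholesky factor of S_w.
    inv_chol = np.linalg.inv(np.linalg.cholesky(s_w))
    eigvals, eigvecs = np.linalg.eigh(inv_chol @ s_b @ inv_chol.T)
    order = np.argsort(eigvals)[::-1][:n_components]
    return inv_chol.T @ eigvecs[:, order]


def cosine_score(w_target, w_test, projection):
    """Cosine similarity between two i-vectors after channel compensation."""
    a = projection.T @ w_target
    b = projection.T @ w_test
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic toy data: 5 speakers, 20 sessions each, 10-dimensional vectors.
rng = np.random.default_rng(0)
n_speakers, sessions, dim = 5, 20, 10
speaker_means = 3.0 * rng.normal(size=(n_speakers, dim))
speaker_ids = np.repeat(np.arange(n_speakers), sessions)
ivectors = speaker_means[speaker_ids] + rng.normal(size=(n_speakers * sessions, dim))

projection = lda_projection(ivectors, speaker_ids, n_components=n_speakers - 1)
score = cosine_score(ivectors[0], ivectors[1], projection)
self_score = cosine_score(ivectors[0], ivectors[0], projection)
```

In a verification setting, `score` would be compared against a decision threshold; how that threshold is chosen, and the weighting and source normalization that define the techniques in the paper, are outside the scope of this sketch.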
Pages: 121-140
Page count: 20
Related papers
50 in total
  • [1] Neural Networks based Channel Compensation for I-Vector Speaker Verification
    Rao, Wei
    Xiao, Xiong
    Xu, Chenglin
    Xu, Haihua
    Lee, Kong Aik
    Chng, Eng Siong
    Li, Haizhou
    [J]. 2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2016,
  • [2] DURATION MISMATCH COMPENSATION FOR I-VECTOR BASED SPEAKER RECOGNITION SYSTEMS
    Hasan, Taufiq
    Saeidi, Rahim
    Hansen, John H. L.
    van Leeuwen, David A.
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7663 - 7667
  • [3] ADDITIVE NOISE COMPENSATION IN THE I-VECTOR SPACE FOR SPEAKER RECOGNITION
    Ben Kheder, Waad
    Matrouf, Driss
    Bonastre, Jean-Francois
    Ajili, Moez
    Bousquet, Pierre-Michel
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4190 - 4194
  • [4] I-vector Based Speaker Gender Recognition
    Wang, Minghe
    Chen, Ying
    Tang, Zhenmin
    Zhang, Erhua
    [J]. 2015 IEEE ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC), 2015, : 729 - 732
  • [5] i-vector Based Speaker Recognition on Short Utterances
    Kanagasundaram, Ahilan
    Vogt, Robbie
    Dean, David
    Sridharan, Sridha
    Mason, Michael
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2352 - +
  • [6] Sparsity Analysis and Compensation for i-Vector Based Speaker Verification
    Li, Wei
    Fu, Tian Fan
    Zhu, Jie
    Chen, Ning
    [J]. SPEECH AND COMPUTER (SPECOM 2015), 2015, 9319 : 381 - 388
  • [7] Front-end Channel Compensation using Mixture-dependent Feature Transformations for i-Vector Speaker Recognition
    Hasan, Taufiq
    Hansen, John H. L.
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1090 - 1093
  • [8] A Comparison of Covariance Matrix and i-vector Based Speaker Recognition
    Jakovljevic, Niksa
    Jokic, Ivan
    Josic, Slobodan
    Delic, Vlado
    [J]. SPEECH AND COMPUTER, SPECOM 2017, 2017, 10458 : 37 - 45
  • [9] DEEP BELIEF NETWORKS FOR I-VECTOR BASED SPEAKER RECOGNITION
    Ghahabi, Omid
    Hernando, Javier
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [10] I-vector Extraction for Speaker Recognition Based on Dimensionality Reduction
    Ibrahim, Noor Salwani
    Ramli, Dzati Athiar
    [J]. KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS (KES-2018), 2018, 126 : 1534 - 1540