I-vector based speaker recognition using advanced channel compensation techniques

Cited by: 24
Authors
Kanagasundaram, Ahilan [1 ]
Dean, David [1 ]
Sridharan, Sridha [1 ]
McLaren, Mitchell [2 ]
Vogt, Robbie [1 ]
Affiliations
[1] Queensland Univ Technol, SAIVT, Speech & Audio Res Lab, Brisbane, Qld 4001, Australia
[2] Radboud Univ Nijmegen, Ctr Language & Speech Technol, NL-6525 ED Nijmegen, Netherlands
Source
COMPUTER SPEECH AND LANGUAGE | 2014 / Vol. 28 / Issue 01
Funding
Australian Research Council;
Keywords
Speaker verification; I-vector; GPLDA; LDA; SN-LDA; WLDA; SN-WLDA; LINEAR DISCRIMINANT-ANALYSIS;
DOI
10.1016/j.csl.2013.04.002
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper investigates advanced channel compensation techniques for improving i-vector speaker verification performance in the presence of high intersession variability, using the NIST 2008 and 2010 SRE corpora. The performance of four channel compensation techniques is investigated: (a) weighted maximum margin criterion (WMMC), (b) source-normalized WMMC (SN-WMMC), (c) weighted linear discriminant analysis (WLDA) and (d) source-normalized WLDA (SN-WLDA). We show that, by extracting the discriminatory information between pairs of speakers as well as capturing the source variation information in the development i-vector space, the SN-WLDA based cosine similarity scoring (CSS) i-vector system provides over 20% improvement in EER for NIST 2008 interview and microphone verification and over 10% improvement in EER for NIST 2008 telephone verification, when compared to the SN-LDA based CSS i-vector system. Further, score-level fusion techniques are analyzed to combine the best channel compensation approaches, providing over 8% improvement in DCF over the best single approach, SN-WLDA, for the NIST 2008 interview/telephone enrolment-verification condition. Finally, we demonstrate that the improvements found in the context of CSS also generalize to state-of-the-art GPLDA, with up to 14% relative improvement in EER for NIST SRE 2010 interview and microphone verification and over 7% relative improvement in EER for NIST SRE 2010 telephone verification. (C) 2013 Elsevier Ltd. All rights reserved.
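As a rough illustration of the cosine similarity scoring (CSS) mentioned in the abstract, the sketch below projects two i-vectors with a channel compensation matrix and scores them by the cosine of the angle between the projections. It is a minimal sketch, not the authors' implementation: the function name `css_score`, the 400-dimensional i-vectors, the 150-dimensional projection, and the random matrix `A` (standing in for a trained LDA, WLDA or SN-WLDA projection) are all assumptions made for illustration.

```python
import numpy as np


def css_score(w_target, w_test, A):
    """Cosine similarity of two i-vectors after projection by A (hypothetical helper)."""
    p_target = A.T @ w_target  # channel-compensated target i-vector
    p_test = A.T @ w_test      # channel-compensated test i-vector
    denom = np.linalg.norm(p_target) * np.linalg.norm(p_test)
    return float(p_target @ p_test / denom)


# Random placeholders; a real system would estimate A from development i-vectors.
rng = np.random.default_rng(0)
A = rng.standard_normal((400, 150))   # assumed dimensions: 400 -> 150
w_enrol = rng.standard_normal(400)    # assumed enrolment i-vector
w_test = rng.standard_normal(400)     # assumed test i-vector

score = css_score(w_enrol, w_test, A)
same = css_score(w_enrol, 2.0 * w_enrol, A)  # scaling does not change the angle
```

The score always lies in [-1, 1] and is compared against a decision threshold. Because it depends only on direction, rescaling an i-vector leaves it unchanged.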
Pages: 121 - 140
Page count: 20
Related papers
50 items in total
  • [31] GENDER INDEPENDENT DISCRIMINATIVE SPEAKER RECOGNITION IN I-VECTOR SPACE
    Cumani, Sandro
    Glembek, Ondrej
    Bruemmer, Niko
    de Villiers, Edward
    Laface, Pietro
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4361 - 4364
  • [32] Speaker recognition based on discriminant i-vector local distance preserving projection
    Li, Zhiyi
    He, Liang
    Zhang, Weiqiang
    Liu, Jia
    [J]. Qinghua Daxue Xuebao/Journal of Tsinghua University, 2012, 52 (05): 598 - 601
  • [33] DEALING WITH ADDITIVE NOISE IN SPEAKER RECOGNITION SYSTEMS BASED ON I-VECTOR APPROACH
    Matrouf, D.
    Ben Kheder, W.
    Bousquet, P-M.
    Ajili, M.
    Bonastre, J-F.
    [J]. 2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 2092 - 2096
  • [34] AN IMPROVED UNCERTAINTY PROPAGATION METHOD FOR ROBUST I-VECTOR BASED SPEAKER RECOGNITION
    Ribas, Dayana
    Vincent, Emmanuel
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6331 - 6335
  • [35] Analysis of I-vector Length Normalization in Speaker Recognition Systems
    Garcia-Romero, Daniel
    Espy-Wilson, Carol Y.
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 256 - 259
  • [36] Noise Compensation in i-vector Space Using Linear Regression for Robust Speaker Verification
    Baby, Renjith
    Kumar, C. Santhosh
    George, Kuruvachan K.
    Panda, Ashish
    [J]. PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON MULTIMEDIA, SIGNAL PROCESSING AND COMMUNICATION TECHNOLOGIES (IMPACT), 2017, : 161 - 165
  • [37] An Advanced Channel Compensation Method for Speaker Recognition
    Imamverdiyev, Yadigar
    Sukhostat, Lyudmila
    [J]. 2013 7TH INTERNATIONAL CONFERENCE ON APPLICATION OF INFORMATION AND COMMUNICATION TECHNOLOGIES (AICT), 2013, : 128 - 131
  • [38] Evaluation of i-vector Speaker Recognition Systems for Forensic Application
    Mandasari, Miranti Indar
    McLaren, Mitchell
    van Leeuwen, David
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 28 - 31
  • [39] Tied Variational Autoencoder Backends for i-Vector Speaker Recognition
    Villalba, Jesus
    Brummer, Niko
    Dehak, Najim
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1004 - 1008
  • [40] Factorization of Discriminatively Trained i-vector Extractor for Speaker Recognition
    Novotny, Ondrej
    Plchot, Oldrich
    Glembek, Ondrej
    Burget, Lukas
    [J]. INTERSPEECH 2019, 2019, : 4330 - 4334