NEAREST NEIGHBOR BASED I-VECTOR NORMALIZATION FOR ROBUST SPEAKER RECOGNITION UNDER UNSEEN CHANNEL CONDITIONS

Authors
Zhu, Weizhong [1]
Sadjadi, Seyed Omid [1]
Pelecanos, Jason W. [1]
Affiliations
[1] IBM Research, Watson Group, Yorktown Heights, NY 10598, USA
Keywords
i-vector; nearest neighbor; PLDA; speaker recognition; unsupervised adaptation
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Many state-of-the-art speaker recognition engines use i-vectors to represent variable-length acoustic signals in a fixed low-dimensional total variability subspace. While such systems perform well under seen channel conditions, their performance greatly degrades under unseen channel scenarios. Accordingly, rapid adaptation of i-vector systems to unseen conditions has recently attracted significant research effort from the community. To mitigate this mismatch, in this paper we propose nearest neighbor based i-vector mean normalization (NN-IMN) and i-vector smoothing (IS) for unsupervised adaptation to unseen channel conditions within a state-of-the-art i-vector/PLDA speaker verification framework. A major advantage of the approach is its ability to handle multiple unseen channels without explicit retraining or clustering. Our observations on the DARPA Robust Automatic Transcription of Speech (RATS) speaker recognition task suggest that part of the distortion caused by an unseen channel may be modeled as an offset in the i-vector space. Hence, the proposed nearest neighbor based normalization technique is formulated to compensate for such a shift. Experimental results with the NN based normalized i-vectors indicate that, on average, we can recover 46% of the total performance degradation due to unseen channel conditions.
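The abstract states only that part of the unseen-channel distortion may be modeled as an offset in i-vector space and that the nearest neighbor based normalization compensates for that shift. It does not give the formulation. The sketch below is therefore one plausible reading and not the authors' method: it assumes the offset for each i-vector can be estimated as the mean of its k nearest neighbors (by cosine similarity) in an unlabeled pool of i-vectors. The function name `nn_mean_normalize`, the `pool` argument, and the default `k` are hypothetical choices made for illustration. I-vector smoothing (IS) is not covered.

```python
import numpy as np


def _unit_rows(x):
    """Scale each row to unit Euclidean length (rows are assumed non-zero)."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def nn_mean_normalize(ivectors, pool, k=10):
    """Subtract from each i-vector the mean of its k nearest pool vectors.

    Assumptions (not taken from the paper):
      ivectors: (n, d) array of i-vectors to normalize.
      pool:     (m, d) array of unlabeled i-vectors used to estimate offsets.
      k:        number of neighbors, capped at the pool size.
    """
    ivectors = np.asarray(ivectors, dtype=float)
    pool = np.asarray(pool, dtype=float)
    k = min(k, pool.shape[0])

    # Cosine similarity between every i-vector and every pool vector: (n, m).
    sims = _unit_rows(ivectors) @ _unit_rows(pool).T

    # Indices of the k most similar pool vectors for each row.
    # argpartition does not sort within the k selected entries, which is
    # fine here because only their mean is used.
    nn_idx = np.argpartition(-sims, k - 1, axis=1)[:, :k]

    # The local neighbor mean serves as the estimated channel offset: (n, d).
    local_mean = pool[nn_idx].mean(axis=1)
    return ivectors - local_mean
```

Because the offset is estimated per i-vector from its own neighborhood, a pool that mixes several channels needs no channel labels or clustering in this sketch, which is consistent with the advantage the abstract describes.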
Pages: 4684-4688
Page count: 5