S-VECTOR: A DISCRIMINATIVE REPRESENTATION DERIVED FROM I-VECTOR FOR SPEAKER VERIFICATION

Cited by: 0
Authors
Isik, Yusuf Ziya [1, 2]
Erdogan, Hakan [2]
Sarikaya, Ruhi [3]
Affiliations
[1] TUBITAK BILGEM, Gebze, Turkey
[2] Sabanci Univ, Fac Engn & Nat Sci, Istanbul, Turkey
[3] Microsoft Corp, Redmond, WA 98052 USA
Keywords
speaker verification; denoising autoencoder; random dropout
DOI
Not available
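Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809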
Abstract
Representing data in ways that disentangle and factor out hidden dependencies is a critical step in speaker recognition systems. In this work, we employ deep neural networks (DNNs) as feature extractors to disentangle and emphasize the speaker factors from other sources of variability in the commonly used i-vector features. Denoising autoencoder based unsupervised pre-training, random dropout fine-tuning, and Nesterov accelerated gradient based momentum are used in DNN training. Replacing the i-vectors with the resulting speaker vectors (s-vectors), we obtain superior results on NIST SRE corpora over a wide range of operating points using a probabilistic linear discriminant analysis (PLDA) back-end.
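The three training ingredients named in the abstract (denoising-autoencoder input corruption, random dropout, and Nesterov accelerated gradient momentum) can be illustrated in isolation. The following is a minimal NumPy sketch, not the authors' implementation: the function names, the choice of masking noise as the corruption, the inverted-dropout scaling, and all hyperparameter values are assumptions made for illustration, and the random array only stands in for real i-vectors.

```python
import numpy as np

def corrupt(x, drop_prob, rng):
    """Masking noise (assumed corruption type): zero each input value
    independently with probability drop_prob. A denoising autoencoder is
    trained to reconstruct the clean x from this corrupted copy."""
    mask = rng.random(x.shape) >= drop_prob
    return x * mask

def dropout(h, drop_prob, rng):
    """Inverted dropout (assumed variant): zero hidden units at random and
    rescale the survivors so the expected activation is unchanged.
    Requires drop_prob < 1."""
    mask = rng.random(h.shape) >= drop_prob
    return h * mask / (1.0 - drop_prob)

def nesterov_step(w, v, grad_fn, lr, momentum):
    """One Nesterov accelerated gradient step: the gradient is evaluated at
    the look-ahead point w + momentum * v rather than at w itself."""
    lookahead = w + momentum * v
    v_new = momentum * v - lr * grad_fn(lookahead)
    return w + v_new, v_new

# Hypothetical usage on toy data.
rng = np.random.default_rng(0)
ivec = rng.normal(size=(4, 8))          # stand-in for a batch of i-vectors
noisy = corrupt(ivec, 0.2, rng)         # corrupted input for pre-training

# Minimise f(w) = 0.5 * ||w||^2, whose gradient is simply w.
w, v = np.ones(3), np.zeros(3)
for _ in range(200):
    w, v = nesterov_step(w, v, lambda p: p, lr=0.1, momentum=0.9)
```

In a full system these pieces would be wrapped around a DNN's forward and backward passes; the quadratic objective here is only a stand-in that makes the momentum update easy to check.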
Pages: 2097-2101
Number of pages: 5