TEXT-INDEPENDENT SPEAKER VERIFICATION WITH ADVERSARIAL LEARNING ON SHORT UTTERANCES

Citations: 0
Authors
Liu, Kai [1]
Zhou, Huan [1]
Affiliations
[1] Huawei Technol, Artificial Intelligence Applicat Res Ctr, Shenzhen, Peoples R China
Keywords
speaker embedding; speaker verification; generative adversarial network
DOI
10.1109/icassp40776.2020.9054036
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
A text-independent speaker verification system suffers severe performance degradation under short-utterance conditions. To address this problem, we propose an adversarially learned embedding mapping model that directly maps a short embedding to an enhanced embedding with increased discriminability. In particular, a Wasserstein GAN with a set of loss criteria is investigated. These loss functions have distinct optimization objectives, and some of them are less favoured in speaker verification research. Unlike most prior studies, our main objective is to investigate the effectiveness of these loss criteria through numerous ablation studies. Experiments on the VoxCeleb dataset show that some criteria benefit verification performance while others have negligible effect. Lastly, a Wasserstein GAN with the chosen loss criteria, without fine-tuning, achieves meaningful gains over the baseline, with 4% relative improvement in EER and 7% in minDCF in the challenging scenario of short 2-second utterances.
Pages: 6569-6573
Page count: 5
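
The abstract reports its gains as relative improvements in equal error rate (EER). As a rough illustration of what that metric measures, the sketch below estimates EER by sweeping a decision threshold over trial scores and then computes a relative improvement between two error rates. The function names, the threshold sweep, and all score values are assumptions made for illustration; none of them come from the paper, and the paper's own scoring back end is not described in this record.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over all observed scores.

    Hypothetical helper: higher scores are assumed to mean "same speaker".
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False rejection: a target trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a non-target trial scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_improvement(baseline, improved):
    """Fractional reduction of an error metric relative to the baseline."""
    return (baseline - improved) / baseline


# Made-up toy scores, not data from the paper.
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
toy_eer = equal_error_rate(targets, nontargets)
```

With finite trial lists the two error rates rarely cross exactly, so this sweep returns the midpoint at the closest threshold. Evaluation toolkits typically interpolate along the detection error trade-off curve instead.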