CROSS-LINGUAL TEXT-INDEPENDENT SPEAKER VERIFICATION USING UNSUPERVISED ADVERSARIAL DISCRIMINATIVE DOMAIN ADAPTATION

Cited by: 0
Authors
Xia, Wei [1]
Huang, Jing [2]
Hansen, John H. L. [1]
Affiliations
[1] UT Dallas, Ctr Robust Speech Syst, Richardson, TX 75083 USA
[2] JD AI Res, Mountain View, CA USA
Keywords
Speaker Verification; Adversarial Training; Domain Adaptation; Speaker Representation; RECOGNITION;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Speaker verification systems often degrade significantly when there is a language mismatch between training and testing data. Improving a cross-lingual speaker verification system with unlabeled data can greatly increase the robustness of the system and reduce human labeling costs. In this study, we introduce an unsupervised Adversarial Discriminative Domain Adaptation (ADDA) method to effectively learn an asymmetric mapping that adapts the target domain encoder to the source domain, where the target domain and source domain are speech data from different languages. ADDA, together with a popular Domain Adversarial Training (DAT) approach, is evaluated on a cross-lingual speaker verification task: the training data is in English from NIST SRE04-08, Mixer 6 and Switchboard, and the test data is in Chinese from AISHELL-I. We show that with ADDA adaptation, the Equal Error Rate (EER) of the x-vector system decreases from 9.331% to 7.645%, a relative EER reduction of 18.07%, and a 6.32% reduction compared with DAT as well. Further analysis of the ADDA-adapted speaker embeddings shows that they perform well on speaker classification for the target domain data and are less sensitive to the shift in language.
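The relative reduction quoted in the abstract can be checked with simple arithmetic. The sketch below is illustrative only and is not the authors' evaluation code: `relative_reduction` and `equal_error_rate` are hypothetical helper names, and the EER routine is a plain threshold sweep over the observed scores, which is one common approximation and is assumed here rather than taken from the paper.

```python
def relative_reduction(baseline, adapted):
    """Relative reduction, in percent, from a baseline value to an adapted value."""
    return (baseline - adapted) / baseline * 100.0


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold.
    The false rejection rate (FRR) is the fraction of target trials rejected;
    the false acceptance rate (FAR) is the fraction of non-target trials accepted.
    The EER is taken at the threshold where FAR and FRR are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# EER values reported in the abstract: x-vector baseline vs. ADDA-adapted system.
print(f"{relative_reduction(9.331, 7.645):.2f}%")  # prints "18.07%"
```

The 18.07% figure is therefore a relative change, (9.331 - 7.645) / 9.331, not a difference in percentage points; the absolute drop is 1.686 points.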
Pages: 5816 - 5820
Page count: 5
Related papers
50 in total
  • [1] SpeakerNet for Cross-lingual Text-Independent Speaker Verification
    Habib, Hafsa
    Tauseef, Huma
    Fahiem, Muhammad Abuzar
    Farhan, Saima
    Usman, Ghousia
    [J]. ARCHIVES OF ACOUSTICS, 2020, 45 (04) : 573 - 583
  • [2] Collaborative and adversarial network for text-independent speaker verification in domain adaptation
    Qiang, Junhao
    Yang, Qun
    Gao, Jie
    Liu, Shaohan
    [J]. ELECTRONICS LETTERS, 2023, 59 (02)
  • [3] Discriminative transformation for sufficient adaptation in text-independent speaker verification
    Yang, Hao
    Dong, Yuan
    Zhao, Xianyu
    Zha, Jian
    Wang, Haila
    [J]. CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2006, 4274 : 558 - +
  • [4] Unsupervised Speaker Adaptation based on the Cosine Similarity for Text-Independent Speaker Verification
    Shum, Stephen
    Dehak, Najim
    Dehak, Reda
    Glass, James R.
    [J]. ODYSSEY 2010: THE SPEAKER AND LANGUAGE RECOGNITION WORKSHOP, 2010, : 76 - 82
  • [5] Supervised domain adaptation for text-independent speaker verification using limited data
    Sarfjoo, Seyyed Saeed
    Madikeri, Srikanth
    Motlicek, Petr
    Marcel, Sebastien
    [J]. INTERSPEECH 2020, 2020, : 3815 - 3819
  • [6] Cross-domain Adaptation with Discrepancy Minimization for Text-independent Forensic Speaker Verification
    Wang, Zhenyu
    Xia, Wei
    Hansen, John H. L.
    [J]. INTERSPEECH 2020, 2020, : 2257 - 2261
  • [7] Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition
    Latif, Siddique
    Qadir, Junaid
    Bilal, Muhammad
    [J]. 2019 8TH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2019,
  • [8] Cross-lingual Speaker Adaptation using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis
    Xin, Detai
    Saito, Yuki
    Takamichi, Shinnosuke
    Koriyama, Tomoki
    Saruwatari, Hiroshi
    [J]. INTERSPEECH 2021, 2021, : 1614 - 1618
  • [9] Cross-lingual speaker adaptation using domain adaptation and speaker consistency loss for text-to-speech synthesis
    Xin, Detai
    Saito, Yuki
    Takamichi, Shinnosuke
    Koriyama, Tomoki
    Saruwatari, Hiroshi
    [J]. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2021, 5 : 3376 - 3380
  • [10] Toward Text-independent Cross-lingual Speaker Recognition Using English-Mandarin-Taiwanese Dataset
    Wu, Yi-Chieh
    Liao, Wen-Hung
    [J]. 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 8515 - 8522