SPEAKER VERIFICATION USING END-TO-END ADVERSARIAL LANGUAGE ADAPTATION

被引:0
|
作者
Rohdin, Johan [1 ]
Stafylakis, Themos [2 ]
Silnova, Anna [1 ]
Zeinali, Hossein [1 ]
Burget, Lukas [1 ]
Plchot, Oldrich [1 ]
机构
[1] Brno Univ Technol, Fac Informat Technol, IT4I Ctr Excellence, Brno, Czech Republic
[2] Omilia Conversat Intelligence, Athens, Greece
关键词
Speaker recognition; domain adaptation;
D O I
10.1109/icassp.2019.8683616
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
In this paper we investigate the use of adversarial domain adaptation for addressing the problem of language mismatch between speaker recognition corpora. In the context of speaker verification, adversarial domain adaptation methods aim at minimizing certain divergences between the distribution that the utterance-level features follow ( i. e. speaker embeddings) when drawn from source and target domains ( i. e. languages), while preserving their capacity in recognizing speakers. Neural architectures for extracting utterance-level representations enable us to apply adversarial adaptation methods in an end-to-end fashion and train the network jointly with the standard cross-entropy loss. We examine several configurations, such as the use of ( pseudo-)labels on the target domain as well as domain labels in the feature extractor, and we demonstrate the effectiveness of our method on the challenging NIST SRE16 and SRE18 benchmarks.
引用
收藏
页码:6006 / 6010
页数:5
相关论文
共 50 条
  • [11] Contrastive Learning for improving End-to-end Speaker Verification
    Tang, Yanxi
    Wang, Jianzong
    Qu, Xiaoyang
    Xiao, Jing
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [12] Angular Softmax Loss for End-to-end Speaker Verification
    Li, Yutian
    Gao, Feng
    Ou, Zhijian
    Sun, Jiasong
    2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, : 190 - 194
  • [13] End-to-End Text-Dependent Speaker Verification
    Heigold, Georg
    Moreno, Ignacio
    Bengio, Samy
    Shazeer, Noam
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5115 - 5119
  • [14] Neural PLDA Modeling for End-to-End Speaker Verification
    Ramoji, Shreyas
    Krishnan, Prashant
    Ganapathy, Sriram
    INTERSPEECH 2020, 2020, : 4333 - 4337
  • [15] Speaker-Smoothed kNN Speaker Adaptation for End-to-End ASR
    Li, Shaojun
    Wei, Daimeng
    Shang, Hengchao
    Guo, Jiaxin
    Li, Zongyao
    Wu, Zhanglin
    Rao, Zhiqiang
    Luo, Yuanchang
    He, Xianghui
    Yang, Hao
    INTERSPEECH 2024, 2024, : 2390 - 2394
  • [16] SPEAKER ADAPTATION FOR MULTICHANNEL END-TO-END SPEECH RECOGNITION
    Ochiai, Tsubasa
    Watanabe, Shinji
    Katagiri, Shigeru
    Hori, Takaaki
    Hershey, John
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 6707 - 6711
  • [17] UNSUPERVISED SPEAKER ADAPTATION USING ATTENTION-BASED SPEAKER MEMORY FOR END-TO-END ASR
    Sari, Leda
    Moritz, Niko
    Hori, Takaaki
    Le Roux, Jonathan
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7384 - 7388
  • [18] SPEAKER AND LANGUAGE AWARE TRAINING FOR END-TO-END ASR
    Bansal, Shubham
    Malhotra, Karan
    Ganapathy, Sriram
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 494 - 501
  • [19] Improved Relation Networks for End-to-End Speaker Verification and Identification
    Chaubey, Ashutosh
    Sinha, Sparsh
    Ghose, Susmita
    INTERSPEECH 2022, 2022, : 5085 - 5089
  • [20] Strategies for End-to-End Text-Independent Speaker Verification
    Lin, Weiwei
    Mak, Man-Wai
    Chien, Jen-Tzung
    INTERSPEECH 2020, 2020, : 4308 - 4312