SPEAKER VERIFICATION USING END-TO-END ADVERSARIAL LANGUAGE ADAPTATION

Cited by: 0
Authors
Rohdin, Johan [1]
Stafylakis, Themos [2]
Silnova, Anna [1]
Zeinali, Hossein [1]
Burget, Lukas [1]
Plchot, Oldrich [1]
Affiliations
[1] Brno Univ Technol, Fac Informat Technol, IT4I Ctr Excellence, Brno, Czech Republic
[2] Omilia Conversat Intelligence, Athens, Greece
Keywords
Speaker recognition; domain adaptation
DOI
10.1109/icassp.2019.8683616
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
In this paper we investigate the use of adversarial domain adaptation for addressing the problem of language mismatch between speaker recognition corpora. In the context of speaker verification, adversarial domain adaptation methods aim at minimizing certain divergences between the distribution that the utterance-level features follow (i.e., speaker embeddings) when drawn from source and target domains (i.e., languages), while preserving their capacity in recognizing speakers. Neural architectures for extracting utterance-level representations enable us to apply adversarial adaptation methods in an end-to-end fashion and train the network jointly with the standard cross-entropy loss. We examine several configurations, such as the use of (pseudo-)labels on the target domain as well as domain labels in the feature extractor, and we demonstrate the effectiveness of our method on the challenging NIST SRE16 and SRE18 benchmarks.
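The joint training described in the abstract can be illustrated with a minimal sketch. The abstract does not say which divergence or adversarial loss the authors use, so the example below assumes a common domain-classifier formulation: a language discriminator is trained to identify the language of an embedding, while the embedding extractor is trained with the speaker cross-entropy loss minus a weighted copy of the discriminator loss. The function names and the `adv_weight` parameter are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(logits)[label])


def joint_losses(speaker_logits, speaker_label,
                 domain_logits, domain_label, adv_weight=0.1):
    """Return (extractor_loss, discriminator_loss) for one utterance.

    Assumption (not from the paper): the adversary is a language
    classifier. The discriminator minimizes its own cross-entropy,
    while the extractor minimizes the speaker loss and maximizes the
    discriminator loss, scaled by the hypothetical adv_weight.
    """
    speaker_loss = cross_entropy(speaker_logits, speaker_label)
    domain_loss = cross_entropy(domain_logits, domain_label)
    extractor_loss = speaker_loss - adv_weight * domain_loss
    discriminator_loss = domain_loss
    return extractor_loss, discriminator_loss


if __name__ == "__main__":
    # Toy logits: 3 speakers, 2 languages (source = 0, target = 1).
    ext, disc = joint_losses([2.0, 0.5, -1.0], 0, [0.3, -0.3], 1)
    print(f"extractor loss: {ext:.4f}, discriminator loss: {disc:.4f}")
```

In a real system the two losses would be backpropagated into different parameter sets (or combined through a gradient reversal layer). Target-domain utterances without speaker labels would contribute only the domain term unless pseudo-labels are available.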
Pages: 6006-6010
Number of pages: 5
Related papers
50 in total
  • [31] END-TO-END ATTENTION BASED TEXT-DEPENDENT SPEAKER VERIFICATION
    Zhang, Shi-Xiong
    Chen, Zhuo
    Zhao, Yong
    Li, Jinyu
    Gong, Yifan
    2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 171 - 178
  • [32] Speaker Adaptation for Attention-Based End-to-End Speech Recognition
    Meng, Zhong
    Gaur, Yashesh
    Li, Jinyu
    Gong, Yifan
    INTERSPEECH 2019, 2019, : 241 - 245
  • [33] Dynamic Speaker Representations Adjustment and Decoder Factorization for Speaker Adaptation in End-to-End Speech Synthesis
    Fu, Ruibo
    Tao, Jianhua
    Wen, Zhengqi
    Yi, Jiangyan
    Wang, Tao
    Qiang, Chunyu
    INTERSPEECH 2020, 2020, : 4701 - 4705
  • [34] TDMF: TASK-DRIVEN MULTILEVEL FRAMEWORK FOR END-TO-END SPEAKER VERIFICATION
    Chen, Chen
    Han, Jiqing
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6809 - 6813
  • [35] END-TO-END TEXT-INDEPENDENT SPEAKER VERIFICATION WITH FLEXIBILITY IN UTTERANCE DURATION
    Zhang, Chunlei
    Koishida, Kazuhito
    2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 584 - 590
  • [36] A High-Performance Neural Network SoC for End-to-End Speaker Verification
    Tsai, Tsung-Han
    Chiang, Meng-Jui
    IEEE ACCESS, 2024, 12 : 165482 - 165496
  • [37] A COMPLETE END-TO-END SPEAKER VERIFICATION SYSTEM USING DEEP NEURAL NETWORKS: FROM RAW SIGNALS TO VERIFICATION RESULT
    Jung, Jee-Weon
    Heo, Hee-Soo
    Yang, Il-Ho
    Shim, Hye-Jin
    Yu, Ha-Jin
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5349 - 5353
  • [38] CONTINUAL SELF-SUPERVISED DOMAIN ADAPTATION FOR END-TO-END SPEAKER DIARIZATION
    Coria, Juan M.
    Bredin, Herve
    Ghannay, Sahar
    Rosset, Sophie
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 626 - 632
  • [39] Evaluation of Feature-Space Speaker Adaptation for End-to-End Acoustic Models
    Tomashenko, Natalia
    Esteve, Yannick
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 3163 - 3170
  • [40] End-to-end losses based on speaker basis vectors and all-speaker hard negative mining for speaker verification
    Heo, Hee-Soo
    Jung, Jee-weon
    Yang, Il-Ho
    Yoon, Sung-Hyun
    Shim, Hye-jin
    Yu, Ha-Jin
    INTERSPEECH 2019, 2019, : 4035 - 4039