SPEAKER VERIFICATION USING END-TO-END ADVERSARIAL LANGUAGE ADAPTATION

Citations: 0
Authors
Rohdin, Johan [1]
Stafylakis, Themos [2]
Silnova, Anna [1]
Zeinali, Hossein [1]
Burget, Lukas [1]
Plchot, Oldrich [1]
Affiliations
[1] Brno Univ Technol, Fac Informat Technol, IT4I Ctr Excellence, Brno, Czech Republic
[2] Omilia Conversat Intelligence, Athens, Greece
Keywords
Speaker recognition; domain adaptation
DOI
10.1109/icassp.2019.8683616
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
In this paper we investigate the use of adversarial domain adaptation for addressing the problem of language mismatch between speaker recognition corpora. In the context of speaker verification, adversarial domain adaptation methods aim at minimizing certain divergences between the distribution that the utterance-level features follow (i.e., speaker embeddings) when drawn from source and target domains (i.e., languages), while preserving their capacity in recognizing speakers. Neural architectures for extracting utterance-level representations enable us to apply adversarial adaptation methods in an end-to-end fashion and train the network jointly with the standard cross-entropy loss. We examine several configurations, such as the use of (pseudo-)labels on the target domain as well as domain labels in the feature extractor, and we demonstrate the effectiveness of our method on the challenging NIST SRE16 and SRE18 benchmarks.
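The joint objective the abstract describes can be illustrated with a minimal sketch. The abstract does not give the paper's exact loss, so everything here is an assumption for illustration: the function names, the binary source/target language discriminator, the weight `lam`, and the choice of subtracting the discriminator loss from the speaker cross-entropy (a common way to write an adversarial objective for the embedding extractor). Target utterances without speaker labels are marked `None` and contribute only to the domain term.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one utterance given raw speaker logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def binary_cross_entropy(p_target, is_target):
    """Loss of a hypothetical discriminator that outputs P(target language)."""
    eps = 1e-12
    p = min(max(p_target, eps), 1.0 - eps)
    return -math.log(p) if is_target else -math.log(1.0 - p)


def extractor_objective(speaker_logits, speaker_labels, p_target, is_target, lam=0.1):
    """Assumed extractor objective: speaker loss minus lam * domain loss.

    Utterances whose speaker label is None (for example, unlabeled
    target-language data) are skipped in the speaker term.
    """
    spk_terms = [
        softmax_cross_entropy(z, y)
        for z, y in zip(speaker_logits, speaker_labels)
        if y is not None
    ]
    spk = sum(spk_terms) / len(spk_terms) if spk_terms else 0.0
    dom_terms = [binary_cross_entropy(p, t) for p, t in zip(p_target, is_target)]
    dom = sum(dom_terms) / len(dom_terms)
    # The extractor is rewarded when the discriminator cannot tell languages apart.
    return spk - lam * dom, spk, dom


# One labeled source utterance and one unlabeled target utterance.
total, spk, dom = extractor_objective(
    speaker_logits=[[0.0, 0.0], [0.0, 0.0]],
    speaker_labels=[0, None],
    p_target=[0.5, 0.5],
    is_target=[False, True],
    lam=1.0,
)
print(total, spk, dom)
```

In this sketch the discriminator itself would be trained to minimize the domain term, while the extractor minimizes the returned objective, so a discriminator that separates the two languages confidently raises the extractor's objective.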
Pages: 6006-6010
Page count: 5