THE CORAL++ ALGORITHM FOR UNSUPERVISED DOMAIN ADAPTATION OF SPEAKER RECOGNITION

Cited by: 8
Authors
Li, Rongjin [1]
Zhang, Weibin [1]
Chen, Dongpeng [1]
Affiliation
[1] VoiceAI Technol Co Ltd, Shenzhen, Peoples R China
Keywords
Speaker recognition; speaker embedding; domain adaptation; unsupervised learning
DOI
10.1109/ICASSP43922.2022.9747792
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
State-of-the-art speaker recognition systems are trained on large amounts of human-labeled data. Such a training set is usually composed of various data sources to enhance the modeling capability of the models. In practical deployment, however, unseen conditions are almost inevitable. Domain mismatch, caused by the statistical difference between the training and test data, is therefore a common problem in real-life applications. To alleviate the degradation caused by domain mismatch, we propose a new feature-based unsupervised domain adaptation algorithm. The proposed algorithm is a further optimization of the well-known CORrelation ALignment (CORAL), so we call it CORAL++. On the NIST 2019 Speaker Recognition Evaluation (SRE19), we use the SRE18 CTS set as the development set to verify the effectiveness of CORAL++. With a typical x-vector/PLDA setup, CORAL++ outperforms CORAL by a relative 9.40% in equal error rate (EER).
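For orientation, the baseline CORAL step that the abstract builds on can be sketched as follows. This is a minimal illustration of standard correlation alignment (whiten the out-of-domain embeddings with their own covariance, then re-color them with the in-domain covariance), not the CORAL++ refinement, whose details the abstract does not give. The function names, the `eps` regularizer, and the use of NumPy are assumptions made for this example.

```python
import numpy as np


def _sym_matrix_power(mat, power):
    """Raise a symmetric positive semi-definite matrix to a real power."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    eigvals = np.clip(eigvals, 1e-12, None)  # guard against tiny negatives
    return (eigvecs * eigvals ** power) @ eigvecs.T


def coral_transform(source, target, eps=1e-6):
    """Align second-order statistics of `source` to those of `target`.

    source: (n_source, dim) out-of-domain embeddings (e.g. x-vectors)
    target: (n_target, dim) unlabeled in-domain embeddings
    eps:    hypothetical diagonal regularizer, an assumption of this sketch
    """
    dim = source.shape[1]
    cov_s = np.cov(source, rowvar=False) + eps * np.eye(dim)
    cov_t = np.cov(target, rowvar=False) + eps * np.eye(dim)
    whiten = _sym_matrix_power(cov_s, -0.5)   # removes source correlations
    recolor = _sym_matrix_power(cov_t, 0.5)   # imposes target correlations
    return source @ whiten @ recolor
```

In an x-vector/PLDA pipeline, the transformed out-of-domain embeddings would typically be used to retrain the PLDA back end. No speaker labels are needed for the in-domain data, which is what makes the adaptation unsupervised.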
Pages: 7172-7176
Number of pages: 5