THE CORAL++ ALGORITHM FOR UNSUPERVISED DOMAIN ADAPTATION OF SPEAKER RECOGNITION

Cited by: 8
Authors
Li, Rongjin [1]
Zhang, Weibin [1]
Chen, Dongpeng [1]
Affiliations
[1] VoiceAI Technol Co Ltd, Shenzhen, Peoples R China
Keywords
Speaker recognition; speaker embedding; domain adaptation; unsupervised learning
DOI
10.1109/ICASSP43922.2022.9747792
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
State-of-the-art speaker recognition systems are trained on large amounts of human-labeled data. Such a training set usually combines various data sources to enhance the modeling capability of the models. In practical deployment, however, unseen conditions are almost inevitable. Domain mismatch, caused by the statistical difference between the training and test data, is a common problem in real-life applications. To alleviate the degradation caused by domain mismatch, we propose a new feature-based unsupervised domain adaptation algorithm. It is a further optimization of the well-known CORrelation ALignment (CORAL) method, so we call it CORAL++. We verify the effectiveness of CORAL++ on the NIST 2019 Speaker Recognition Evaluation (SRE19), using the SRE18 CTS set as the development set. With a typical x-vector/PLDA setup, CORAL++ outperforms CORAL by a relative 9.40% in equal error rate (EER).
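For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal NumPy sketch of standard CORAL (whiten the out-of-domain embeddings with their own covariance, then re-color them with the in-domain covariance), not the CORAL++ optimization, whose details the abstract does not give. The function names, the `reg` regularization parameter, and the mean shift toward the in-domain data are assumptions made for illustration.

```python
import numpy as np


def _sym_power(mat, power, eps=1e-8):
    """Raise a symmetric positive semi-definite matrix to a real power."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    eigvals = np.clip(eigvals, eps, None)  # guard against tiny or negative eigenvalues
    return (eigvecs * eigvals ** power) @ eigvecs.T


def coral_transform(source, target, reg=1e-3):
    """Align second-order statistics of out-of-domain embeddings to in-domain ones.

    source: (n_s, d) labeled out-of-domain embeddings (e.g. x-vectors)
    target: (n_t, d) unlabeled in-domain embeddings
    reg:    hypothetical regularization weight added to both covariances
    """
    dim = source.shape[1]
    cov_s = np.cov(source, rowvar=False) + reg * np.eye(dim)
    cov_t = np.cov(target, rowvar=False) + reg * np.eye(dim)
    whiten = _sym_power(cov_s, -0.5)   # removes the out-of-domain correlations
    recolor = _sym_power(cov_t, 0.5)   # imposes the in-domain correlations
    centered = source - source.mean(axis=0)
    # Assumption: the adapted embeddings are also shifted to the in-domain mean.
    return centered @ whiten @ recolor + target.mean(axis=0)
```

In an x-vector/PLDA pipeline of the kind the abstract mentions, the adapted out-of-domain embeddings would typically be used to retrain the PLDA back end; no in-domain speaker labels are needed, since only the covariance and mean of the unlabeled in-domain embeddings enter the transform.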
Pages: 7172-7176
Number of pages: 5