THE CORAL++ ALGORITHM FOR UNSUPERVISED DOMAIN ADAPTATION OF SPEAKER RECOGNITION

Cited by: 8

Authors:

Li, Rongjin [1]

Zhang, Weibin [1]

Chen, Dongpeng [1]

Affiliations:

[1] VoiceAI Technol Co Ltd, Shenzhen, Peoples R China

Source:

2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022

Keywords:

Speaker recognition; speaker embedding; domain adaptation; unsupervised learning

DOI:

10.1109/ICASSP43922.2022.9747792

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

State-of-the-art speaker recognition systems are trained on large amounts of human-labeled data. Such a training set is usually composed of various data sources to enhance the modeling capability of the models. In practical deployment, however, unseen conditions are almost inevitable. Domain mismatch is a common problem in real-life applications due to the statistical differences between the training and test data. To alleviate the degradation caused by domain mismatch, we propose a new feature-based unsupervised domain adaptation algorithm. The algorithm is a further optimization of the well-known CORrelation ALignment (CORAL), so we call it CORAL++. We evaluate on the NIST 2019 Speaker Recognition Evaluation (SRE19), using the SRE18 CTS set as the development set to verify the effectiveness of CORAL++. With a typical x-vector/PLDA setup, CORAL++ outperforms CORAL by a relative 9.40% in EER.
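
For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of classical CORAL (whiten the out-of-domain embeddings, then re-color them with the in-domain covariance), not of CORAL++, whose details the abstract does not give. The function names, the `eps` regularizer, and the mean shift to the target mean are assumptions made for this sketch.

```python
import numpy as np


def _sym_matrix_power(mat, power):
    """Raise a symmetric positive-definite matrix to a real power
    using its eigendecomposition: V diag(lambda**power) V^T."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * eigvals ** power) @ eigvecs.T


def coral_transform(source, target, eps=1e-6):
    """Align the second-order statistics of `source` to `target`.

    source: (n_s, d) out-of-domain embeddings (e.g. x-vectors).
    target: (n_t, d) unlabeled in-domain embeddings.
    eps:    small diagonal regularizer (an assumption of this sketch).
    """
    dim = source.shape[1]
    cov_s = np.cov(source, rowvar=False) + eps * np.eye(dim)
    cov_t = np.cov(target, rowvar=False) + eps * np.eye(dim)

    whiten = _sym_matrix_power(cov_s, -0.5)   # removes source correlations
    recolor = _sym_matrix_power(cov_t, 0.5)   # imposes target correlations

    # Centering and shifting to the target mean is an added assumption;
    # the covariance alignment itself does not depend on it.
    centered = source - source.mean(axis=0)
    return centered @ whiten @ recolor + target.mean(axis=0)


# Synthetic usage example: two domains with different scales and means.
rng = np.random.default_rng(0)
source = rng.normal(size=(500, 4)) * np.array([1.0, 2.0, 3.0, 4.0])
target = rng.normal(size=(400, 4)) * np.array([4.0, 3.0, 2.0, 1.0]) + 5.0
aligned = coral_transform(source, target)
```

In an x-vector/PLDA pipeline, the aligned out-of-domain embeddings would typically be used to retrain the PLDA back end; that step is outside this sketch.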

Pages: 7172-7176

Page count: 5
