Unsupervised NAP Training Data Design for Speaker Recognition

Authors
Sun, Hanwu [1]
Ma, Bin [1]
Affiliation
[1] A*STAR, Institute for Infocomm Research (I2R), Singapore 138632, Singapore
Keywords
speaker recognition; speaker diarization; speaker cluster; Nuisance Attribute Projection; SPEECH;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Nuisance Attribute Projection (NAP) trained on labeled data is an effective way to improve state-of-the-art speaker recognition systems, because it removes unwanted channel and handset variation. However, the need for labeled NAP training data may limit its practical application. In this paper, we propose an unsupervised clustering strategy for designing NAP training data without a priori knowledge of channel or speaker information. A fast clustering and purifying algorithm groups the unlabeled NAP training data into speaker-dependent clusters, from which the NAP training data are derived. A GMM-SVM based speaker recognition system is used to evaluate performance. The system with the unsupervised NAP training data design achieves performance similar to that of the system using labeled NAP training data on both the SRE06 1conv-1conv all-English trials and the SRE08 short2-short3 tel-tel all-English trials subtasks.
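
As a rough illustration of the idea in the abstract, the sketch below estimates a NAP projection matrix from supervectors whose speaker-cluster labels are assumed to come from some unsupervised clustering step. It is not the authors' implementation: the function name `nap_projection`, the synthetic data, and the choice of a plain SVD on cluster-centered vectors are all assumptions made for illustration, and the clustering and purifying algorithm itself is not reproduced because the abstract does not describe it in detail.

```python
import numpy as np


def nap_projection(supervectors, cluster_ids, rank):
    """Return P = I - U U^T, where the columns of U span the top `rank`
    directions of within-cluster variability (hypothetical helper)."""
    X = np.asarray(supervectors, dtype=float)
    ids = np.asarray(cluster_ids)
    # Subtract each cluster's mean so that only within-cluster variation
    # (treated here as nuisance, e.g. channel or handset) remains.
    centered = np.empty_like(X)
    for c in np.unique(ids):
        mask = ids == c
        centered[mask] = X[mask] - X[mask].mean(axis=0)
    # Left singular vectors of the centered data (stored as columns) are the
    # eigenvectors of the within-cluster scatter matrix.
    U, _, _ = np.linalg.svd(centered.T, full_matrices=False)
    U = U[:, :rank]
    return np.eye(X.shape[1]) - U @ U.T


# Synthetic example: 3 assumed speaker clusters, one dominant nuisance direction.
rng = np.random.default_rng(0)
dim = 6
nuisance = np.zeros(dim)
nuisance[0] = 1.0
speaker_means = rng.normal(size=(3, dim))

X, ids = [], []
for s in range(3):
    for _ in range(5):
        # Each session = speaker mean + strong nuisance offset + small noise.
        session = speaker_means[s] + 3.0 * rng.normal() * nuisance
        X.append(session + 0.01 * rng.normal(size=dim))
        ids.append(s)

P = nap_projection(X, ids, rank=1)
compensated = np.asarray(X) @ P  # P is symmetric, so x @ P equals P @ x
```

In this toy setting the projection removes most of the energy along the simulated nuisance direction while leaving the remaining dimensions untouched. In a real system the cluster labels would be noisy, which is presumably why the abstract mentions a purifying step.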
Pages: 1098-1101
Page count: 4