IMPROVING SPEAKER RECOGNITION PERFORMANCE IN THE DOMAIN ADAPTATION CHALLENGE USING DEEP NEURAL NETWORKS

Cited by: 0
Authors
Garcia-Romero, Daniel [1]
Zhang, Xiaohui
McCree, Alan
Povey, Daniel
Affiliations
[1] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA
Keywords
Unsupervised adaptation; speaker recognition; i-vectors; deep neural networks
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Traditional i-vector speaker recognition systems use a Gaussian mixture model (GMM) to collect sufficient statistics (SS). Recently, replacing this GMM with a deep neural network (DNN) has shown promising results. In this paper, we explore the use of DNNs to collect SS for the unsupervised domain adaptation task of the Domain Adaptation Challenge (DAC). We show that collecting SS with a DNN trained on out-of-domain data boosts the speaker recognition performance of an out-of-domain system by more than 25%. Moreover, we integrate the DNN into an unsupervised adaptation framework that uses agglomerative hierarchical clustering with a stopping criterion based on unsupervised calibration, and show that the initial gains of the out-of-domain system carry over to the final adapted system. Although the DNN is trained on out-of-domain data, the final adapted system produces a relative improvement of more than 30% with respect to the best published results on this task.
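To make the term "sufficient statistics" concrete, the sketch below accumulates the zeroth- and first-order statistics commonly used for i-vector extraction from per-frame class posteriors. It is a minimal illustration, not the authors' implementation: the function name `collect_sufficient_stats` and its argument layout are hypothetical, and the posteriors are assumed to be supplied by whichever model is in use (GMM component posteriors or DNN output posteriors).

```python
from typing import List, Tuple


def collect_sufficient_stats(
    features: List[List[float]],
    posteriors: List[List[float]],
) -> Tuple[List[float], List[List[float]]]:
    """Accumulate zeroth- and first-order statistics over frames.

    features:   T frames, each a D-dimensional acoustic feature vector.
    posteriors: T frames, each a length-C vector of class posteriors
                (assumed to come from a GMM or a DNN; hypothetical input).

    Returns (zeroth, first) where
      zeroth[c]   = sum_t posteriors[t][c]
      first[c][d] = sum_t posteriors[t][c] * features[t][d]
    """
    if not features or len(features) != len(posteriors):
        raise ValueError("need the same non-zero number of frames in both inputs")

    num_classes = len(posteriors[0])
    dim = len(features[0])
    zeroth = [0.0] * num_classes
    first = [[0.0] * dim for _ in range(num_classes)]

    for frame, gamma in zip(features, posteriors):
        for c, weight in enumerate(gamma):
            zeroth[c] += weight
            for d, value in enumerate(frame):
                first[c][d] += weight * value
    return zeroth, first
```

Under this formulation, swapping the GMM for a DNN changes only where `posteriors` comes from; the accumulation itself stays the same.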
Pages: 378-383
Page count: 6
Related papers
50 in total
  • [42] Biometric Speaker Recognition Using Neural Networks and Wavelet Transform
    Daghbosheh, Mohammed
    Hattab, Ezz
    Bisher, Ahmad
    2011 INTERNATIONAL CONFERENCE ON CIVIL ENGINEERING AND INFORMATION TECHNOLOGY (CEIT 2011), 2011, : 1 - 8
  • [43] Bi-Transferring Deep Neural Networks for Domain Adaptation
    Zhou, Guangyou
    Xie, Zhiwen
    Huang, Jimmy Xiangji
    He, Tingting
    PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 322 - 332
  • [44] Using neural networks for automatic speaker recognition: A practical approach
    Pinto, RGCP
    Pinto, HLCP
    Caloba, LP
    38TH MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS, PROCEEDINGS, VOLS 1 AND 2, 1996, : 1078 - 1080
  • [45] Improving the Robustness and Adaptability of sEMG-Based Pattern Recognition Using Deep Domain Adaptation
    Shi, Ping
    Zhang, Xinran
    Li, Wei
    Yu, Hongliu
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022, 26 (11) : 5450 - 5460
  • [46] Speaker recognition using Radial Basis Function neural networks
    Deng, JP
    Venkateswarlu, R
    HYBRID INFORMATION SYSTEMS, 2002, : 57 - 64
  • [47] Speaker recognition using dynamic synapse-neural networks
    George, S
    Dibazar, A
    Berger, TW
    SECOND JOINT EMBS-BMES CONFERENCE 2002, VOLS 1-3, CONFERENCE PROCEEDINGS: BIOENGINEERING - INTEGRATIVE METHODOLOGIES, NEW TECHNOLOGIES, 2002, : 151 - 152
  • [48] Learning Speaker Aware Offsets for Speaker Adaptation of Neural Networks
    Sari, Leda
    Thomas, Samuel
    Hasegawa-Johnson, Mark A.
    INTERSPEECH 2019, 2019, : 769 - 773
  • [49] Speaker Gender Recognition Based on Deep Neural Networks and ResNet50
    Alnuaim, Abeer Ali
    Zakariah, Mohammed
    Shashidhar, Chitra
    Hatamleh, Wesam Atef
    Tarazi, Hussam
    Shukla, Prashant Kumar
    Ratna, Rajnish
    WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022
  • [50] DOMAIN ADAPTATION FOR SPEAKER RECOGNITION IN SINGING AND SPOKEN VOICE
    Chowdhury, Anurag
    Cozzo, Austin
    Ross, Arun
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7192 - 7196