IMPROVING SPEAKER RECOGNITION PERFORMANCE IN THE DOMAIN ADAPTATION CHALLENGE USING DEEP NEURAL NETWORKS

Cited by: 0

Authors:

Garcia-Romero, Daniel [1]

Zhang, Xiaohui

McCree, Alan

Povey, Daniel

Affiliations:

[1] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA

Source:

2014 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY SLT 2014 | 2014

Keywords:

Unsupervised adaptation; speaker recognition; i-vectors; deep neural networks;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Traditional i-vector speaker recognition systems use a Gaussian mixture model (GMM) to collect sufficient statistics (SS). Recently, replacing this GMM with a deep neural network (DNN) has shown promising results. In this paper, we explore the use of DNNs to collect SS for the unsupervised domain adaptation task of the Domain Adaptation Challenge (DAC). We show that collecting SS with a DNN trained on out-of-domain data boosts the speaker recognition performance of an out-of-domain system by more than 25%. Moreover, we integrate the DNN in an unsupervised adaptation framework that uses agglomerative hierarchical clustering with a stopping criterion based on unsupervised calibration, and show that the initial gains of the out-of-domain system carry over to the final adapted system. Although the DNN is trained on the out-of-domain data, the final adapted system produces a relative improvement of more than 30% with respect to the best published results on this task.
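The abstract does not give formulas, so the following is only a minimal sketch of the sufficient statistics as they are usually defined in i-vector systems: for each class c, the zeroth-order statistic is the sum of the frame posteriors, and the first-order statistic is the posterior-weighted sum of the feature vectors. The function name `collect_sufficient_stats` and the toy inputs are hypothetical. The point it illustrates is that the accumulation is the same whether the posteriors come from a GMM or from a DNN; only the model producing them changes.

```python
from typing import List, Sequence, Tuple


def collect_sufficient_stats(
    features: Sequence[Sequence[float]],
    posteriors: Sequence[Sequence[float]],
) -> Tuple[List[float], List[List[float]]]:
    """Accumulate zeroth- and first-order statistics over frames.

    features[t]   : feature vector of frame t (length D)
    posteriors[t] : class posteriors of frame t (length C); these may come
                    from a GMM or from a DNN (assumption: rows sum to 1)
    """
    if len(features) != len(posteriors):
        raise ValueError("features and posteriors must cover the same frames")
    if not features:
        raise ValueError("at least one frame is required")

    num_classes = len(posteriors[0])
    dim = len(features[0])
    zeroth = [0.0] * num_classes                       # N_c = sum_t gamma_t(c)
    first = [[0.0] * dim for _ in range(num_classes)]  # F_c = sum_t gamma_t(c) * x_t

    for x, gamma in zip(features, posteriors):
        for c, g in enumerate(gamma):
            zeroth[c] += g
            for d, value in enumerate(x):
                first[c][d] += g * value
    return zeroth, first


# Toy example: three 2-dimensional frames and two classes (hypothetical data).
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
posteriors = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
zeroth, first = collect_sufficient_stats(features, posteriors)
print(zeroth)  # [1.5, 1.5]
print(first)   # [[2.5, 4.0], [6.5, 8.0]]
```

In this sketch, switching from a GMM to a DNN would change only how `posteriors` is produced; the statistics passed on to the i-vector extractor keep the same shape.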


Pages: 378-383

Page count: 6

50 items in total

[41] TEXT-INDEPENDENT SPEAKER RECOGNITION USING NEURAL NETWORKS
HATTORI, H
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1993, E76D (03) : 345 - 351
[42] Biometric Speaker Recognition Using Neural Networks and Wavelet Transform
Daghbosheh, Mohammed
Hattab, Ezz
Bisher, Ahmad
2011 INTERNATIONAL CONFERENCE ON CIVIL ENGINEERING AND INFORMATION TECHNOLOGY (CEIT 2011), 2011, : 1 - 8
[43] Bi-Transferring Deep Neural Networks for Domain Adaptation
Zhou, Guangyou
Xie, Zhiwen
Huang, Jimmy Xiangji
He, Tingting
PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 322 - 332
[44] Using neural networks for automatic speaker recognition: A practical approach
Pinto, RGCP
Pinto, HLCP
Caloba, LP
38TH MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS, PROCEEDINGS, VOLS 1 AND 2, 1996, : 1078 - 1080
[45] Improving the Robustness and Adaptability of sEMG-Based Pattern Recognition Using Deep Domain Adaptation
Shi, Ping
Zhang, Xinran
Li, Wei
Yu, Hongliu
IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022, 26 (11) : 5450 - 5460
[46] Speaker recognition using Radial Basis Function neural networks
Deng, JP
Venkateswarlu, R
HYBRID INFORMATION SYSTEMS, 2002, : 57 - 64
[47] Speaker recognition using dynamic synapse-neural networks
George, S
Dibazar, A
Berger, TW
SECOND JOINT EMBS-BMES CONFERENCE 2002, VOLS 1-3, CONFERENCE PROCEEDINGS: BIOENGINEERING - INTEGRATIVE METHODOLOGIES, NEW TECHNOLOGIES, 2002, : 151 - 152
[48] Learning Speaker Aware Offsets for Speaker Adaptation of Neural Networks
Sari, Leda
Thomas, Samuel
Hasegawa-Johnson, Mark A.
INTERSPEECH 2019, 2019, : 769 - 773
[49] Speaker Gender Recognition Based on Deep Neural Networks and ResNet50
Alnuaim, Abeer Ali
Zakariah, Mohammed
Shashidhar, Chitra
Hatamleh, Wesam Atef
Tarazi, Hussam
Shukla, Prashant Kumar
Ratna, Rajnish
WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2022, 2022
[50] DOMAIN ADAPTATION FOR SPEAKER RECOGNITION IN SINGING AND SPOKEN VOICE
Chowdhury, Anurag
Cozzo, Austin
Ross, Arun
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7192 - 7196