A Unified Deep Neural Network for Speaker and Language Recognition

Cited by: 0

Authors:

Richardson, Fred [1]

Reynolds, Doug [1]

Dehak, Najim [2]

Affiliations:

[1] MIT, Lincoln Lab, 244 Wood St, Lexington, MA 02173 USA

[2] MIT, CSAIL, Cambridge, MA USA

Source:

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015

Keywords:

i-vector; DNN; bottleneck features; speaker recognition; language recognition

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Significant performance gains have been reported separately for speaker recognition (SR) and language recognition (LR) tasks using either DNN posteriors of sub-phonetic units or DNN feature representations, but the two techniques have not been compared on the same SR or LR task or across SR and LR tasks using the same DNN. In this work, we present the application of a single DNN for both tasks using the 2013 Domain Adaptation Challenge speaker recognition (DAC13) and the NIST 2011 language recognition evaluation (LRE11) benchmarks. Using a single DNN trained on Switchboard data, we demonstrate large gains in performance on both benchmarks: a 55% reduction in EER for the DAC13 out-of-domain condition and a 48% reduction in C-avg on the LRE11 30s test condition. Score fusion and feature fusion are also investigated, as is the performance of the DNN technologies at short durations for SR.


Pages: 1146-1150

Page count: 5

