A Unified Deep Neural Network for Speaker and Language Recognition

Cited by: 0
Authors
Richardson, Fred [1]
Reynolds, Doug [1]
Dehak, Najim [2]
Affiliations
[1] MIT, Lincoln Lab, 244 Wood St, Lexington, MA 02173 USA
[2] MIT, CSAIL, Cambridge, MA USA
Keywords
i-vector; DNN; bottleneck features; speaker recognition; language recognition
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Significant performance gains have been reported separately for speaker recognition (SR) and language recognition (LR) tasks using either DNN posteriors of sub-phonetic units or DNN feature representations, but the two techniques have not been compared on the same SR or LR task, or across SR and LR tasks using the same DNN. In this work, we present the application of a single DNN to both tasks using the 2013 Domain Adaptation Challenge speaker recognition (DAC13) and the NIST 2011 language recognition evaluation (LRE11) benchmarks. Using a single DNN trained on Switchboard data, we demonstrate large gains in performance on both benchmarks: a 55% reduction in EER for the DAC13 out-of-domain condition and a 48% reduction in C-avg on the LRE11 30s test condition. Score fusion and feature fusion are also investigated, as is the performance of the DNN technologies at short durations for SR.
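The metrics named in the abstract can be illustrated with a minimal Python sketch. It approximates an equal error rate (EER) by sweeping thresholds over trial scores, computes a relative reduction such as the 55% figure quoted above, and shows an equal-weight linear score fusion. The function names, the toy scores, and the fusion weight are assumptions made for illustration only; they are not taken from the paper, and the paper's own fusion method is not specified in this record.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over all observed scores."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss: a target trial scored below the threshold.
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a non-target trial scored at or above the threshold.
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(p_miss - p_fa)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (p_miss + p_fa) / 2
    return eer


def relative_reduction(baseline, improved):
    """Relative reduction of an error metric, e.g. 0.55 for a 55% reduction."""
    return (baseline - improved) / baseline


def fuse_scores(scores_a, scores_b, weight=0.5):
    """Hypothetical linear score fusion; the weight is an assumption."""
    return [weight * a + (1 - weight) * b for a, b in zip(scores_a, scores_b)]


# Toy scores invented for illustration; not data from the paper.
targets = [0.9, 0.8, 0.7, 0.3]
nontargets = [0.6, 0.4, 0.2, 0.1]
print(equal_error_rate(targets, nontargets))
print(relative_reduction(0.10, 0.045))
```

In practice, fusion weights are usually estimated on held-out trials rather than fixed, so the equal weighting here is only a placeholder.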
Pages: 1146-1150
Page count: 5