A Unified Deep Neural Network for Speaker and Language Recognition

Cited by: 0
Authors
Richardson, Fred [1]
Reynolds, Doug [1]
Dehak, Najim [2]
Affiliations
[1] MIT, Lincoln Lab, 244 Wood St, Lexington, MA 02173 USA
[2] MIT, CSAIL, Cambridge, MA USA
Keywords
i-vector; DNN; bottleneck features; speaker recognition; language recognition
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Significant performance gains have been reported separately for speaker recognition (SR) and language recognition (LR) tasks using either DNN posteriors of sub-phonetic units or DNN feature representations, but the two techniques have not been compared on the same SR or LR task or across SR and LR tasks using the same DNN. In this work, we present the application of a single DNN for both tasks using the 2013 Domain Adaptation Challenge speaker recognition (DAC13) and the NIST 2011 language recognition evaluation (LRE11) benchmarks. Using a single DNN trained on Switchboard data, we demonstrate large gains in performance on both benchmarks: a 55% reduction in EER for the DAC13 out-of-domain condition and a 48% reduction in C-avg on the LRE11 30s test condition. Score fusion and feature fusion are also investigated, as is the performance of the DNN technologies at short durations for SR.
Pages: 1146-1150
Page count: 5