A Unified Deep Neural Network for Speaker and Language Recognition

Authors
Richardson, Fred [1]
Reynolds, Doug [1]
Dehak, Najim [2]
Affiliations
[1] MIT, Lincoln Lab, 244 Wood St, Lexington, MA 02173 USA
[2] MIT, CSAIL, Cambridge, MA USA
Keywords
i-vector; DNN; bottleneck features; speaker recognition; language recognition
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Significant performance gains have been reported separately for speaker recognition (SR) and language recognition (LR) tasks using either DNN posteriors of sub-phonetic units or DNN feature representations, but the two techniques have not been compared on the same SR or LR task or across SR and LR tasks using the same DNN. In this work we present the application of a single DNN for both tasks using the 2013 Domain Adaptation Challenge speaker recognition (DAC13) and the NIST 2011 language recognition evaluation (LRE11) benchmarks. Using a single DNN trained on Switchboard data we demonstrate large gains in performance on both benchmarks: a 55% reduction in EER for the DAC13 out-of-domain condition and a 48% reduction in C-avg on the LRE11 30s test condition. Score fusion and feature fusion are also investigated as is the performance of the DNN technologies at short durations for SR.
Pages: 1146-1150
Number of pages: 5