Speech Separation of A Target Speaker Based on Deep Neural Networks

Authors
Du Jun [1]
Tu Yanhui [1]
Xu Yong [1]
Dai Lirong [1]
Lee Chin-Hui [2]
Affiliations
[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China
[2] Georgia Inst Technol, Atlanta, GA 30332 USA
Keywords
single-channel speech separation; supervised mode; semi-supervised mode; deep neural networks; ALGORITHM
DOI
Not available
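Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809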
Abstract
This paper proposes a novel data-driven approach based on deep neural networks (DNNs) for single-channel speech separation. A DNN is adopted to directly model the highly non-linear relationship between the speech features of a target speaker and those of the mixed signals. Both supervised and semi-supervised scenarios are investigated. In the supervised mode, the identities of both the target speaker and the interfering speaker are provided, while in the semi-supervised mode only the target speaker is given. We propose mixing multiple speakers with the target speaker to train the DNN, which is shown to predict an unseen interferer well in the separation stage. Experimental results demonstrate that the proposed framework achieves better separation results than a GMM-based approach in the supervised mode. More significantly, in the semi-supervised mode, which is believed to be the preferred mode in real-world operations, the DNN-based approach even outperforms the GMM-based approach in the supervised mode.
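The semi-supervised training setup described in the abstract (one target speaker mixed with several different interferers) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract only says "speech features", so the log-power spectrum used here, the frame length, the hop size, the mixing level, and the function names `mix_at_snr`, `log_power_spectrum`, and `build_training_pairs` are all assumptions made for illustration. Random noise stands in for real speech recordings.

```python
import numpy as np

def mix_at_snr(target, interferer, snr_db):
    """Scale the interferer so the target-to-interferer power ratio equals snr_db."""
    n = min(len(target), len(interferer))
    target, interferer = target[:n], interferer[:n]
    p_target = np.mean(target ** 2)
    p_interf = np.mean(interferer ** 2)
    gain = np.sqrt(p_target / (p_interf * 10.0 ** (snr_db / 10.0)))
    return target + gain * interferer, target

def log_power_spectrum(signal, frame_len=512, hop=256):
    """Frame the signal, apply a Hann window, and return log-power spectra (assumed feature)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    spectrum = np.fft.rfft(frames, axis=1)
    return np.log(np.abs(spectrum) ** 2 + 1e-10)  # small floor avoids log(0)

def build_training_pairs(target, interferers, snr_db=0.0):
    """Pair mixture features (DNN input) with target-speaker features (DNN output)."""
    inputs, outputs = [], []
    for interferer in interferers:
        mixture, clean = mix_at_snr(target, interferer, snr_db)
        inputs.append(log_power_spectrum(mixture))
        outputs.append(log_power_spectrum(clean))
    return np.concatenate(inputs), np.concatenate(outputs)

# Placeholder signals: random noise stands in for one target and three interferers.
rng = np.random.default_rng(0)
target = rng.standard_normal(16000)
interferers = [rng.standard_normal(16000) for _ in range(3)]
X, Y = build_training_pairs(target, interferers, snr_db=0.0)
print(X.shape, Y.shape)
```

Each row of `X` is one frame of mixture features and the matching row of `Y` is the corresponding target-speaker frame, so a regression network trained on these pairs sees the same target against several interferers, which is the idea the abstract credits for handling an unseen interferer.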
Pages: 473-477
Number of pages: 5