Single Channel Speech Source Separation Using Hierarchical Deep Neural Networks

Authors
Noorani, Seyed Majid [1]
Seyedin, Sanaz [1]
Affiliations
[1] Amirkabir Univ Technol, Dept Elect Engn, Tehran, Iran
Keywords
Speech source separation; Deep neural networks; Time-frequency masks; Blind separation
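DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract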
Single-channel speech source separation is a well-known task that prepares speech signals for applications such as speech recognition and enhancement. In this paper, we introduce a novel design that separates sources using hierarchical deep neural networks and time-frequency masks. In the first level of the hierarchy, the proposed method classifies each mixture signal into one of three categories according to the genders of the mixed speakers. Three further networks, one per mixture type, then perform speech separation on the categorized data. Finally, an enhancement stage improves the quality of the separated voices using an improved cost function that reduces the interference between the sources estimated in the previous stage. The required data are taken from the TSP corpus, and the system outputs are evaluated with several metrics, including signal-to-distortion ratio (SDR), signal-to-interference ratio (SIR), and perceptual evaluation of speech quality (PESQ). Compared with other methods, the proposed architecture performs considerably better on these metrics.
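The two-level routing described in the abstract can be pictured with a minimal sketch. Everything named here is an assumption made for illustration, not the authors' implementation: the three category labels, the `ratio_mask` and `separate` helpers, and the toy stand-ins for the trained networks are all hypothetical, and a soft ratio mask is only one common choice of time-frequency mask. The enhancement stage and its cost function are not modeled.

```python
import numpy as np

# Assumed labels for the three gender-based mixture categories (hypothetical names).
MIXTURE_TYPES = ("male-male", "female-female", "male-female")


def ratio_mask(mag_a, mag_b, eps=1e-8):
    """Soft time-frequency mask for source A from two magnitude estimates."""
    return mag_a / (mag_a + mag_b + eps)


def separate(mix_mag, classifier, separators):
    """Route a mixture magnitude spectrogram through a two-level hierarchy."""
    mixture_type = classifier(mix_mag)                 # level 1: pick the category
    est_a, est_b = separators[mixture_type](mix_mag)   # level 2: category-specific model
    mask_a = ratio_mask(est_a, est_b)
    # Apply complementary masks so the two outputs sum back to the mixture.
    return mixture_type, mask_a * mix_mag, (1.0 - mask_a) * mix_mag


# Toy stand-ins for the trained networks (assumptions, for demonstration only).
def toy_classifier(mag):
    return "male-female"


def toy_separator(mag):
    # Pretend source A dominates low frequencies and source B high frequencies.
    weights = np.linspace(0.9, 0.1, mag.shape[0])[:, None]
    return weights * mag, (1.0 - weights) * mag


rng = np.random.default_rng(0)
mix_mag = rng.random((257, 100))  # toy spectrogram: frequency bins x frames
separators = {name: toy_separator for name in MIXTURE_TYPES}
label, src_a, src_b = separate(mix_mag, toy_classifier, separators)
```

In a real system the classifier and the three separators would be trained networks, and the masked magnitudes would be combined with the mixture phase and inverted back to waveforms.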
Pages: 466-470
Number of pages: 5