SOURCE-AWARE CONTEXT NETWORK FOR SINGLE-CHANNEL MULTI-SPEAKER SPEECH SEPARATION

Cited by: 0
Authors
Li, Zeng-Xi [1 ]
Song, Yan [1 ]
Dai, Li-Rong [1 ]
McLoughlin, Ian [2 ]
Affiliations
[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei, Anhui, Peoples R China
[2] Univ Kent, Sch Comp, Medway, England
Funding
National Natural Science Foundation of China;
Keywords
Speech Separation; Deep Learning; Label Permutation Problem; Neural Networks;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Subject Classification Codes
070206; 082403;
Abstract
Deep-learning-based approaches have achieved promising performance in speaker-dependent single-channel multi-speaker speech separation. However, partly due to the label permutation problem, they may encounter difficulties under speaker-independent conditions. Recent methods address this problem through explicit assignment operations between network outputs and target speakers. In contrast, we propose a novel source-aware context network that explicitly takes the speech sources as input alongside the mixture signal. By exploiting the temporal dependency and continuity of each source signal, the permutation order of the outputs can be determined without any additional post-processing. Furthermore, a Multi-time-step Prediction Training strategy is proposed to address the mismatch between the training and inference stages. Experimental results on the benchmark WSJ0-2mix dataset show that our network achieves comparable or better results than state-of-the-art methods in both closed-set and open-set conditions, in terms of Signal-to-Distortion Ratio (SDR) improvement.
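The label permutation problem the abstract refers to can be illustrated with a minimal sketch: when a network emits one output per speaker with no fixed ordering, a naive loss must be evaluated over every output-to-reference assignment and the best one selected (as in permutation-invariant training). This is an illustrative example, not the paper's method; the proposed source-aware context network avoids this exhaustive search by conditioning on the source signals themselves. The function name and shapes below are assumptions for the sketch.

```python
import itertools
import numpy as np

def permutation_min_loss(estimates, references):
    """Evaluate the MSE loss under every output-to-reference
    assignment and return the minimum, together with the best
    permutation. Both inputs have shape (n_speakers, n_samples).

    The factorial search over permutations is what assignment-based
    methods pay per training utterance; it grows as n_speakers!."""
    n = len(references)
    best_loss, best_perm = float("inf"), None
    for perm in itertools.permutations(range(n)):
        # Reorder the estimated sources according to this assignment
        # and measure the mean squared error against the references.
        loss = float(np.mean((estimates[list(perm)] - references) ** 2))
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm

# Toy check: if the network emits the sources in swapped order,
# the swapped assignment recovers a zero loss.
refs = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
ests = refs[[1, 0]]  # outputs in the "wrong" speaker order
loss, perm = permutation_min_loss(ests, refs)
```

For two speakers the search is cheap, but the cost motivates approaches, like the one proposed here, in which the output ordering is instead fixed by the temporal continuity of each source.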
Pages: 681-685
Page count: 5