On Synthesis for Supervised Monaural Speech Separation in Time Domain

Cited by: 3
Authors
Chen, Jingjing [1 ]
Mao, Qirong [1 ,2 ]
Liu, Dong [1 ]
Affiliations
[1] Jiangsu Univ, Sch Comp Sci & Commun Engn, Zhenjiang, Jiangsu, Peoples R China
[2] Jiangsu Engn Res Ctr Big Data Ubiquitous Percept, Zhenjiang, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
synthesis way; speech separation; time domain; deep learning;
DOI
10.21437/Interspeech.2020-1150
CLC Classification Number
R36 [Pathology]; R76 [Otorhinolaryngology];
Subject Classification Code
100104 ; 100213 ;
Abstract
Time-domain approaches to speech separation have recently achieved great success. However, the sources separated by these approaches usually contain artifacts (broadband noise), especially when separating noisy mixtures. In this paper, we incorporate a synthesis approach into time-domain speech separation to suppress these broadband noises in the separated sources; it can be dropped into a speech separation system seamlessly, in a 'plug-and-play' manner. By directly learning an estimate of each source in the encoded domain, the synthesis approach reduces artifacts in the estimated speech and improves separation performance. Extensive experiments on different state-of-the-art models show that the synthesis approach handles noisy mixtures better and is more suitable for noisy speech separation. On a new benchmark noisy dataset, it obtains a 0.97 dB (10.1%) relative SDR improvement, with corresponding gains on other metrics, at no extra computational cost.
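The distinction the abstract draws can be illustrated with a minimal NumPy sketch. In masking-based time-domain separation, the separator outputs a multiplicative mask applied to the encoded mixture; in the "synthesis" formulation described above, the separator instead predicts each source's encoded representation directly. All dimensions, bases, and the random "network outputs" below are illustrative placeholders, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: N basis filters, L-sample frames, T frames.
N, L, T = 8, 4, 6
encoder = rng.standard_normal((N, L))   # stand-in for a learned 1-D conv encoder
decoder = np.linalg.pinv(encoder)       # stand-in for a learned transposed-conv decoder

frames = rng.standard_normal((T, L))    # framed mixture waveform
mix_enc = frames @ encoder.T            # encoded mixture, shape (T, N)

# Masking approach: source estimate = mask * encoded mixture, so the
# estimate is constrained to the encoded mixture's support.
mask = 1.0 / (1.0 + np.exp(-rng.standard_normal((T, N))))  # sigmoid mask (placeholder)
src_enc_masked = mask * mix_enc

# Synthesis approach: the separator outputs the source's encoded
# representation directly, with no multiplicative coupling to the mixture.
src_enc_direct = rng.standard_normal((T, N))               # placeholder network output

# Both estimates are decoded back to waveform frames the same way.
est_masked = src_enc_masked @ decoder.T
est_direct = src_enc_direct @ decoder.T
assert est_masked.shape == est_direct.shape == frames.shape
```

Because the direct estimate is not tied pointwise to the encoded mixture, residual broadband components of the mixture need not leak into every source estimate, which is the intuition behind the reduced artifacts reported above.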
Pages: 2627 - 2631
Page count: 5
Related Papers
50 in total
  • [1] On Loss Functions for Supervised Monaural Time-Domain Speech Enhancement
    Kolbaek, Morten
    Tan, Zheng-Hua
    Jensen, Soren Holdt
    Jensen, Jesper
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 825 - 838
  • [2] Investigation of Cost Function for Supervised Monaural Speech Separation
    Liu, Yun
    Zhang, Hui
    Zhang, Xueliang
    Cao, Yuhang
    INTERSPEECH 2019, 2019, : 3178 - 3182
  • [3] A Time-domain Monaural Speech Enhancement with Feedback Learning
    Li, Andong
    Zheng, Chengshi
    Cheng, Linjuan
    Peng, Renhua
    Li, Xiaodong
    2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020, : 769 - 774
  • [4] A SUPERVISED LEARNING APPROACH FOR MONAURAL SPEECH SEGREGATION
    2010 IEEE 10TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS (ICSP2010), VOLS I-III, 2010, : 1323 - 1326
  • [5] Monaural speech separation and recognition challenge
    Cooke, Martin
    Hershey, John R.
    Rennie, Steven J.
    COMPUTER SPEECH AND LANGUAGE, 2010, 24 (01): : 1 - 15
  • [6] EFFICIENT MONAURAL SPEECH SEPARATION WITH MULTISCALE TIME-DELAY SAMPLING
    Qian, Shuangqing
    Gao, Lijian
    Jia, Hongjie
    Mao, Qirong
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6847 - 6851
  • [7] DEEP LEARNING FOR MONAURAL SPEECH SEPARATION
    Huang, Po-Sen
    Kim, Minje
    Hasegawa-Johnson, Mark
    Smaragdis, Paris
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [8] SUPERVISED MONAURAL SOURCE SEPARATION BASED ON AUTOENCODERS
    Osako, Keiichi
    Mitsufuji, Yuki
    Singh, Rita
    Raj, Bhiksha
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 11 - 15
  • [9] A supervised learning approach to monaural segregation of reverberant speech
    Jin, Zhaozhang
    Wang, DeLiang
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 921 - +
  • [10] A Supervised Learning Approach to Monaural Segregation of Reverberant Speech
    Jin, Zhaozhang
    Wang, DeLiang
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (04): : 625 - 638