Monaural noisy speech separation combining sparse non-negative matrix factorization and deep attractor network

Authors
GE Wanying [1]
ZHANG Tianqi [1]
FAN Congcong [1]
ZHANG Tian [1]
Affiliation
[1] School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications
Funding
National Natural Science Foundation of China
DOI
10.15949/j.cnki.0217-9776.2021.02.008
Chinese Library Classification number
TN912.3 [Speech signal processing]
Discipline classification code
0711
Abstract
The performance of monaural speech separation methods is limited when the speech mixture is corrupted by background noise. To obtain enhanced separated speech from a noisy mixture, a monaural noisy speech separation method combining sparse non-negative matrix factorization (SNMF) and a deep attractor network (DANet) is proposed. The method first decomposes the noisy mixture into speech coefficients and noise coefficients. The speech coefficients are then projected into a high-dimensional embedding space, and a DANet is trained to force the embeddings to form separate clusters. The attractor points are used to separate the speech coefficients by masking, and finally the enhanced separated speech signals are reconstructed from the speech basis and their corresponding coefficients. Experimental results in various background noise environments show that, compared with different baseline methods, the proposed algorithm effectively suppresses noise without degrading the quality of the reconstructed speech.
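The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the SNMF step uses Euclidean-cost multiplicative updates with an L1 penalty and fixed, pre-trained bases (the paper's cost function is not given here), the embeddings `E` are random stand-ins for the output of a trained DANet, and the assignments `Y` are assumed known, as they would be during training. All function and variable names are hypothetical.

```python
import numpy as np

def snmf_activations(V, W, sparsity=0.1, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative activations H with V ~ W @ H for a fixed basis W.

    Assumption: Euclidean cost plus the L1 penalty sparsity * H.sum(),
    minimised with the standard multiplicative update.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + sparsity + eps)
    return H

def attractor_masks(E, Y, eps=1e-12):
    """Soft masks from embeddings E (N, K) and one-hot assignments Y (N, C).

    Each attractor is the mean embedding of its class; the mask is a softmax
    over the similarity between every embedding and every attractor.
    """
    A = (Y.T @ E) / (Y.sum(axis=0)[:, None] + eps)   # (C, K) attractor points
    logits = E @ A.T                                  # (N, C) similarities
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    M = np.exp(logits)
    return M / M.sum(axis=1, keepdims=True)

def separate(V, W_speech, W_noise, E, Y):
    """Decompose V, mask the speech coefficients, rebuild one spectrogram per speaker."""
    W = np.hstack([W_speech, W_noise])
    H = snmf_activations(V, W)
    H_speech = H[: W_speech.shape[1]]                 # drop the noise coefficients
    masks = attractor_masks(E, Y)                     # one column per speaker
    return [
        W_speech @ (H_speech * masks[:, c].reshape(H_speech.shape))
        for c in range(masks.shape[1])
    ]

# Toy data: F frequency bins, T frames, Rs/Rn basis vectors, K-dim embeddings, C speakers.
rng = np.random.default_rng(0)
F, T, Rs, Rn, K, C = 64, 20, 8, 4, 5, 2
W_speech, W_noise = rng.random((F, Rs)), rng.random((F, Rn))
V = rng.random((F, T))                                # stand-in magnitude spectrogram
E = rng.standard_normal((Rs * T, K))                  # stand-in for DANet embeddings
Y = np.eye(C)[rng.integers(0, C, Rs * T)]             # stand-in one-hot assignments
sources = separate(V, W_speech, W_noise, E, Y)
```

Because the masks sum to one at every coefficient, the per-speaker reconstructions add up to the speech part of the factorization, while the noise coefficients are simply discarded. At test time the assignments are not available, so the attractors would have to be estimated another way (for example by clustering the embeddings), which this sketch does not cover.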
Pages: 266-280
Number of pages: 15